unlanza / air-quality-tests

Working with air quality datasets from many sources. Tools: Excel, .NET 8, Visual Studio

About the dataset

Sources

Official

External providers

News

Data characteristics

Pollutant fact sheet - CO. Pasted image 20241016025406.png

Pollutant fact sheet - NO2. Pasted image 20241016025633.png

Pollutant fact sheet - PM10. Pasted image 20241016025707.png

Data collection sources. Pasted image 20241016025529.png

Transforming data

Analysis approach

| Combination | Approach | Use Case | Benefits |
| --- | --- | --- | --- |
| Clustering + Decision Trees | Cluster by pollutant levels across locations, apply decision trees to predict air quality. | Group air quality patterns by time/location and predict poor air quality. | Reveals patterns and creates predictive rules for specific clusters. |
| Decision Trees + Association Rules | Use decision trees to identify key pollutants, apply association rules to find pollutant co-occurrences. | Identify key pollutants driving poor air quality, and find co-occurrence patterns. | Uncovers important pollutants and frequent co-occurrences. |
| Clustering + Association Rules | Cluster based on pollutant levels at different locations, then apply association rules within each cluster. | Understand pollutant combinations that spike together across locations. | Targeted pattern discovery for specific clusters of air quality data. |
| Clustering + Decision Trees + Association Rules | Cluster data, use decision trees to predict air quality, and mine association rules within clusters. | Comprehensive air quality analysis across time and locations. | Multi-faceted approach combining segmentation, prediction, and patterns. |
| Association Rules Informed by Decision Trees | Use decision trees to find key pollutants, then apply association rules on these pollutants. | Focus on critical pollutants and discover related pollutant patterns. | Reduces complexity by focusing on the most important features. |
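
To make the association-rule step in the combinations above more concrete, here is a minimal sketch of how pollutant co-occurrence rules could be scored by support and confidence. It is a brute-force pass over pollutant pairs, not a full Apriori implementation, and everything in it is an assumption for illustration: the `PollutantRules` class, the `PairRules` method, and the idea that each "transaction" is the set of pollutants that exceeded some threshold at one station in one time window.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: scores rules of the form "if A is elevated, B is elevated too".
public static class PollutantRules
{
    // Each transaction is the set of pollutants above their (assumed) threshold
    // for one station and one time window.
    public static IEnumerable<(string Antecedent, string Consequent, double Support, double Confidence)>
        PairRules(IReadOnlyList<ISet<string>> transactions, double minSupport, double minConfidence)
    {
        if (transactions.Count == 0)
        {
            yield break;
        }

        double total = transactions.Count;
        var pollutants = transactions
            .SelectMany(t => t)
            .Distinct()
            .OrderBy(p => p, StringComparer.Ordinal)
            .ToList();

        foreach (var a in pollutants)
        {
            // Number of transactions where the antecedent is elevated.
            int countA = transactions.Count(t => t.Contains(a));

            foreach (var b in pollutants)
            {
                if (a == b)
                {
                    continue;
                }

                // Number of transactions where both pollutants are elevated together.
                int countBoth = transactions.Count(t => t.Contains(a) && t.Contains(b));

                double support = countBoth / total;                 // P(A and B)
                double confidence = (double)countBoth / countA;     // P(B | A); countA >= 1 here

                if (support >= minSupport && confidence >= minConfidence)
                {
                    yield return (a, b, support, confidence);
                }
            }
        }
    }
}
```

When combined with clustering, the same method could be called once per cluster, passing only the transactions that belong to that cluster, so the resulting rules describe co-occurrences within each group rather than across the whole dataset.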

Architecture Records

1. On DB Injection

Context

Based on the project file of a current sample: aspire-samples/samples/AspireShop/AspireShop.BasketService/AspireShop.BasketService.csproj at main · dotnet/aspire-samples (github.com)

Decision

I will inject the database from the AirQualityApp.ApiService project.

2. Processing Application

Context

Based on experience, the app's main objective is to expose the consumed and processed dataset through an API and a web frontend.

Decision

The architecture will be like:

3. CSV Reading approach

Context

My options are:

  1. Manual parsing.
  2. JoshClose/CsvHelper NuGet Package.

Decision

We are using the JoshClose/CsvHelper NuGet package, a library to help reading and writing CSV files (github.com). See also the CsvHelper "Getting Started" guide.
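
As a rough illustration of this decision, the sketch below reads a CSV file into typed records with CsvHelper. The `AirQualityReading` class, its property names, and the `AirQualityCsv.Load` helper are hypothetical; the real column names depend on the source files, and by default CsvHelper expects the header names to match the property names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical row shape: adjust the properties to the actual CSV headers.
public sealed class AirQualityReading
{
    public DateTime Timestamp { get; set; }
    public string Station { get; set; } = "";
    public string Pollutant { get; set; } = "";
    public double? Value { get; set; }   // nullable so that empty cells become null
}

public static class AirQualityCsv
{
    // Reads every row of the file into memory as AirQualityReading records.
    public static List<AirQualityReading> Load(string path)
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            HasHeaderRecord = true,        // first line holds the column names
            TrimOptions = TrimOptions.Trim // drop whitespace around field values
        };

        using var reader = new StreamReader(path);
        using var csv = new CsvReader(reader, config);

        // GetRecords is lazy, so materialize the list before the reader is disposed.
        return csv.GetRecords<AirQualityReading>().ToList();
    }
}
```

A call such as `AirQualityCsv.Load("readings.csv")` (the file name is a placeholder) would return the parsed rows. Compared with manual parsing, this leaves quoting, escaping, and type conversion to the library. If the source files use a different culture for numbers or dates, the `CultureInfo` passed to `CsvConfiguration` would need to change accordingly.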