earthobservations / luftdatenpumpe

Acquire and process live and historical air quality data with little effort. Filter by station-id, sensor-id and sensor-type, apply reverse geocoding, store into time-series and RDBMS databases, publish to MQTT, output as JSON, or visualize in Grafana. Data sources: Sensor.Community (luftdaten.info), IRCELINE, and OpenAQ.
https://luftdatenpumpe.readthedocs.io/
GNU Affero General Public License v3.0

[OpenAQ] Ingest data from the OpenAQ API #19

Open amotl opened 4 years ago

amotl commented 4 years ago

Introduction

OpenAQ's mission is to fight air inequality by opening up air quality data and connecting a diverse global, grassroots community of individuals and organizations.

Thoughts

We might think about integrating the Python wrapper for the OpenAQ API in order to get maximum worldwide coverage without significant effort. This could even make #12 obsolete.

See also Using the OpenAQ API to acquire open air quality information from Python.

amotl commented 4 years ago

News

With 15b117ab and 84041a5b, and starting with version 0.20.1, Luftdatenpumpe is able to ingest data from the OpenAQ API. Right now, only "recent" data is acquired, using the OpenAQ measurements API. For the date_from parameter, we use the start of the last full hour (xx:00), i.e. hourdate(utcnow() - 1h), so that each request covers the span from then until now. However, we are not sure about this strategy yet.
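To make the date_from strategy above concrete, here is a minimal sketch of the computation. It assumes that hourdate() simply truncates a timestamp to the full hour; the function name last_full_hour is hypothetical and not taken from the Luftdatenpumpe code base.

```python
from datetime import datetime, timedelta, timezone


def last_full_hour(now=None):
    """Return the start (xx:00) of the hour before `now`, in UTC.

    Assumption: this mirrors hourdate(utcnow() - 1h) by going back
    one hour and truncating minutes, seconds and microseconds.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - timedelta(hours=1)).replace(minute=0, second=0, microsecond=0)


if __name__ == "__main__":
    # Value that would be used as `date_from` for the request.
    print(last_full_hour().isoformat())
```

With this definition, a request issued at 14:37 UTC would ask for measurements from 13:00 UTC onwards.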

Examples

It is recommended to apply a country filter in order to reduce the amount of data per invocation.

# Acquire data from EEA Germany
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=DE

# Acquire data from EEA Belgium
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=BE

# Acquire data from GIOS network in Poland
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=PL

# Acquire data from AirNow network in the U.S.
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=US

Backlog

cc @wetterfrosch

amotl commented 4 years ago

It looks like not all stations report at the same time and interval.

While BE seems to report at T01:00:00 or T02:00:00, DE always reports at T00:00:00. On the other hand, NL reports each hour.

So, we will have to tune the date_from parameter when calling api.measurements().

In general, the response of the OpenAQ sources API (sources) reports the resolution of each source. While undesignated items (resolution: null) might yield a daily resolution, others offer data at intervals of 1 hr, 15 min, or 10 min (AU). So, we will have to use that information to compute the date_from parameter correctly in order to reliably retrieve the latest data for the respective country.
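One way to derive date_from from a source's resolution could look like the sketch below. It is only an illustration under several assumptions: resolution strings look like "1 hr", "15 min" or "10 min", a missing resolution is treated as daily, and looking back two reporting periods is enough to catch the latest value. The names parse_resolution and compute_date_from are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Map the unit tokens seen in the discussion to timedelta keyword arguments.
UNITS = {"min": "minutes", "hr": "hours", "day": "days"}


def parse_resolution(resolution, default=timedelta(days=1)):
    """Convert a resolution string like "15 min" into a timedelta.

    Assumption: `None` or an unrecognized value falls back to one day.
    """
    if not resolution:
        return default
    match = re.fullmatch(r"\s*(\d+)\s*(min|hr|day)s?\s*", resolution)
    if match is None:
        return default
    value, unit = match.groups()
    return timedelta(**{UNITS[unit]: int(value)})


def compute_date_from(resolution, now=None, periods=2):
    """Look back `periods` reporting intervals from `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - periods * parse_resolution(resolution)


if __name__ == "__main__":
    for resolution in ("1 hr", "15 min", "10 min", None):
        print(resolution, compute_date_from(resolution).isoformat())
```

The resulting timestamp would then be passed as date_from to the measurements request. Whether two periods is the right look-back window for every country is an open question, given the differing report times observed for BE, DE and NL.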

amotl commented 1 year ago

Other than resolving the details enumerated within this discussion, we may want to look at OpenAQ API Version 2 as well.