NOAA-CSL / MELODIES-MONET

MELODIES MONET - diagnostic tool for evaluating models against a variety of observations including surface, aircraft, and satellite data all within a common framework
https://melodies-monet.readthedocs.io
Apache License 2.0

Add OpenAQ to MELODIES MONET #157

Open rschwant opened 1 year ago

rschwant commented 1 year ago

Zach is working on adding the latest version of OpenAQ to MONETIO. Once this is complete, let's add a converter for OpenAQ data into MELODIES MONET too. Zach, this could just be part of your CLI tool. Barry mentioned we should be able to use the same format as for the AirNow files, so it should be pretty seamless to pull into the tool.

Then, as part of this, let's create an example that shows users how to work with OpenAQ data: specifically, how to use all the data and how to filter to just include only monitor data or only sensor data. Let's also describe this on the ReadTheDocs page so that people understand that not all data from OpenAQ is monitor data, and that there is an easy way to select the appropriate measurement technique for their science question.

zmoon commented 1 year ago

See also https://github.com/noaa-oar-arl/monetio/issues/59

zmoon commented 1 year ago

> filter to just include only monitor data or only sensor data

Not sure how to do this with the OpenAQ data source we currently use for MONETIO. @bbakernoaa @rschwant ?

bbakernoaa commented 1 year ago

Can we not just do df[[columns_to_keep]]?

zmoon commented 1 year ago

> Can we not just do df[[columns_to_keep]]?

Sure, but I don't know how to determine which to keep. That is, I don't see a column that tells us whether a measurement comes from a "monitor" or just a normal sensor. There is the "sourceType" column, but it always seems to be set to "government".
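To make the distinction concrete, here is a minimal sketch on a toy DataFrame. The values are made up, and only the column names are taken from what monetio currently returns. Column selection narrows the columns, while a monitor/sensor split would need a row mask on a column whose values actually differ.

```python
import pandas as pd

# Toy frame with a few of the column names monetio returns; values are made up.
df = pd.DataFrame(
    {
        "siteid": ["A", "B", "C"],
        "sourceType": ["government", "government", "government"],
        "pm25_ugm3": [5.0, 12.0, 8.5],
    }
)

# Column selection: every row is kept, only the set of columns shrinks.
# With columns_to_keep already a list, a single pair of brackets is enough.
columns_to_keep = ["siteid", "pm25_ugm3"]
subset = df[columns_to_keep]

# Row filtering needs a column that discriminates between the groups.
# Here "sourceType" has a single value, so the mask keeps everything.
source_types = df["sourceType"].unique().tolist()
government_only = df[df["sourceType"] == "government"]
```

Checking `source_types` first is a quick way to see whether a candidate column can separate monitors from sensors at all.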

I guess the original JSON files do have more info that we aren't currently propagating through in our processing [^1]. Do you know which fields we need? Example entry:

{"date":{"utc":"2019-07-31T16:00:00.000Z","local":"2019-07-31T22:00:00+06:00"},
"parameter":"pm25","value":5,"unit":"µg/m³",
"averagingPeriod":{"value":1,"unit":"hours"},
"location":"US Diplomatic Post: Astana",
"city":"Astana","country":"KZ",
"coordinates":{"latitude":51.125286,"longitude":71.46722},
"attribution":[{"name":"EPA AirNow DOS","url":"http://airnow.gov/index.cfm?action=airnow.global_summary"}],
"sourceName":"StateAir_Astana",
"sourceType":"government",
"mobile":false}

[^1]: Currently monetio returns a df with the columns ['time', 'latitude', 'longitude', 'sourceName', 'sourceType', 'city', 'country', 'utcoffset', 'bc_umg3', 'co_ppm', 'no2_ppm', 'o3_ppm', 'pm10_ugm3', 'pm25_ugm3', 'so2_ppm', 'siteid', 'time_local']
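For reference, a rough sketch of how the example entry above could be flattened to see every field it carries. This uses plain `json` and `pandas.json_normalize`; it is not how the monetio reader processes the files, and the variable names are only illustrative.

```python
import json

import pandas as pd

# The example entry from above, as one JSON string.
entry = (
    '{"date":{"utc":"2019-07-31T16:00:00.000Z","local":"2019-07-31T22:00:00+06:00"},'
    '"parameter":"pm25","value":5,"unit":"µg/m³",'
    '"averagingPeriod":{"value":1,"unit":"hours"},'
    '"location":"US Diplomatic Post: Astana",'
    '"city":"Astana","country":"KZ",'
    '"coordinates":{"latitude":51.125286,"longitude":71.46722},'
    '"attribution":[{"name":"EPA AirNow DOS",'
    '"url":"http://airnow.gov/index.cfm?action=airnow.global_summary"}],'
    '"sourceName":"StateAir_Astana",'
    '"sourceType":"government",'
    '"mobile":false}'
)

record = json.loads(entry)

# Nested dicts become dotted column names (e.g. "date.utc");
# the "attribution" list is kept as a single list-valued column.
flat = pd.json_normalize([record])
available_fields = sorted(flat.columns)
```

Fields such as `location`, `mobile`, `attribution`, and `averagingPeriod.*` show up in `available_fields` but are not among the columns listed in the footnote.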

zmoon commented 1 year ago

Update: as noted in the last dev meeting, our current OpenAQ reader in MONETIO only fetches OpenAQ v1 data, which doesn't include the low-cost sensors. But I am working on an OpenAQ v2 reader, which does.
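Once a reader exposes some indicator of instrument class, the filtering requested at the top of this issue could look roughly like the sketch below. The column name `sensor_type`, its values "reference grade" and "low-cost sensor", and the helper `select_by_sensor_type` are all assumptions for illustration, not the confirmed output or API of the v2 reader.

```python
import pandas as pd


def select_by_sensor_type(df, kind, column="sensor_type"):
    """Return only the rows whose `column` equals `kind` (hypothetical helper)."""
    if column not in df.columns:
        raise KeyError(f"no {column!r} column to filter on")
    return df[df[column] == kind].reset_index(drop=True)


# Made-up data; "sensor_type" and its values are assumed, not confirmed.
df = pd.DataFrame(
    {
        "siteid": ["A", "B", "C"],
        "sensor_type": ["reference grade", "low-cost sensor", "reference grade"],
        "pm25_ugm3": [5.0, 12.0, 8.5],
    }
)

monitors = select_by_sensor_type(df, "reference grade")
sensors = select_by_sensor_type(df, "low-cost sensor")
```

Using all the data is then simply `df` itself, and the monitor-only or sensor-only subsets are one call each.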