mjstealey closed this issue 6 years ago
Identified four specific bias/error metrics that we will compute for O3, NO2, SO2, CO, PM2.5 (and chemical components) for various observation networks in the country, and report domain-average hourly metrics for 2010 and 2011.
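The thread does not say how the metrics are computed. As a rough sketch, assuming the standard definitions and a bias sign convention of model minus observation (which matches the example values in this thread, where the bias equals the model average minus the observed average), the per-hour statistics over paired site values could look like this. The function name `quality_metrics` and its inputs are hypothetical.

```python
import math


def quality_metrics(obs, mod):
    """Summary statistics for paired observed and modeled values.

    Hypothetical helper: `obs` and `mod` are equal-length sequences holding
    one value per monitoring site for a single hour.
    """
    if len(obs) != len(mod) or len(obs) < 2:
        raise ValueError("need at least two paired values")
    n = len(obs)
    obs_mean = sum(obs) / n
    mod_mean = sum(mod) / n
    # Mean bias: model minus observation (assumed sign convention).
    mb = sum(m - o for o, m in zip(obs, mod)) / n
    # Root mean squared error of the paired differences.
    rmse = math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / n)
    # Pearson correlation coefficient.
    cov = sum((o - obs_mean) * (m - mod_mean) for o, m in zip(obs, mod))
    var_o = sum((o - obs_mean) ** 2 for o in obs)
    var_m = sum((m - mod_mean) ** 2 for m in mod)
    denom = math.sqrt(var_o * var_m)
    cor = cov / denom if denom else float("nan")
    return {
        "obs_mean": obs_mean,
        "mod_mean": mod_mean,
        "mb": mb,
        "rmse": rmse,
        "cor": cor,
    }


metrics = quality_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 36.0])
print(metrics)
```

The input numbers above are made up purely to exercise the function; they are not CMAQ or observation data.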
Proposal: add a "data_quality": {} stanza to each value of cmaq_output for variables that have a data quality metric associated with them.
Example o3 output:
{
  "values": [
    {
      "variable": "o3",
      "lat_lon": "35,-80",
      "cmaq_output": [
        {
          "date_time": "2010-01-01 00:00:00",
          "value": 49.8195953369141,
          "data_quality": {
            "obs_average": 15.7729,
            "model_average": 27.5666,
            "bias_average": 11.7936,
            "rmse_average": 18.6757,
            "corr_average": 0.313
          }
        }
      ]
    }
  ]
}
Will add a data_quality boolean flag to the exposure_list table:
Table exposure_list:
- id,
- variable,
- description,
- units,
- common_name,
- utc_min_date_time,
- utc_max_date_time,
- resolution,
- aggregation,
- data_quality
And then expand the data_quality table to include the columns for all variable metrics.
Table data_quality:
- id,
- utc_date_time,
- o3_model_average,
- o3_obs_average,
- o3_corr_average,
- o3_rmse_average,
- o3_bias_average,
... (and so on)
On query, the data_quality stanza from above will then be injected into the return value whenever the data_quality boolean in exposure_list is true for the requested variable.
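A minimal sketch of that injection step, using an in-memory SQLite database as a stand-in for the real one. Only the table and column names come from the proposal above; the helper `build_entry`, the `METRICS` tuple, the trimmed-down table definitions, and the `no_metric_var` row are assumptions made for illustration, and the actual service may be structured quite differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    -- Trimmed-down versions of the proposed tables (assumed column types).
    CREATE TABLE exposure_list (
        id INTEGER PRIMARY KEY, variable TEXT, data_quality INTEGER);
    CREATE TABLE data_quality (
        id INTEGER PRIMARY KEY, utc_date_time TEXT,
        o3_model_average REAL, o3_obs_average REAL,
        o3_corr_average REAL, o3_rmse_average REAL, o3_bias_average REAL);
    INSERT INTO exposure_list (variable, data_quality)
        VALUES ('o3', 1), ('no_metric_var', 0);
    INSERT INTO data_quality (utc_date_time, o3_model_average, o3_obs_average,
                              o3_corr_average, o3_rmse_average, o3_bias_average)
        VALUES ('2010-01-01 00:00:00', 27.5666, 15.7729, 0.313, 18.6757, 11.7936);
    """
)

# Metric suffixes used in the proposed data_quality column names.
METRICS = ("obs_average", "model_average", "bias_average",
           "rmse_average", "corr_average")


def build_entry(variable, date_time, value):
    """Build one cmaq_output entry, adding data_quality only when flagged."""
    entry = {"date_time": date_time, "value": value}
    flag = conn.execute(
        "SELECT data_quality FROM exposure_list WHERE variable = ?",
        (variable,),
    ).fetchone()
    if flag is None or not flag["data_quality"]:
        return entry
    # `variable` was matched against exposure_list above, so it is safe to
    # use it when composing the metric column names.
    cols = ", ".join(f"{variable}_{m}" for m in METRICS)
    row = conn.execute(
        f"SELECT {cols} FROM data_quality WHERE utc_date_time = ?",
        (date_time,),
    ).fetchone()
    if row is not None:
        entry["data_quality"] = {m: row[f"{variable}_{m}"] for m in METRICS}
    return entry


print(build_entry("o3", "2010-01-01 00:00:00", 49.8195953369141))
print(build_entry("no_metric_var", "2010-01-01 00:00:00", 1.0))
```

Keeping the flag in exposure_list means an unflagged variable costs one cheap lookup and never touches the data_quality table.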
Code has been updated and merged into master.
Quality metrics are now made available in the /variables endpoint if they exist. Example for o3:
{
  "aggregation": [
    "max",
    "avg"
  ],
  "common_name": null,
  "description": "1000.0*O3[1]",
  "end_date": "2011-02-01 01:00:00",
  "has_quality_metric": true,
  "quality_metric": [
    {
      "common_name": "root mean squared error",
      "variable": "rmse"
    },
    {
      "common_name": "mean observed value ",
      "variable": "obs_mean"
    },
    {
      "common_name": "mean modeled value",
      "variable": "mod_mean"
    },
    {
      "common_name": "mean bias",
      "variable": "mb"
    },
    {
      "common_name": "pearson correlation coefficient",
      "variable": "cor"
    }
  ],
  "resolution": [
    "hour",
    "day",
    "7day",
    "14day"
  ],
  "start_date": "2010-12-01 00:00:00",
  "units": "ppbV",
  "variable": "o3"
}
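For anyone consuming this from a client, here is a small sketch of how the has_quality_metric flag and the quality_metric list might be read. It works on an abridged copy of the o3 entry above rather than calling the service, and the helper name `quality_metric_names` is made up for this example.

```python
import json

# Abridged copy of the o3 entry shown above.
variable_entry = json.loads(
    """
    {
      "variable": "o3",
      "units": "ppbV",
      "has_quality_metric": true,
      "quality_metric": [
        {"common_name": "root mean squared error", "variable": "rmse"},
        {"common_name": "mean observed value ", "variable": "obs_mean"},
        {"common_name": "mean modeled value", "variable": "mod_mean"},
        {"common_name": "mean bias", "variable": "mb"},
        {"common_name": "pearson correlation coefficient", "variable": "cor"}
      ]
    }
    """
)


def quality_metric_names(entry):
    """Map metric keys (e.g. 'mb') to their human-readable names.

    Returns an empty dict when the variable has no quality metrics.
    """
    if not entry.get("has_quality_metric"):
        return {}
    return {
        m["variable"]: m["common_name"].strip()  # drop stray whitespace
        for m in entry.get("quality_metric", [])
    }


for key, name in quality_metric_names(variable_entry).items():
    print(f"{key}: {name}")
```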
If a quality metric exists for a queried variable, it will be returned as part of the /values output.
{
  "values": [
    {
      "cmaq_output": [
        {
          "date_time": "2010-12-30 05:00:00",
          "quality_metric": {
            "cor": 0.426,
            "mb": 11.4464,
            "mod_mean": 35.0096,
            "obs_mean": 23.5632,
            "rmse": 12.9588
          },
          "value": 30.7911243438721
        },
        {
          "date_time": "2010-12-31 05:00:00",
          "quality_metric": {
            "cor": 0.465,
            "mb": 10.4739,
            "mod_mean": 30.5484,
            "obs_mean": 20.0744,
            "rmse": 12.359
          },
          "value": 36.8378486633301
        },
        {
          "date_time": "2011-01-01 05:00:00",
          "quality_metric": {
            "cor": 0.577,
            "mb": -0.70803,
            "mod_mean": 22.361,
            "obs_mean": 23.069,
            "rmse": 7.62782
          },
          "value": 21.735221862793
        },
        {
          "date_time": "2011-01-02 05:00:00",
          "quality_metric": {
            "cor": 0.633,
            "mb": -1.15105,
            "mod_mean": 21.2295,
            "obs_mean": 22.3806,
            "rmse": 7.97999
          },
          "value": 18.5885848999023
        },
        {
          "date_time": "2011-01-03 05:00:00",
          "quality_metric": {
            "cor": 0.438,
            "mb": -0.191461,
            "mod_mean": 19.4314,
            "obs_mean": 19.6229,
            "rmse": 8.51403
          },
          "value": 23.6635990142822
        }
      ],
      "lat_lon": "35,-80",
      "variable": "o3"
    }
  ]
}
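As a rough illustration of how a client might use this output, the sketch below flattens a /values response into one row per timestamp, leaving the metric columns off when an entry has no quality_metric stanza. The sample is an abridged copy of the response above, and `flatten_values` is a hypothetical helper, not part of the service.

```python
def flatten_values(response):
    """Flatten a /values response into a list of flat dict rows."""
    rows = []
    for series in response.get("values", []):
        for entry in series.get("cmaq_output", []):
            row = {
                "variable": series["variable"],
                "lat_lon": series["lat_lon"],
                "date_time": entry["date_time"],
                "value": entry["value"],
            }
            # quality_metric is only present when metrics exist for the variable.
            row.update(entry.get("quality_metric", {}))
            rows.append(row)
    return rows


# Abridged copy of the example response above (first and third entries).
sample = {
    "values": [
        {
            "cmaq_output": [
                {
                    "date_time": "2010-12-30 05:00:00",
                    "quality_metric": {
                        "cor": 0.426, "mb": 11.4464, "mod_mean": 35.0096,
                        "obs_mean": 23.5632, "rmse": 12.9588,
                    },
                    "value": 30.7911243438721,
                },
                {
                    "date_time": "2011-01-01 05:00:00",
                    "quality_metric": {
                        "cor": 0.577, "mb": -0.70803, "mod_mean": 22.361,
                        "obs_mean": 23.069, "rmse": 7.62782,
                    },
                    "value": 21.735221862793,
                },
            ],
            "lat_lon": "35,-80",
            "variable": "o3",
        }
    ]
}

for row in flatten_values(sample):
    # In the sample data the mean bias matches mod_mean - obs_mean.
    gap = row["mod_mean"] - row["obs_mean"]
    print(row["date_time"], row["value"], row["mb"], round(gap, 4))
```

The printed gap is a quick sanity check: in the example values, mb agrees with mod_mean minus obs_mean up to rounding.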
@arunacs - I don't fully know what this will look like yet, but wanted a placeholder issue.