openaq / openaq-data-format

A description of the data format provided by the OpenAQ platform.
MIT License

Another nuance to describing averaging + reporting interval: a request to describe how the field is averaged #9

Open RocketD0g opened 8 years ago

RocketD0g commented 8 years ago

This is lower priority than the issues raised in #4 (e.g. being clear on averaging period and reporting frequency), but Multitude brought up an interesting point:

They'd like a "field that tells how the time is averaged. Different organizations average differently. For example, is the one hour average centered on the timestamp, or forward or backward looking (e.g. forward looking would have 12:00 represent data averaged between 12:00 and 12:59). Not sure if you have this info, or would be willing to go through and contact your various sources to find out. For us, it's crucial when cross-comparing different data."
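To make the three conventions in that request concrete, here is a minimal Python sketch. The function name `averaging_window` and the `alignment` labels are hypothetical, chosen for illustration; they are not fields in the OpenAQ data format. Windows are treated as half-open, so a forward-looking 12:00 stamp covers 12:00 up to but not including 13:00, matching the "12:00 and 12:59" example.

```python
from datetime import datetime, timedelta


def averaging_window(timestamp, period, alignment):
    """Return the (start, end) window a timestamp stands for.

    `alignment` is a hypothetical label, not an OpenAQ field:
    "forward", "backward", or "centered".
    """
    if alignment == "forward":
        return timestamp, timestamp + period
    if alignment == "backward":
        return timestamp - period, timestamp
    if alignment == "centered":
        return timestamp - period / 2, timestamp + period / 2
    raise ValueError(f"unknown alignment: {alignment!r}")


noon = datetime(2016, 1, 1, 12, 0)
hour = timedelta(hours=1)
for alignment in ("forward", "backward", "centered"):
    start, end = averaging_window(noon, hour, alignment)
    print(f"{alignment:>8}: {start:%H:%M} to {end:%H:%M}")
```

The same 12:00 stamp maps to three different one-hour windows, which is why the convention matters when comparing sources.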

Logistically, I don't see us doing this for all sources any time soon. We may be able to determine it for EPA and EEA, but I think it will be very hit or miss elsewhere.

olafveerman commented 8 years ago

I believe we are backward-looking across the adapters. See, for example, the Australian adapter.

On the wiki of the openaq-api we have some guidelines: https://github.com/openaq/openaq-api/wiki/4.-Writing-an-adapter#dealing-with-dates-and-date-ranges

Should we revise these and make them part of CONTRIBUTING.md of fetch?

RocketD0g commented 8 years ago

To be honest, I had forgotten we'd made that protocol, @olafveerman, heh. Though I think this is slightly different: it's for when a source just puts down a single timestamp for a measurement, and we're not sure whether they calculate the average looking forward, looking backward, or centered on that timestamp.

For example, we see the timestamp 3:30pm, and we're not sure whether it means the average over 3-4pm, 3:30-4:30pm, or 2:30-3:30pm, because the source doesn't list the start/stop averaging times. As you point out (and I'd forgotten), if we know the data was taken between 3 and 4, we have a protocol, but we don't always know that. I'm going to amend the response to the Multitude folks to mention that we have this protocol in place, though. Thanks, @olafveerman.
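As a rough illustration of why this ambiguity matters for cross-comparison, the Python sketch below measures how much the three candidate windows for a 3:30pm timestamp actually overlap. The helper `overlap_fraction`, the date, and the one-hour period are assumptions made for the example, not part of any OpenAQ code.

```python
from datetime import datetime, timedelta
from itertools import combinations


def overlap_fraction(window_a, window_b):
    """Fraction of window_a that is also covered by window_b."""
    start = max(window_a[0], window_b[0])
    end = min(window_a[1], window_b[1])
    overlap = max(end - start, timedelta(0))  # no overlap gives zero, not negative
    return overlap / (window_a[1] - window_a[0])


stamp = datetime(2016, 1, 1, 15, 30)  # the 3:30pm timestamp; the date is arbitrary
hour = timedelta(hours=1)             # assumed averaging period

# The three readings of a bare 3:30pm timestamp from the comment above.
candidates = {
    "3-4pm": (stamp - hour / 2, stamp + hour / 2),
    "3:30-4:30pm": (stamp, stamp + hour),
    "2:30-3:30pm": (stamp - hour, stamp),
}

for (name_a, win_a), (name_b, win_b) in combinations(candidates.items(), 2):
    share = overlap_fraction(win_a, win_b)
    print(f"{name_a} vs {name_b}: {share:.0%} overlap")
```

Two sources that stamp the same hour with opposite conventions may share no averaging time at all, so knowing the start/stop times (or the convention) is what lets the existing protocol apply.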