openaq / battuta

Reverse geocoding for air quality stations
MIT License

missing stations #4

Open maxgrossman opened 7 years ago

maxgrossman commented 7 years ago

When running the direct-eea adapter, I noticed that not all results had a station, as seen below:

2017-08-25T18:23:05.998Z - info: ///////
2017-08-25T18:23:05.998Z - info: [Dry Run] New measurements inserted for EEA Andora: 223
2017-08-25T18:23:05.998Z - info: 74 occurrences of instance.city does not meet minimum length of 1
2017-08-25T18:23:05.998Z - info: ///////
2017-08-25T18:23:05.998Z - info: Dryrun completed, have a good day!

I looked through that code, and the problem doesn't appear to come from the adapter itself but from the data generated here. After comparing the number of unique stations in the metadata file with the number of stations in our station-locations.json, I noticed the two counts are off by about 300 stations. I think this is the source of the problem.
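A minimal sketch of that comparison, assuming both files can be reduced to a list of station ids. The function name and the ids below are hypothetical placeholders, not values from the real metadata file or station-locations.json:

```python
def missing_station_ids(metadata_ids, geocoded_ids):
    """Return ids that appear in the metadata but not in the geocoded file.

    Both arguments are iterables of station ids; duplicates are collapsed
    so that only unique stations are compared.
    """
    return sorted(set(metadata_ids) - set(geocoded_ids))


# Hypothetical ids for illustration only.
metadata_ids = ["station-a", "station-b", "station-c", "station-a"]
geocoded_ids = ["station-a"]

missing = missing_station_ids(metadata_ids, geocoded_ids)
print(f"unique in metadata: {len(set(metadata_ids))}")
print(f"unique geocoded:    {len(set(geocoded_ids))}")
print(f"missing:            {missing}")
```

Listing the missing ids, rather than only comparing counts, makes it easier to tell whether the gap comes from duplicate ids, duplicate coordinates, or stations that are absent altogether.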

maxgrossman commented 7 years ago

An update here. The missing stations were partly a result of the unique station id / unique coordinate inconsistency mentioned in #5 and #6, but they may also have to do with relying on the metadata file.

When running each EEA country adapter for discussions here, I came across a station in Sweden that existed in the source file but not in the most recent metadata file. Perhaps we need to have the EEA adapter talk to battuta?

Could we flag these cases in the adapter itself and write them as a list to our S3 bucket? We could then add a cron job that does the same sort of matching/updating as the current one, except that it only reverse geocodes the stations in that file, combines them with the existing eea-stations.json, and pushes an updated JSON back to S3.
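A rough sketch of the merge step that such a cron job might perform, with the S3 reads and writes left out. Everything here is an assumption made for illustration: the function name, the shape of the flagged list (dicts with `id`, `lat`, `lon`), the idea that eea-stations.json is keyed by station id, and the `reverse_geocode` callable standing in for whatever geocoder is actually used:

```python
from typing import Callable, Dict, List


def merge_flagged_stations(
    existing: Dict[str, dict],
    flagged: List[dict],
    reverse_geocode: Callable[[float, float], dict],
) -> Dict[str, dict]:
    """Return a copy of `existing` extended with reverse geocoded flagged stations.

    Stations already present in `existing` are left untouched, so the
    geocoder is only called for stations that are actually new.
    """
    updated = dict(existing)
    for station in flagged:
        station_id = station["id"]
        if station_id in updated:
            continue  # already known, nothing to geocode
        place = reverse_geocode(station["lat"], station["lon"])
        updated[station_id] = {
            "lat": station["lat"],
            "lon": station["lon"],
            **place,
        }
    return updated


# Hypothetical stand-in for a real geocoder; returns fixed values.
def fake_geocoder(lat: float, lon: float) -> dict:
    return {"city": "Example City", "country": "XX"}


existing_stations = {"station-a": {"lat": 1.0, "lon": 2.0, "city": "A", "country": "XX"}}
flagged_stations = [
    {"id": "station-a", "lat": 1.0, "lon": 2.0},  # already known, skipped
    {"id": "station-b", "lat": 3.0, "lon": 4.0},  # new, gets geocoded
]

merged = merge_flagged_stations(existing_stations, flagged_stations, fake_geocoder)
print(sorted(merged))
```

Passing the geocoder in as an argument keeps the merge logic testable without network access; the job would download the flagged list and eea-stations.json before this step and upload the result afterwards.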

@olafveerman @jflasher

olafveerman commented 7 years ago

@maxgrossman How many stations are missing in the metadata file?

If there are only a couple, we should throw a nice error and rely on EEA to add them to the metadata file.

maxgrossman commented 7 years ago

There are 14 instances.