Closed FabioMD1972 closed 7 years ago
Thanks, @FabioMD1972! 👍
the URL for the hourly data for the whole Lazio region (not only Rome) can be built as follows:
http://www.arpalazio.net/main/aria/sci/annoincorso/chimici/<province>/DatiOrari/<province>_<pollutant>_<year>.txt
with provinces: CC,FR,LT,RI,RM,VT
and pollutants: BENZENE,CO,NO2,NOX,NO,O3,PM10,PM2.5,SO2
the TXT file is a table with columns: julian day, hour and one column for each station
stations' metadata: https://github.com/jobonaf/calicantus/blob/master/data/sites-info/metadata.ARPA-Lazio.csv
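The URL pattern above can be sketched in Python roughly as follows. The function and constant names are made up for illustration; only the URL layout and the province and pollutant codes come from the comment above.

```python
# Province and pollutant codes as listed in the comment above.
PROVINCES = ("CC", "FR", "LT", "RI", "RM", "VT")
POLLUTANTS = ("BENZENE", "CO", "NO2", "NOX", "NO", "O3", "PM10", "PM2.5", "SO2")
BASE_URL = "http://www.arpalazio.net/main/aria/sci/annoincorso/chimici"


def hourly_url(province, pollutant, year):
    """Return the hourly-data URL for one province, pollutant and year.

    Hypothetical helper: it only fills in the template, it does not
    check that the file actually exists on the server.
    """
    if province not in PROVINCES:
        raise ValueError(f"unknown province: {province!r}")
    if pollutant not in POLLUTANTS:
        raise ValueError(f"unknown pollutant: {pollutant!r}")
    # The province code appears twice: in the path and in the file name.
    return f"{BASE_URL}/{province}/DatiOrari/{province}_{pollutant}_{year}.txt"
```

For example, `hourly_url("RM", "SO2", 2017)` yields a URL ending in `/chimici/RM/DatiOrari/RM_SO2_2017.txt`.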
I'm having issues accessing the hourly data via a link like:
http://www.arpalazio.net/main/aria/sci/annoincorso/chimici/%3Cprovince%3E/DatiOrari/RM_SO2_2017.txt
Thanks for any thoughts on where I am going wrong, @jobonaf!
You should substitute <province> with RM not only in the file name, but also in the path, like this: http://www.arpalazio.net/main/aria/sci/annoincorso/chimici/RM/DatiOrari/RM_SO2_2017.txt
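A small guard against this kind of slip could look like the sketch below. It is an illustration only: the helper name and the placeholder regex are assumptions, and it uses nothing beyond the Python standard library.

```python
import re
from urllib.parse import unquote

# Matches a template placeholder such as <province> that was never filled in.
PLACEHOLDER = re.compile(r"<[^<>/]+>")


def unfilled_placeholders(url):
    """Return placeholders still present in the URL.

    The URL is percent-decoded first, so %3Cprovince%3E is reported
    as <province>.
    """
    return PLACEHOLDER.findall(unquote(url))
```

Calling it on the failing link reports `<province>`, while the corrected link gives an empty list.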
Ah yup, oops - Thanks! Now labelling ready for dev.
Thank you, @FabioMD1972 and @jobonaf!
Also, the PDF linked from the page has detailed information on the file format. Link to Google Translated version.
stations' metadata: https://github.com/jobonaf/calicantus/blob/master/data/sites-info/metadata.ARPA-Lazio.csv
@jobonaf This is very helpful. Can I ask how you compiled this data?
@RocketD0g Some concerns about the data:
Some stations report -999 values. Example: Viterbo SO2. Can I just skip them?
Some values are identical for every hour of a day, as in this sample (jd is day number, h is hour, and other columns are individual stations):
jd h 32 59 90
1 1 35.0 22.0 59.0
1 2 35.0 22.0 59.0
1 3 35.0 22.0 59.0
1 4 35.0 22.0 59.0
1 5 35.0 22.0 59.0
...
1 21 35.0 22.0 59.0
1 22 35.0 22.0 59.0
1 23 35.0 22.0 59.0
1 24 35.0 22.0 59.0
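One way to check a file for this pattern is sketched below. It assumes the layout described in this thread (a header row, then whitespace-separated jd, h and one column per station) and treats -999 as the missing-data marker. The function names are hypothetical, and the elided `...` line from the sample must not be part of the input.

```python
from collections import defaultdict

MISSING = -999.0  # missing-data marker seen in the files


def parse_table(text):
    """Parse the whitespace-separated table into (station_ids, rows).

    Each row is a tuple (julian_day, hour, {station_id: value}).
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, body = lines[0], lines[1:]
    stations = header[2:]  # the first two columns are jd and h
    rows = []
    for fields in body:
        day, hour = int(fields[0]), int(fields[1])
        values = {s: float(v) for s, v in zip(stations, fields[2:])}
        rows.append((day, hour, values))
    return stations, rows


def looks_daily(rows, station):
    """True if the station's valid values never change within any day."""
    per_day = defaultdict(set)
    for day, _hour, values in rows:
        if values[station] != MISSING:
            per_day[day].add(values[station])
    # No valid data at all gives False rather than a vacuous True.
    return bool(per_day) and all(len(v) == 1 for v in per_day.values())
```

A station for which `looks_daily` returns True over many days is a candidate for being a daily value repeated in every hourly slot, though the check cannot prove it.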
Really good observations, @dolugen.
If a station for the easily discernible past has not reported any data (e.g. -999) ever for a particular pollutant, my thought is that we should skip adding in that station. If it is a mix of some reported data and lots of -999s for a particular station+pollutant, then I think it is still a fine station to add in. Do you think that is a sensible plan, @dolugen?
@jobonaf another question - Is there a place online where we can cite that these data are said to be hourly - or perhaps daily in the case of @dolugen's observation? It definitely seems in his example that those are daily values being reported versus hourly, but we would want to state the time-averaging interval that the source agency says the data are in. (And if the time interval looks to be mismatched with what the source agency says, we'd want to interact with the source agency for clarification on the apparent discrepancy.)
If a station for the easily discernible past has not reported any data (e.g. -999) ever for a particular pollutant, my thought is that we should skip adding in that station. If it is a mix of some reported data and lots of -999s for a particular station+pollutant, then I think it is still a fine station to add in.
I think that's reasonable, I'll go with that.
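The rule agreed on here could be expressed roughly as in this sketch. It is not the adapter's actual code; the function names are invented, and -999 is taken as the only missing-data marker.

```python
MISSING = -999.0  # missing-data marker seen in the files


def should_skip_station(values):
    """True when a station+pollutant series contains no real measurement.

    An empty series also counts as "nothing reported" and is skipped.
    """
    return all(v == MISSING for v in values)


def valid_measurements(values):
    """Drop the -999 entries from a series that is otherwise kept."""
    return [v for v in values if v != MISSING]
```

A series that mixes real values with -999 entries is kept, and only its valid measurements are passed on.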
@dolugen @RocketD0g I collected the metadata for a project involving some Italian regional environmental agencies (https://sdati.arpae.it/calicantus-intro/). Many (but not all!) stations in Italy measure PM10 and PM2.5 on a daily basis. If you need more info about data collected by ArpaLazio, you could contact the regional center for air quality (e-mail here: http://www.arpalazio.net/main/aria/sci/basedati/bollettini/2017/BA282017.pdf).
Thanks a bunch, @jobonaf - just shot them an email.
@jobonaf I've found that metadata.ARPA-Lazio.csv is missing data for several stations. Is it possible to update it? Here are the station IDs that are missing:
86
87
101
102
103
104
105
106
107
108
110
111
IDs 86, 87 are from Rome, and IDs > 100 are from Civitavecchia, as described in the PDF.
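A comparison like this can be automated with a set difference, as in the sketch below. How the two ID lists are obtained (data-file headers on one side, the metadata CSV on the other) is left out, since the CSV's column names are not given in this thread; the function name is hypothetical.

```python
def missing_station_ids(data_ids, metadata_ids):
    """Return IDs used in the data files but absent from the metadata.

    IDs are expected as numeric strings and are returned in numeric order.
    """
    return sorted(set(data_ids) - set(metadata_ids), key=int)
```

With data IDs `["32", "86", "101"]` and metadata IDs `["32"]`, the result is `["86", "101"]`.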
Thank you @dolugen, I will update the file. Consider also this page: http://www.arpalazio.net/main/aria/doc/RQA/locRQA.php
metadata.ARPA-Lazio.csv updated
Consider also this page: http://www.arpalazio.net/main/aria/doc/RQA/locRQA.php
metadata.ARPA-Lazio.csv updated
Thank you!
@RocketD0g So, I'm basically done with the adapter. Do I wait for the email reply about the hourly values of PMs? If they confirm it's daily averages, I'll change the adapter to save just the first daily value for PMs.
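Keeping one value per day could be sketched as follows. Note one assumption beyond what is proposed above: the sketch keeps the first valid value of each day, skipping -999 entries, rather than strictly the first row. The function name and the (day, hour, value) row shape are made up for illustration.

```python
MISSING = -999.0  # missing-data marker seen in the files


def first_value_per_day(rows):
    """Collapse (day, hour, value) rows to one value per day.

    Rows are visited in (day, hour) order, and the first value that is
    not the missing-data marker is kept for each day. Days with no valid
    value are left out of the result.
    """
    daily = {}
    for day, _hour, value in sorted(rows):
        if value != MISSING and day not in daily:
            daily[day] = value
    return daily
```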
@dolugen you could also save directly the daily averages from here:
http://www.arpalazio.net/main/aria/sci/annoincorso/chimici/RM/MedieGiornaliere/RM_PM10_2017_gg.txt
More generally http://www.arpalazio.net/main/aria/sci/annoincorso/chimici/<province>/MedieGiornaliere/<province>_<pollutant>_2017_gg.txt
with provinces: CC,FR,LT,RI,RM,VT
and pollutants: PM10,PM2.5
@dolugen - Our track record for receiving back messages to questions like that tends to be somewhat poor, so perhaps go with the daily values @jobonaf points to for the PM data?
@jobonaf ARPALAZIO data is now on OpenAQ! You've been a great help, thank you! And @FabioMD1972 too, thanks for suggesting the source!
http://www.arpalazio.net/main/aria/sci/annoincorso/chimici.php
This site is only in Italian; I can help translate it.