maelle / cpcb

:mask: Scraping India CPCB air quality data :mask:

Historical data availability #2

Open maelle opened 8 years ago

maelle commented 8 years ago

I cannot use the upper part of http://www.cpcb.gov.in/CAAQM/Auth/frmViewReportNew.aspx because it's not parameter-specific: a station could have data from a given date for one parameter but not for another, e.g. PM2.5.

The bottom part is organized by channel (channel = parameter), for instance http://www.cpcb.gov.in/CAAQM/Auth/frmViewReportChannelWise.aspx?state=Bihar&city=Patna&station=IGSC%20Planetarium%20Complex&channel=PM2.5. All the tables seem to have an 1899 line...
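A minimal sketch of how such a channel-wise URL could be assembled per station and parameter. The base URL and the four query keys (`state`, `city`, `station`, `channel`) are taken from the example link above; the function name `channel_url` is hypothetical, and this only builds the address, it does not request or parse the page.

```python
from urllib.parse import quote, urlencode

# Base address copied from the example link in this comment.
BASE_URL = "http://www.cpcb.gov.in/CAAQM/Auth/frmViewReportChannelWise.aspx"


def channel_url(state: str, city: str, station: str, channel: str) -> str:
    """Build the channel-wise report URL for one station and one parameter.

    Hypothetical helper: the query keys mirror the example link; whether the
    site accepts other combinations is not verified here.
    """
    params = {
        "state": state,
        "city": city,
        "station": station,
        "channel": channel,
    }
    # quote_via=quote encodes spaces as %20 (as in the example link)
    # instead of the default '+'.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


print(channel_url("Bihar", "Patna", "IGSC Planetarium Complex", "PM2.5"))
```

Looping this over a list of stations and parameters would give one URL per station/parameter pair to check.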

Another solution for finding availability would be to first load tables from http://www.cpcb.gov.in/CAAQM/frmReportdisplay.aspx for daily averages and find the first day where it's not empty.
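A rough sketch of that "first non-empty day" search. It assumes some function that returns the rows of the daily-average table for a given date; that function is passed in as `fetch_daily_table` and is purely hypothetical here (no request to the site is made), as are the other names.

```python
from datetime import date, timedelta
from typing import Callable, Optional, Sequence


def first_day_with_data(
    fetch_daily_table: Callable[[date], Sequence],
    start: date,
    end: date,
) -> Optional[date]:
    """Return the earliest day in [start, end] whose table is not empty.

    fetch_daily_table is a placeholder for whatever loads and parses the
    daily-average table for one day; it should return a sequence of rows
    (empty when there is no data). Returns None if every day is empty.
    """
    day = start
    while day <= end:
        if len(fetch_daily_table(day)) > 0:
            return day
        day += timedelta(days=1)
    return None


# Usage with a stand-in fetcher and made-up dates (not real CPCB data):
def fake_fetch(day: date) -> list:
    return [("PM2.5", 42.0)] if day >= date(2020, 1, 10) else []


print(first_day_with_data(fake_fetch, date(2020, 1, 1), date(2020, 1, 31)))
```

This scans day by day, which means one table load per day. If data is continuous once it starts, a bisection over the date range would need far fewer loads, but that is an assumption that may not hold for every station.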

maelle commented 8 years ago

For Hyderabad, for instance, there are two locations for PM2.5. The earliest data point is from April 2015. So there don't seem to be many years of data on this website @jflasher (but more than the current OpenAQ data, so it's probably still worth a look).

jflasher commented 8 years ago

More data is always better, but if it only goes back to April 2015, may not be worth a huge effort if it turns out to be very difficult.

maelle commented 8 years ago

@jflasher I don't think it'll be that difficult, I just need space for all the data once I start downloading it. :smile: I agree that it's not much data, but since there isn't much data about India it could already make a difference in some cases (e.g. having more than one year of data and thus being able to look at seasonality). I'll keep you posted.