openaq / openaq-fetch

A tool to collect data for OpenAQ platform.
MIT License

Scotland AQ Data #666

Open sruti opened 4 years ago

sruti commented 4 years ago

There's no API, but it seems fairly straightforward to get hourly-averaged data (or 15-minute data in some cases) across 96 sites from here: http://www.scottishairquality.scot/data/data-selector

An FTP server is mentioned here, but I can't find the link.

Coordinates can be found here

dalipkumar703 commented 4 years ago

Is the data being downloaded locally?

sruti commented 4 years ago

@dalipkumar703 what do you mean?

dalipkumar703 commented 4 years ago

@sruti I mean, do we have a function or API call in the code to get the data from here? Or can I add a new one?

dalipkumar703 commented 4 years ago

Or if you know of another open issue I could work on, that would be really helpful.

sruti commented 4 years ago

Hi @dalipkumar703! We aren't pulling in this data yet, so please go ahead and write an adapter if you'd like to. Examples can be found in the adapters folder: https://github.com/openaq/openaq-fetch/tree/develop/adapters

Also, feel free to join our Slack group: https://openaq-slackin.herokuapp.com/. We're having office hours there tomorrow (at 1pm EST) and we can discuss a good issue for you to start contributing on!

dalipkumar703 commented 4 years ago

Thanks, I will start contributing and join Slack too.


chriswait commented 4 years ago

TLDR: I don't think the FTP server exists.

From http://www.scottishairquality.scot/data/

There are three options for the output of your database download, according to the size of the data request.
- Small enquiries can be shown on your screen using minimum HTML,
- moderately sized enquiries can be e-mailed to you as an attachment in comma separated format, or
- the largest enquiries will be left on an ftp site, also in comma separated format, for you to collect.

I've played around with the "Data Search" functionality here. If you create a search for all parameters & sites for today's date, in the City of Edinburgh region, (whether you provide an email address or not) you get the following message:

The selection you requested will return 120 rows of data in 308 columns. This exceeds the limit of 256 columns. you should choose either less pollutants or fewer sites in your selection.

No mention of the FTP server being available for download...

So I reached out to Geoff Broughton (http://aqdm.co.uk/Geoff%20Broughton.html), who I believe may have been the original developer of the site, and he had this to say:

Thank you for your message. The simplest way to download data is to use the data selector web page. http://www.scottishairquality.scot/data/data-selector This works very well and can be used to download the entire database in large chunks. There is the “Atom Download Service” but this is difficult. There is no FTP access as far as I am aware. Best regards Geoff

I looked into the Atom Download Service briefly, and asked Geoff for more details, but it seems like a dead end.

I think the adapter might need to automate the process of interacting with the Data Search tool and ensure that each result set contains no more than 256 columns. It can then scrape the resulting HTML tables.
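To stay under the 256-column limit, the adapter would need to split the site list into several smaller queries. Here's a rough sketch of one way to do that, in Python for brevity (an actual adapter in this repo would be JavaScript). The `batch_sites` helper and the per-site column counts are hypothetical; how many columns each site really contributes would have to be worked out from the Data Search tool itself.

```python
def batch_sites(columns_per_site, max_columns=256):
    """Greedily group sites so each group's total column count stays within max_columns."""
    batches, current, used = [], [], 0
    for site, n_cols in columns_per_site.items():
        if n_cols > max_columns:
            # A single site that is too wide would need its pollutants split instead.
            raise ValueError(f"{site} alone needs {n_cols} columns (limit {max_columns})")
        if used + n_cols > max_columns:
            # Adding this site would exceed the limit, so start a new query batch.
            batches.append(current)
            current, used = [], 0
        current.append(site)
        used += n_cols
    if current:
        batches.append(current)
    return batches


# Hypothetical column counts that add up to 308, like the rejected query above.
example = {"site_a": 120, "site_b": 100, "site_c": 88}
print(batch_sites(example))
```

Each inner list would then become one Data Search request. Greedy grouping won't always give the fewest possible requests, but it is simple and keeps sites in a predictable order.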

chriswait commented 4 years ago

More ~good~ ~bad~ news from Geoff:

The Atom feed can be automated. I download the data from all stations every hour. The Atom feed was designed by EU IT experts. The overall data model is ridiculously complicated. An API would be far too useful for you and me. You still need to do lots of small queries. The website will not allow users to download large datasets every hour.

Sounds like the Atom Feed might be a bigger undertaking, and it's probably worth having a go at automating interaction with the data search functionality first. But if I find any details on the Atom feed I'll add here.
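For the data search route, the scraping step itself could be fairly small. Below is a minimal sketch using only Python's standard library `html.parser` (again Python just for illustration; the adapters in this repo are JavaScript). The sample markup and column names are made up, and the parser assumes reasonably well-formed rows, since I haven't checked the exact HTML the site returns.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup; the real page layout has not been checked.
sample = """
<table>
  <tr><th>Date</th><th>Time</th><th>NO2</th></tr>
  <tr><td>22/05/2020</td><td>01:00</td><td>12</td></tr>
</table>
"""
header, *records = parse_table(sample)
print([dict(zip(header, r)) for r in records])
```

Pairing each row with the header gives one record per timestamp, which would still need mapping to whatever measurement format the adapter is expected to return.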

chriswait commented 4 years ago

@sruti I just noticed that OpenAQ already appears to have data for Edinburgh: https://openaq.org/#/location/Edinburgh%20St%20Leonards?_k=04843s

This is implemented in the defra adapter which scrapes its data from here

Just so I can confirm the result I'm aiming for, what's the reason openaq wants another adapter for scottishairquality.scot? Is it just the frequency of data points? Adding more regions/locations?

sruti commented 4 years ago

DEFRA doesn't capture all of the stations outputting air quality data in Scotland, even in Edinburgh, so we're hoping to add those in. I think there might be ~10 stations that appear in both sources, so it would be good to double-check and maybe filter those out.
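One possible way to filter out the overlapping stations is to compare coordinates and drop any new station that sits very close to one already coming in via DEFRA. This is only a sketch: the `drop_overlaps` helper, the 100 m threshold, and the coordinates below are all assumptions for illustration, and matching on station names or IDs might turn out to be more reliable.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def drop_overlaps(new_stations, existing_stations, threshold_m=100):
    """Keep only new stations with no existing station within threshold_m."""
    return [
        s for s in new_stations
        if all(
            haversine_m(s["lat"], s["lon"], e["lat"], e["lon"]) > threshold_m
            for e in existing_stations
        )
    ]


# Made-up coordinates, not real station locations.
defra = [{"name": "A", "lat": 55.9450, "lon": -3.1820}]
scottish = [
    {"name": "A (same site)", "lat": 55.9451, "lon": -3.1821},
    {"name": "B", "lat": 55.9500, "lon": -3.2500},
]
print([s["name"] for s in drop_overlaps(scottish, defra)])
```

The threshold is a judgment call: too small and slightly different coordinates for the same station slip through, too large and genuinely separate nearby stations get dropped.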

majesticio commented 1 year ago

They have updated their website: https://www.scottishairquality.scot/data/data-selector. We need to review this source and compare it to the DEFRA adapter for duplicated stations.