Open KarenJewell opened 1 year ago
Hey all. This looks like an interesting one. Would it be possible for me to work on this?
Awesome, yeah, go for it! I'd also recommend self-assigning the ticket so no one else picks it up.
Oops apparently you're not able to assign so I've just done that now - all yours! Shout if you need anything.
Thanks Karen! If possible, could you add more info in the issue? Like which datasets are needed :)
Hey. I had a question regarding this. :)
I am in the process of thinking of ways to do this.
For a link such as this https://www.nrscotland.gov.uk/statistics-and-data/statistics/stats-at-a-glance/registrar-generals-annual-review/2021
There is much more data in the Excel sheet given there. Would you like it all split into separate CSVs and placed in the folder the_od_bods/data/scraped_results? Or should I keep it as an Excel sheet?
Sorry, I'm not sure I understand the question correctly, but we're not scraping the datasets themselves, only the dataset metadata. Please see merged_output.csv for the aggregated end goal. The intermediary structure doesn't matter as long as it can be appended to merged_output.
Maybe see /web-scrapers for examples of other web scrapers we use, and the wiki for an overview of the pipeline.
For your first question - we want all datasets!
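To make the "metadata only" point concrete, here is a rough sketch of what a scraper's intermediate output could look like: one row per dataset describing where it lives, not the contents of the file. The column names below are hypothetical placeholders, not the real merged_output.csv schema, so copy the actual headers from that file before using anything like this.

```python
import csv
import io

# Hypothetical column names for illustration only; the real ones should be
# taken from merged_output.csv.
FIELDS = ["title", "page_url", "asset_url", "file_type"]


def write_metadata(rows, out):
    """Write one metadata row per dataset to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def metadata_to_csv_text(rows):
    """Return the metadata rows as CSV text (handy for a quick check)."""
    buf = io.StringIO(newline="")
    write_metadata(rows, buf)
    return buf.getvalue()
```

To write to disk instead, open the target file with `newline=""` and pass the handle to `write_metadata`.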
Oh yes! So silly! I get it now. Thanks! :)
Not silly at all! If anything it highlights there's more we could do to make it clearer for new contributors. Thanks for taking this on and shout if we can help with anything.
Hey all. Sorry another question. 🙈 I am writing the first version of the script and kind of hit a roadblock.
There is this drop down menu here https://www.nrscotland.gov.uk/statistics-and-data
Some of the links in the drop down have a CSV/XLSX download directly. For the others, you need to click through a few more links to reach the CSV/XLSX download link. My question is: is it fine if the first version of the script only gets the data that is available directly from the drop down?
Please let me know if the question is not clear :)
Thanks
Hey @k7ragav. Sorry it's taken so long to get back to you!
Just getting the data that is directly available from the dropdown is fine for a first draft.
Sadly, there doesn't appear to be any API. glhf.
https://www.nrscotland.gov.uk/statistics-and-data
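For the "direct downloads only" first draft discussed above, one possible shape is sketched below using only the Python standard library. It assumes the page HTML has already been fetched and that the dropdown entries are ordinary `<a href>` links in that HTML; the actual markup of the NRS page has not been checked here, and if the menu is built by JavaScript this approach would need adjusting. `DirectLinkParser` and `split_links` are made-up names for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# File extensions treated as "direct" downloads for the first draft.
DIRECT_EXTENSIONS = (".csv", ".xlsx")


class DirectLinkParser(HTMLParser):
    """Collect <a href> targets, split into direct downloads and the rest."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.direct_links = []  # links ending in .csv / .xlsx
        self.other_links = []   # links that need further clicks (later version)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they were found on.
        url = urljoin(self.base_url, href)
        # Compare on the path only, so query strings do not hide the extension.
        path = urlparse(url).path.lower()
        if path.endswith(DIRECT_EXTENSIONS):
            self.direct_links.append(url)
        else:
            self.other_links.append(url)


def split_links(html, base_url):
    """Return (direct_links, other_links) found in an HTML string."""
    parser = DirectLinkParser(base_url)
    parser.feed(html)
    return parser.direct_links, parser.other_links
```

Keeping `other_links` around means a later version can follow those pages to find the downloads that are a few clicks deeper, without changing the first-draft logic.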