OpenDataScotland / the_od_bods

Collating open data from across Scotland
MIT License

Add National Records Scotland as a source #184

Open KarenJewell opened 1 year ago

KarenJewell commented 1 year ago

Sadly, there doesn't appear to be any API. glhf.

https://www.nrscotland.gov.uk/statistics-and-data

k7ragav commented 1 year ago

Hey all. This looks like an interesting one. Would it be possible for me to work on this?

KarenJewell commented 1 year ago

Awesome, yeah, go for it! I'd also recommend self-assigning the ticket so no one else picks it up.

KarenJewell commented 1 year ago

Oops, apparently you're not able to self-assign, so I've just done that for you now - all yours! Shout if you need anything.

k7ragav commented 1 year ago

Thanks Karen! If possible, could you add more info to the issue, such as which datasets are needed? :)

k7ragav commented 1 year ago

Hey. I had a question regarding this. :) I am thinking through ways to do this. For a page such as https://www.nrscotland.gov.uk/statistics-and-data/statistics/stats-at-a-glance/registrar-generals-annual-review/2021, the Excel sheet linked there contains much more data. Would you like it all split into separate CSVs and placed in the folder the_od_bods/data/scraped_results, or kept as an Excel sheet?

KarenJewell commented 1 year ago

Sorry, I'm not sure I've understood the question correctly, but we're not scraping the datasets themselves, only the dataset metadata. Please see merged_output.csv for the aggregated end goal. The intermediary structure doesn't matter as long as it can be appended to merged_output.

Maybe see /web-scrapers for examples of the other web scrapers we use, and the wiki for an overview of the pipeline.

For your first question - we want all datasets!
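
As a rough illustration of "metadata only", here is a minimal Python sketch that writes one row per dataset to a CSV. The column names, the `DatasetMetadata` class and the `write_metadata` helper are all hypothetical and only there for illustration; the real header should be taken from merged_output.csv, and the existing scrapers in /web-scrapers may structure this differently.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DatasetMetadata:
    # Hypothetical columns: check merged_output.csv for the real header.
    title: str
    owner: str
    page_url: str
    asset_url: str
    file_type: str


def write_metadata(records, handle, write_header=True):
    """Write one CSV row per dataset; returns the number of rows written."""
    names = [f.name for f in fields(DatasetMetadata)]
    writer = csv.DictWriter(handle, fieldnames=names)
    if write_header:
        writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(asdict(record))
        count += 1
    return count


# Usage with an in-memory buffer; the URLs are placeholders, not real pages.
buffer = io.StringIO()
write_metadata(
    [
        DatasetMetadata(
            title="Example dataset",
            owner="National Records of Scotland",
            page_url="https://example.org/page",
            asset_url="https://example.org/files/example.csv",
            file_type="CSV",
        )
    ],
    buffer,
)
```

Note that only the link to the file is recorded; the file itself is never downloaded or parsed.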

k7ragav commented 1 year ago

Oh yes! So silly! I get it now. Thanks! :)

KarenJewell commented 1 year ago

Not silly at all! If anything, it highlights there's more we could do to make it clearer for new contributors. Thanks for taking this on, and shout if we can help with anything.

k7ragav commented 1 year ago

Hey all. Sorry, another question. 🙈 I am writing the first version of the script and have hit a bit of a roadblock.

There is a dropdown menu at https://www.nrscotland.gov.uk/statistics-and-data

Some of the links in the dropdown have a CSV/XLSX download directly. For the others, you need to click through a few more links to reach the CSV/XLSX download link. My question is: for the first version, is it fine if I write a script that only gets the data available directly from the dropdown?

Please let me know if the question is not clear :)

Thanks

JackGilmore commented 1 year ago

Hey @k7ragav. Sorry it's taken so long to get back to you!

Just getting the data that is directly available from the dropdown is fine for a first draft.
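
A first draft along those lines could be sketched as below, assuming the dropdown entries are ordinary `<a href>` links in the page HTML (not verified against the live site). It separates links that point straight at a CSV/XLSX file from links that need further clicks. `LinkCollector`, `split_links` and the sample HTML are hypothetical names and data for illustration; fetching the page is left out so the snippet runs offline.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

DOWNLOAD_EXTENSIONS = {".csv", ".xlsx"}


class LinkCollector(HTMLParser):
    """Collects absolute URLs from every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def split_links(html, base_url):
    """Return (direct, indirect): file downloads vs. pages needing more clicks."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    direct, indirect = [], []
    for link in collector.links:
        extension = posixpath.splitext(urlparse(link).path)[1].lower()
        if extension in DOWNLOAD_EXTENSIONS:
            direct.append(link)
        else:
            indirect.append(link)
    return direct, indirect


# Made-up sample markup, not copied from the NRS site.
sample_html = """
<ul>
  <li><a href="/files/example-table.csv">Example table</a></li>
  <li><a href="/files/example-workbook.XLSX">Example workbook</a></li>
  <li><a href="/statistics/example-topic">Example topic page</a></li>
</ul>
"""
direct, indirect = split_links(sample_html, "https://example.org/statistics-and-data")
```

The `indirect` list is kept rather than discarded so that a later version can follow those pages to reach the remaining download links.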