alan-turing-institute / TuringDataStories

TuringDataStories: An open community creating “Data Stories”: A mix of open data, code, narrative 💬, visuals 📊📈 and knowledge 🧠 to help understand the world around us.

[Turing Data Story] Building a simple web scraper #132

Open samvanstroud opened 3 years ago

samvanstroud commented 3 years ago

Story description

Please provide a high-level description of the Turing Data Story

We could write a simple web scraper to show how datasets can be generated from unstructured information available on the web. We could then openly publish the dataset and walk the reader through the process.
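To make the idea concrete, here is a minimal sketch of what "generating a dataset from unstructured information" could look like, using only the Python standard library. Everything specific in it is an assumption for illustration: the sample HTML, the `headline` class name, and the helper names `HeadlineParser`, `scrape_headlines` and `to_csv` are hypothetical, and a real story would fetch pages from a source that permits scraping rather than parse an inline string.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page standing in for HTML fetched from a permitted source.
SAMPLE_HTML = """
<html><body>
  <h3 class="headline"><a href="/news/1">First headline</a></h3>
  <h3 class="headline"><a href="/news/2">Second headline</a></h3>
  <p>Unrelated text</p>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Collect title/url pairs from links inside <h3 class="headline">."""

    def __init__(self):
        super().__init__()
        self.in_headline = False   # True while inside a headline <h3>
        self.current_href = None   # href of the <a> currently being read
        self.rows = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3" and attrs.get("class") == "headline":
            self.in_headline = True
        elif tag == "a" and self.in_headline:
            self.current_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_headline = False
        elif tag == "a":
            self.current_href = None

    def handle_data(self, data):
        # Only keep text that sits inside a link within a headline.
        if self.in_headline and self.current_href and data.strip():
            self.rows.append({"title": data.strip(), "url": self.current_href})


def scrape_headlines(html):
    """Turn raw HTML into a list of structured rows."""
    parser = HeadlineParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Serialise the rows as CSV text, ready to publish as a dataset."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape_headlines(SAMPLE_HTML)
print(to_csv(rows))
```

The story itself might well prefer a dedicated parsing library; the standard-library parser is used here only so the sketch runs without extra dependencies.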

Which datasets will you be using in this Turing Data Story?

One option would be to write this story alongside #124 and use it to scrape the data from the BBC website that #124 would use. Otherwise, any other non-problematic source of information on the web would work.

Additional context

We could discuss the ethical and legal implications of scraping data and talk more broadly about data harvesting in our society. @DavidBeavan had some nice thoughts along these lines.
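One practical starting point for that discussion could be a site's `robots.txt`, which states what the site asks automated clients not to fetch. The sketch below uses the standard library's `urllib.robotparser` to check it. The rules in `ROBOTS_TXT`, the `example.org` URLs and the `data-story-bot` user agent are made up for illustration, and passing this check is only a courtesy: it does not settle whether scraping a site is legal or ethical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would read the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical user agent and URLs, for illustration only.
allowed = parser.can_fetch("data-story-bot", "https://example.org/news/article")
blocked = parser.can_fetch("data-story-bot", "https://example.org/private/report")
delay = parser.crawl_delay("data-story-bot")

print(allowed, blocked, delay)
```

A scraper written for the story could run a check like this before each request and wait for the requested crawl delay between requests.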

Ethical guideline

Ideally a Turing Data Story has these properties and follows the 5 safes framework.

Current status

Updates