nelsonic / practice

Practice makes ...

Mini Project idea: Scraping, Parsing and Data Visualisation of US Presidents #45

Open nelsonic opened 6 years ago

nelsonic commented 6 years ago

Wikipedia has a great table of data/facts on the US Presidents: https://en.wikipedia.org/wiki/List_of_Presidents_of_the_United_States

A table is a good way to display tabular data, but on its own it does not surface any insight. The idea I'm proposing is to use the data as an opportunity to build a Mini Project / Tutorial:

  1. how to "scrape" data from a public web page.
  2. parse and filter that data using Test Driven Development.
    2.b publish the data as CSV so others can use it without having to do the scraping/parsing themselves.
  3. analyse the data to spot any trends, clusters or insights e.g:
    • average age when started presidency
    • average total life span
    • number of years lived after presidency (does being in power shorten lifespan?)
    • number of Democrats vs. Republicans
  4. Visualise the data using charts to make an interactive infographic.
  5. Publish as both a tutorial and PWA that tens of millions of American kids can learn with/from.
    • "Win the Internet" as tens of thousands of "middle-schoolers" link/share the content. 🔗
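
As a rough illustration of steps 1 to 2.b above, here is a minimal sketch that turns an HTML table into CSV using only the Python standard library. The `TableParser` class, the `table_to_csv` helper and the `SAMPLE_HTML` snippet are all hypothetical names made up for this example; the real Wikipedia table has merged cells, footnotes and images that would need extra handling, and fetching the live page is left out so the sketch runs offline.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built (None when outside <tr>)
        self._cell = None   # text fragments of the current cell (None when outside a cell)

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. links) still arrives here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace so cells are clean single-line strings.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical, heavily simplified stand-in for the scraped page.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Party</th></tr>
  <tr><td><a href="#">Woodrow Wilson</a></td><td>Democratic</td></tr>
  <tr><td>Franklin D. Roosevelt</td><td>Democratic</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_HTML))
```

Keeping the parser as a small pure function of its HTML input is what makes the TDD part practical: each quirk found in the real table can become a tiny HTML fixture plus an assertion on the rows that come out.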

From reading the table I now know that during both World Wars there was a Democrat in office: Woodrow Wilson and Franklin D. Roosevelt respectively. Can we infer that having a Democrat as President is the "best way" to win a War...? 🤔 (probably not ...) I think it would be instructive to visualise the data and see what questions/hypotheses we can derive from it.
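
To check hunches like this against numbers rather than by eyeballing the table, a first analysis pass could look something like the sketch below. It assumes the parsed data ends up as a list of dicts with `name`, `party`, `born` and `took_office` fields (a hypothetical schema, not something the Wikipedia table provides directly), and it uses four hand-typed rows purely as a stand-in for the full dataset.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hand-typed sample rows (assumed schema); the real input would be the published CSV.
SAMPLE = [
    {"name": "Abraham Lincoln", "party": "Republican",
     "born": date(1809, 2, 12), "took_office": date(1861, 3, 4)},
    {"name": "Theodore Roosevelt", "party": "Republican",
     "born": date(1858, 10, 27), "took_office": date(1901, 9, 14)},
    {"name": "Woodrow Wilson", "party": "Democratic",
     "born": date(1856, 12, 28), "took_office": date(1913, 3, 4)},
    {"name": "Franklin D. Roosevelt", "party": "Democratic",
     "born": date(1882, 1, 30), "took_office": date(1933, 3, 4)},
]


def age_in_years(born, on):
    """Whole years elapsed between `born` and the date `on`."""
    years = on.year - born.year
    # Subtract one if the birthday has not yet happened in the year of `on`.
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years


def summarise(rows):
    """Return (average age when taking office, number of presidents per party)."""
    ages = [age_in_years(r["born"], r["took_office"]) for r in rows]
    parties = Counter(r["party"] for r in rows)
    return mean(ages), parties


average_age, party_counts = summarise(SAMPLE)
print(f"Average age when taking office: {average_age}")
print(f"Presidents per party: {dict(party_counts)}")
```

A four-row sample obviously proves nothing; the point is that once the full CSV exists, each bullet in the analysis list becomes a few lines like these.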

I'm thinking we could do this project in Python, given that it's "Data Science", using one of the many brilliant libraries available: https://blog.modeanalytics.com/python-data-visualization-libraries/

Thoughts? (please comment below)