PacktPublishing / Distributed-Data-Systems-with-Azure-Databricks

Distributed Data Systems with Azure Databricks, published by Packt
MIT License

Chapter 3: Where is the correct data to download? #1

Open tanthiamhuat opened 2 years ago

tanthiamhuat commented 2 years ago

Hi Alan, thanks for your book and the code. I am trying to follow along with Chapter 3. Could you tell me which CSV file the chapter uses and where it was downloaded from? I followed the link https://data.world/government/vep-turnout from your book and downloaded 2020 November General Election - Turnout Rates.csv, which contains only 53 rows of data, and its Official/Unofficial column contains only the value "Official". Figure 3.17 and Figure 3.19 in the book show "Unofficial" rows, so I am wondering whether the downloadable data has changed.

If the data on the website has changed, could you add your original copy of the CSV file to this GitHub repository? Thanks.

DataSpacon commented 1 year ago

The dataset was changed. I manually updated some rows to include "Unofficial".
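
For anyone hitting the same mismatch, here is a minimal sketch of the check described in the question: count the data rows and tally the distinct values in the Official/Unofficial column. The helper name `summarize_status` is hypothetical, the exact header text "Official/Unofficial" is an assumption based on the wording above, and the sample rows are made up for illustration, not real turnout data.

```python
import csv
import io
from collections import Counter


def summarize_status(csv_file, column="Official/Unofficial"):
    """Return (row_count, Counter of values found in `column`).

    `csv_file` is any open text file or file-like object.
    The default column name is an assumption; adjust it to match
    the header in the file you actually downloaded.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    rows = 0
    for row in reader:
        rows += 1
        # Missing or empty cells are tallied under the empty string.
        counts[(row.get(column) or "").strip()] += 1
    return rows, counts


# Tiny made-up sample standing in for the downloaded CSV.
sample = io.StringIO(
    "State,Official/Unofficial\n"
    "Alpha,Official\n"
    "Beta,Unofficial\n"
    "Gamma,Official\n"
)
rows, counts = summarize_status(sample)
print(rows, dict(counts))
```

To run this against a real download, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample. If the tally shows only "Official", the file is the newer version of the dataset and will not reproduce the "Unofficial" rows shown in the book's figures.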