Lucas-Czarnecki / COVID-19-CLEANED-JHUCSSE

Cleaned daily reports and time series data from the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).

UID isn't unique #14

wa-sharifk closed 3 years ago

wa-sharifk commented 3 years ago

I'm importing COVID-19_CLEAN/csse_covid_19_clean_data/CSSE_DailyReports.csv from the master branch and discovered that UID isn't unique. For example, the value 84054089 is repeated 179 times.
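
For anyone who wants to reproduce this kind of check, here is a minimal sketch using only the Python standard library. It counts how often each value occurs in a column and reports the ones that repeat. The helper name `duplicate_values` and the tiny inline sample (including its `Date` and `Confirmed` columns) are made up for illustration; they are not taken from the real file, and only the `UID` column name and the value 84054089 come from the report above.

```python
import csv
import io
from collections import Counter


def duplicate_values(rows, column="UID"):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(row[column] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up sample standing in for CSSE_DailyReports.csv (illustrative only).
sample = io.StringIO(
    "UID,Date,Confirmed\n"
    "84054089,2020-04-01,1\n"
    "84054089,2020-04-02,2\n"
    "84001001,2020-04-01,5\n"
)

dupes = duplicate_values(csv.DictReader(sample))
print(dupes)
```

To run it against the real data, the `io.StringIO` sample could be swapped for the opened CSV file, assuming the file has a header row containing `UID`.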

Lucas-Czarnecki commented 3 years ago

Thank you again @wa-sharifk. I am hesitant to modify values in UID as my approach has been to avoid manipulating JHU's original data as much as possible. What I can do is add my own unique ID column (i.e., row number) to the data and document your issue in the README.
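
A row-number column of the kind described here could look roughly like the sketch below. This is only an illustration and not necessarily how the repository implements it: the column name `row_id`, the helper `add_row_id`, and the sample rows are all hypothetical. The original `UID` values are passed through unchanged.

```python
import csv
import io


def add_row_id(rows, id_column="row_id"):
    """Yield each row with a 1-based row number; other columns stay untouched."""
    for number, row in enumerate(rows, start=1):
        yield {id_column: number, **row}


# Made-up sample rows sharing one UID (illustrative only).
sample = io.StringIO(
    "UID,Date\n"
    "84054089,2020-04-01\n"
    "84054089,2020-04-02\n"
)

with_ids = list(add_row_id(csv.DictReader(sample)))
for row in with_ids:
    print(row)
```

Because the number is derived from row order, it is only stable as long as the rows are written in the same order each time the file is regenerated.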

wa-sharifk commented 3 years ago

@Lucas-Czarnecki I totally get not changing it... perhaps a column rename and a UID column that is unique? These two are just suggestions.

Honestly, even documentation is fine; I was just surprised to run into duplicates while processing some visualizations.

You've done a wonderful job so far and I'm really impressed. Keep up the great work.

Lucas-Czarnecki commented 3 years ago

Resolved with dcabec5192fe22c5c99a3842e1c5553e0fbefd50.