govex / COVID-19

Data analysis and visualizations of daily COVID cases report
MIT License

Inconsistencies in vaccination data #105

Open Sylvain-Royer opened 3 years ago

Sylvain-Royer commented 3 years ago

Regarding the vaccination data:

https://github.com/govex/COVID-19/blob/master/data_tables/vaccine_data/raw_data/vaccine_data_us_state_timeline.csv

Is there any reason why some dates are skipped, and why values are sometimes left blank? I assume a blank means the previous value should be used.

Would it be possible to fill out the data more explicitly?

Thank you

sarabertrandelis commented 3 years ago

Hi @Sylvain-Royer, the missing dates come from the period when States were reporting through press releases. No State with a dashboard should have missing dates after its dashboard went public.

On the other hand, values are empty when the State does not report them. If there used to be a value and it is blank for recent dates, it is because the State stopped reporting that metric. Otherwise, when there is no update but the value still appears on the dashboard, we fill the row with the previous values.
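
For anyone who wants the skipped dates filled in on their side, here is a minimal sketch of one way to do it, not the maintainers' method. It assumes the rows for a single State are already parsed and sorted by date. The keys `doses_admin` and `people_total` are hypothetical stand-ins for the real column names. Skipped dates get a copy of the previous row, and blank values (`None`) stay blank, since a blank means the State did not report that metric.

```python
from datetime import date, timedelta


def fill_missing_dates(rows):
    """Return one row per calendar day for a single State.

    `rows` is a list of dicts sorted by their "date" key (datetime.date).
    A skipped date gets a copy of the most recent earlier row.
    Blank metrics (None) are copied as None, never replaced.
    """
    filled = []
    for row in rows:
        if filled:
            prev = filled[-1]
            day = prev["date"] + timedelta(days=1)
            # Insert one copied row for every day missing before this row.
            while day < row["date"]:
                filled.append({**prev, "date": day})
                day += timedelta(days=1)
        filled.append(dict(row))
    return filled


# Hypothetical sample: two dates are skipped between the reports.
sample = [
    {"date": date(2021, 1, 1), "doses_admin": 100, "people_total": None},
    {"date": date(2021, 1, 4), "doses_admin": 160, "people_total": None},
]
for r in fill_missing_dates(sample):
    print(r["date"].isoformat(), r["doses_admin"], r["people_total"])
```

Copying the previous row is only a reasonable guess for cumulative counts. Whether it is appropriate for a given column is a judgment call for the person using the data.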

mr-devs commented 3 years ago

I'd just like to add that there are also inconsistencies in how the rows are updated. In the same raw data file mentioned above, this is easy to see if you compare the doses_administered columns with the people total column. Here is a screenshot (notice the bottom right portion):

image

Another example of this is the data for California, for which there is no people total data at all.
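
A rough sketch of how such rows could be flagged automatically, under one reading of the inconsistency: doses administered goes up from one row to the next while the people total stays the same or is blank. The keys `doses_admin` and `people_total` are hypothetical names, not necessarily the columns in the CSV, and the rows are assumed to belong to one State and be sorted by date.

```python
def find_stale_people_total(rows):
    """Return the dates where doses increased but people_total did not change.

    `rows` is a list of dicts for one State, sorted by date.
    A people_total that is blank (None) in both rows also counts as
    unchanged, so a State with no people total at all gets flagged
    on every date where its doses increased.
    """
    flagged = []
    for prev, cur in zip(rows, rows[1:]):
        doses_up = (
            prev["doses_admin"] is not None
            and cur["doses_admin"] is not None
            and cur["doses_admin"] > prev["doses_admin"]
        )
        people_same = cur["people_total"] == prev["people_total"]
        if doses_up and people_same:
            flagged.append(cur["date"])
    return flagged


# Hypothetical sample values, not taken from the real file.
sample = [
    {"date": "2021-03-01", "doses_admin": 1000, "people_total": 600},
    {"date": "2021-03-02", "doses_admin": 1100, "people_total": 600},
    {"date": "2021-03-03", "doses_admin": 1200, "people_total": 700},
]
print(find_stale_people_total(sample))
```

A flagged date is only a hint. An unchanged people total can be legitimate if the State itself did not update that figure on its dashboard that day.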