owid / covid-19-data

Data on COVID-19 (coronavirus) cases, deaths, hospitalizations, tests • All countries • Updated daily by Our World in Data
https://ourworldindata.org/coronavirus

Incomplete vaccination data in France #407

Closed leyan closed 3 years ago

leyan commented 3 years ago

It seems that for France the vaccination data has been missing (daily vaccinations) or wrong (7-day rolling average) since 2021-01-18.

The dataset released by the government should be enough to get all the needed information: https://www.data.gouv.fr/fr/datasets/donnees-relatives-aux-personnes-vaccinees-contre-la-covid-19-1

edomt commented 3 years ago

Hi @leyan

The data we currently show for France is, as far as I understand the files made available, the best time series we can recreate with what we're given.

The main national file (vacsi-fra-2021-01-31-20h15.csv currently) only includes data on first doses. From December 27 to January 18, we can safely assume that all doses were first doses, so people_vaccinated == total_vaccinations.

But from January 19 onwards, with the start of second doses, we can no longer make that assumption. total_vaccinations should instead be equal to people_vaccinated + some amount of second doses, but we don't know this amount because none of the files on https://www.data.gouv.fr/fr/datasets/donnees-relatives-aux-personnes-vaccinees-contre-la-covid-19-1/#_ include a time series of second doses.

Instead, the only information we have on second doses comes from snapshots of the latest cumulative total in the "tot" files, such as vacsi-tot-fra-2021-01-31-20h15.csv.
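
The reasoning above can be written down as a small rule. This is only a sketch: the function name, the `second_doses` parameter, and the figures in the usage note are hypothetical and used for illustration, not taken from our pipeline or from the French files.

```python
from datetime import date
from typing import Optional

# Second doses started on January 19, 2021 (per the explanation above).
SECOND_DOSES_START = date(2021, 1, 19)


def total_vaccinations(
    day: date,
    people_vaccinated: int,
    second_doses: Optional[int] = None,  # hypothetical input: cumulative second doses, if known
) -> Optional[int]:
    """Return total doses for `day`, or None when it cannot be derived."""
    if day < SECOND_DOSES_START:
        # Every dose so far was a first dose.
        return people_vaccinated
    if second_doses is None:
        # No second-dose figure for this day: the total is unknown.
        return None
    return people_vaccinated + second_doses
```

With made-up numbers, `total_vaccinations(date(2021, 1, 18), 1000)` gives `1000`, while the same call for `date(2021, 1, 20)` gives `None` unless a second-dose count is supplied.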

I assume that this odd way of reporting second-dose data will soon be resolved. And of course please let me know if I've misunderstood something or missed a file.

leyan commented 3 years ago

OK, if you restrict your data sources to files that provide the whole time series, I understand that it is not yet available: there is only the most recent value, so you would need to download it every day. I agree it is odd to provide only daily snapshots with no history, but maybe that is because the initial data still needs consolidation. I hope the government provides the full data soon.
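
For what it's worth, collecting the daily snapshots into a time series would not take much. A minimal sketch, assuming one cumulative second-dose value is downloaded per day; the function names and the in-memory dict are hypothetical, and the numbers in the usage note are invented:

```python
from datetime import date
from typing import Dict, List, Tuple


def record_snapshot(history: Dict[date, int], day: date, cumulative_second_doses: int) -> None:
    """Store the snapshot for `day`; a later download of the same day overwrites it."""
    history[day] = cumulative_second_doses


def as_time_series(history: Dict[date, int]) -> List[Tuple[date, int]]:
    """Return (day, cumulative) pairs sorted by day."""
    return sorted(history.items())
```

Calling `record_snapshot(history, date(2021, 1, 31), 120)` once per day and then `as_time_series(history)` yields the cumulative series that the "tot" files do not provide on their own.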

Still, the 7-day moving average seems off: it does not really make sense to compute it from the 25th to the 30th when the underlying raw data is not available.

edomt commented 3 years ago

See https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations#vaccination-data for a longer definition of how we calculate daily vaccinations.
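
To illustrate how a 7-day average can exist for days with no reported raw figure, here is a minimal sketch. It assumes the cumulative total is linearly interpolated across unreported days and the resulting daily changes are then averaged over a rolling 7-day window. That assumption, the function names, and the sample numbers in the usage note are for illustration only; the linked definition is the authoritative description.

```python
from datetime import date, timedelta
from typing import Dict


def interpolate_cumulative(points: Dict[date, float]) -> Dict[date, float]:
    """Fill unreported days by linear interpolation between reported cumulative totals."""
    if not points:
        return {}
    days = sorted(points)
    filled: Dict[date, float] = {}
    for start, end in zip(days, days[1:]):
        gap = (end - start).days
        step = (points[end] - points[start]) / gap
        for i in range(gap):
            filled[start + timedelta(days=i)] = points[start] + step * i
    filled[days[-1]] = points[days[-1]]
    return filled


def daily_7day_average(cumulative: Dict[date, float]) -> Dict[date, float]:
    """Average the daily change over a trailing 7-day window (only full windows are kept)."""
    days = sorted(cumulative)
    daily = {d: cumulative[d] - cumulative[p] for p, d in zip(days, days[1:])}
    daily_days = sorted(daily)
    averages: Dict[date, float] = {}
    for i in range(6, len(daily_days)):
        window = [daily[x] for x in daily_days[i - 6 : i + 1]]
        averages[daily_days[i]] = sum(window) / 7
    return averages
```

For example, with only two reported totals, `{date(2021, 1, 1): 0, date(2021, 1, 11): 1000}`, the interpolation assigns 100 doses to each day in between, so the smoothed daily figure is 100 even on days with no report.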