opencovid19-fr / data

Consolidation of data from official sources on the COVID19 epidemic
MIT License

From opencovid19 data to jhu data #543

Closed odadoun closed 3 years ago

odadoun commented 3 years ago


First of all, thanks for sharing your resources. I am wondering which data from opencovid19 are used to update the JHU CSV files? Thank you in advance. Cheers, Olivier. PS: I see a huge difference between the deaths reported by SPF and by JHU (maybe it is explained by the fact that the EHPAD deaths are not taken into account). [Screenshot, 2020-09-22 11:53:49]

abdoulsn commented 3 years ago

These are two different repos, with no direct link between them, if I understand your question.

odadoun commented 3 years ago

Thanks, Abdoulaye. I asked JHU yesterday, and they told me: "Our data is sourced from, which is the official France source". I would like to know where I can get the figures for "EHPAD et EMS"; they should be open data, shouldn't they? Basically, I would like to recover the JHU total deaths from the French institutes' data (such as SPF). Thank you, and have a nice day. Olivier. PS: I started the thread in English; if you think that is not needed, we can continue in French :-)

abdoulsn commented 3 years ago

I see. Our data are stored in data/dist/*, or at this link.

odadoun commented 3 years ago

Thanks. I have analysed the data. There seems to be no finer granularity for deaths in EHPAD: only the national values are available (not by department or region). Can you confirm that? There is also a small difference between the JHU data and the opencovid19 data ('deces EPAD' + 'deces'). [Screenshot, 2020-09-23 18:53:27] [Screenshot, 2020-09-23 19:03:05]

Do you have any idea where this difference comes from? Thanks a lot for your patience. Olivier
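For anyone reproducing this comparison, here is a minimal sketch of the 'deces' + 'deces EHPAD' sum at national level. The column names (`date`, `granularite`, `deces`, `deces_ehpad`) and the `"pays"` label are assumptions about the opencovid19 CSV layout, the toy values are made up for illustration, and taking the maximum when several sources report the same date is just one possible choice.

```python
import pandas as pd

def national_total_deaths(df: pd.DataFrame) -> pd.Series:
    """Hospital deaths + EHPAD deaths per date, national rows only."""
    # Assumed column names and the assumed "pays" (country) granularity label.
    national = df[df["granularite"] == "pays"]
    # Several sources may report the same date: keep the largest value of each.
    per_day = national.groupby("date")[["deces", "deces_ehpad"]].max()
    # Dates where either component is missing are dropped, not treated as 0.
    return (per_day["deces"] + per_day["deces_ehpad"]).dropna()

# Illustrative values only, not real figures.
toy = pd.DataFrame({
    "date": ["2020-09-21", "2020-09-21", "2020-09-21", "2020-09-22", "2020-09-23"],
    "granularite": ["pays", "pays", "departement", "pays", "pays"],
    "deces": [100, 100, 10, 105, 110],
    "deces_ehpad": [50, None, None, 52, None],
})

total = national_total_deaths(toy)
print(total)
```

The resulting series can then be compared date by date with the JHU deaths column for France.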

Dowser101 commented 3 years ago

Hello Olivier, the only clue I have on this is this article from the second half of May. I did not investigate further. It states that a mess came to light following the controversy over the Spanish prime minister's statements in May. He officially cited figures from JHU, claiming that Spain was one of the most advanced countries worldwide in terms of population testing campaigns.

Those numbers were proved wrong, which led to a kind of mea culpa from the Johns Hopkins side. They acknowledged using some Worldometer numbers in their own pandemic publications. The trouble is that Worldometer's data collection relied on social media and on entry by volunteers. Of course, they do not publish their proprietary algorithm. But it is obvious that this would lead to a curve that runs ahead of the real numbers officially confirmed by the French government.

Basically, they have been suspected of selectively picking the higher figures declared by contributors. The fact that the curves begin to track each other after this controversy makes me think they simply stopped doing so once they were forced to recognise that this was not what readers expect from such a supposedly reliable organisation.

There could have been an additional twist, as some results were published by SPF and relayed on social media and in the online press many hours before the Ministry of Health's official press release. The story is told here:

Best regards

odadoun commented 3 years ago


Thanks for taking the time to answer me. I think the problem is that JHU does not take into account Corse (perhaps because its department codes are 1 digit + 1 letter?) or the DOM-TOM (perhaps because their codes have 3 digits?). If we compare (total deaths from the SPF database for all French departments except Corse and the DOM-TOM) + (total deaths in EHPAD from the opencovid19 database for France, which I guess covers all departments) to JHU, I get nearly the same value: 31181 vs 31103. [image] From the plot we can observe a shift of around 2 months between the data. This shift is also observed between the deaths from the SPF database and from the OpenCovid19 one. [image] The time shift is also observed for intensive-care (réanimation) patients. [Screenshot, 2020-09-24 20:27:53] I have two questions:

Dowser101 commented 3 years ago

As for your first question, I did not manage to decipher the colors correctly, sorry. So... JHU is the LOWER line, silly me! The Worldometer problem pointed out in May definitely did not apply to the French figures. But then I don't understand what your last posts show. A 2-month shift on deaths and réa (intensive care)? How come? The peak day for the number of patients in réa was 08/04, with 6954. The 10 000 deaths threshold was passed on 07/04. Where did you get those peaks in February? I doubt this was published by SPF or anywhere else... (Please choose one definitive color for SPF and another for opencovid, or I fear I will get confused again ;-) The national statistics on infections and deaths in EHPAD proved inconsistent when they were compiled every day: people in the field did not have enough time to collect the numbers, and I don't even know how they were compiled. Then they started to be reported once a week, which often produced NEGATIVE results. I doubt anything valuable can be found at department level under these conditions.
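One way to spot the negative results mentioned above, assuming they show up as downward revisions of a cumulative count, is to difference the series and flag the days where it decreases. This is only a sketch: the series name and the values are made up for illustration.

```python
import pandas as pd

# Illustrative cumulative EHPAD deaths, with one downward revision.
cumulative = pd.Series(
    [100, 104, 104, 101, 109],
    index=pd.to_datetime(
        ["2020-06-01", "2020-06-02", "2020-06-03", "2020-06-04", "2020-06-05"]
    ),
    name="deces_ehpad",
)

# Day-to-day change; the first value is NaN because it has no predecessor.
daily = cumulative.diff()

# Dates where the cumulative total went down, i.e. a negative daily count.
negative = daily[daily < 0]
print(negative)
```

Any date listed in `negative` marks a correction in the source rather than a real daily count, so it is worth checking before plotting daily values.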

Dowser101 commented 3 years ago

I could find a data sheet with the EHPAD details, but you will have to dig further to find out where its author got the numbers! It says that everything apart from the EHPAD data comes from SPF, and that opencovid19-fr is the other source... So maybe it is somewhere here after all...!/vizhome/CoronavirusenFrance/CovidenFrancepardpartement

odadoun commented 3 years ago

Phew! Sorry, it was a bug in my code caused by a bad Python pandas concatenation. So the SPF and opencovid19 data are very close. [image: hospitalises]

Now I will check the link you sent me and continue my benchmark. Thanks a lot, and have a good weekend. Cheers, Olivier
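The thread does not say what the concatenation bug was. As a hedged illustration of one common pitfall that can produce an apparent time shift: `pd.concat(..., axis=1)` aligns rows on the index, so two series that still carry a default integer index are paired by position rather than by date. The frame and column names below are hypothetical and the values are made up.

```python
import pandas as pd

# Two toy sources whose date ranges do not start on the same day.
spf = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-18", "2020-03-19", "2020-03-20"]),
    "deaths_spf": [1, 2, 3],
})
oc = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-19", "2020-03-20", "2020-03-21"]),
    "deaths_oc": [2, 3, 4],
})

# Pitfall: default integer indexes, so rows are paired by position
# and the second source looks shifted by one day.
positional = pd.concat([spf["deaths_spf"], oc["deaths_oc"]], axis=1)

# Safer: index both series by date first, so concat aligns on the date.
aligned = pd.concat(
    [spf.set_index("date")["deaths_spf"], oc.set_index("date")["deaths_oc"]],
    axis=1,
)
print(positional)
print(aligned)
```

In `aligned`, dates present in only one source show up as NaN instead of being silently paired with the wrong day.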

Dowser101 commented 3 years ago

Contact the author; it seems that she got it here, but I don't know where it could be hidden.

abdoulsn commented 3 years ago

Feel free to reopen.