CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

France: still wrong numbers of confirmed cases (April 17) #2270

Open alfkoehn opened 4 years ago

alfkoehn commented 4 years ago

Number in database here: 147969

Number in other data sources [1,2]: 109252

That is 38717 cases too many, probably because potential cases are included as well (not only confirmed cases); see e.g. the discussion [3] and an official statement [4], where a fix was promised (note that a few days have been fixed, see [5]).

[1] https://github.com/opencovid19-fr/data/blob/master/ministere-sante/2020-04-16.yaml
[2] https://dashboard.covid19.data.gouv.fr/
[3] https://github.com/CSSEGISandData/COVID-19/issues/2005
[4] https://github.com/CSSEGISandData/COVID-19/issues/2094
[5] https://github.com/CSSEGISandData/COVID-19/issues/2259
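For anyone who wants to re-check the gap, here is a minimal sketch of the arithmetic. It only uses the two figures quoted in this comment; the variable names are made up for illustration.

```python
# Figures quoted in this comment (France, as seen on April 17).
jhu_confirmed = 147969    # value in the JHU CSSE database
other_confirmed = 109252  # value reported by sources [1] and [2]

# Absolute and relative excess of the JHU figure over the other sources.
excess = jhu_confirmed - other_confirmed
relative_excess = excess / other_confirmed

print(f"excess cases: {excess}")
print(f"relative excess: {relative_excess:.1%}")
```

The absolute excess comes out at 38717, which matches the number given above.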

MarioGomWiki commented 4 years ago

@alfkoehn JHU CSSE includes probable cases in nursing homes for France. You can check the figure here: https://dashboard.covid19.data.gouv.fr/

It seems it is a decision they explicitly made. They include confirmed and probable cases for some countries. For other countries they include confirmed cases only.

JiPiBi commented 4 years ago

@alfkoehn
@MarioGomWiki is right. For some strange reason, JHU closed #2094 with this comment:

Update 4/16: after communicating with solidarites-sante.gouv.fr, we decided to make these adjustments based on public available information. From April 4 to April 11, only "cas confirmés" are counted as confirmed cases in our dashboard. Starting from April 12, both "cas confirmés" and "cas possibles en ESMS" (probable cases from ESMS) are counted into confirmed cases in our dashboard.

One additional remark: in French, "probable" is not the same as "possible". "Probable" means quite sure, with high probability; "possible" means suspected but without proof.

Another remark: J. SALOMON, the French Health Director, admitted yesterday that, since several tests on the same person can be positive, the confirmed value is an estimate and is not based on a list of identified people who tested positive at least once.
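The counting rule in the quoted 4/16 update can be written down as a small function. This is only a sketch of the rule as it is worded in that update, not JHU's actual code; the function name and parameter names are hypothetical, and since the update says nothing about dates before April 4, the sketch refuses them instead of guessing.

```python
from datetime import date


def jhu_confirmed_france(day: date, cas_confirmes: int, cas_possibles_esms: int) -> int:
    """Hypothetical helper: confirmed count per the quoted 4/16 update."""
    if day < date(2020, 4, 4):
        # The quoted update does not describe this period.
        raise ValueError("rule is only stated for April 4 onwards")
    if day <= date(2020, 4, 11):
        # April 4 to April 11: only "cas confirmés" are counted.
        return cas_confirmes
    # From April 12: "cas confirmés" plus "cas possibles en ESMS".
    return cas_confirmes + cas_possibles_esms
```

The jump between April 11 and April 12 in the JHU series would then come from the second argument suddenly being added, not from a real surge in cases.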

FrancisWasserman commented 4 years ago

It may be useful to look at the history of these French data.

Until Apr 3, only one series was published: "Confirmed cases", i.e. tested in a hospital (but at that time all tests were conducted in hospitals only). Nursing homes were ignored then. At the end of March, officials announced that statistics would soon be extended to take nursing homes into account.

On Apr 1, the first figures about nursing homes were published as a new series, "Deaths in nursing homes", was started. It was explicitly stated then that these deaths were added to and included in the "Confirmed cases" series, so as to begin to take the nursing homes into account.

On Apr 4, a new series about the number of cases was started: "Nursing Homes". No definition in terms of testing was given for this series. No statement was made about the link between the "Confirmed cases" and "Nursing homes" series. The figures then were:
"Confirmed Cases" 64338 (Apr 3), 68605 (Apr 4) and "Nursing Homes" 22195 (Apr 4). Given this history, it is difficult to think that these new 22195 are a detail of the 68605 "confirmed cases". The new "Nursing homes" series was clearly to be added to the older "Confirmed cases" series.

On Apr 12, the "Nursing homes" series was split into 2 new series : "Nursing homes confirmed cases" (ie tested) and "Nursing homes probable cases" (ie symptoms only). On that day, the figures were :
"Nursing homes" 35864 (Apr 11) evolved as "Nursing homes confirmed cases" 11958 (Apr 12) and "Nursing homes probable cases" 25230 (Apr 12), the total sum (Apr 12) being 37188. The 2 new series offered only a new level of detail (confirmed vs probable) about nursing homes, without any change in the surveyed matter.

As the "Nursing homes" series was to be added to the "Confirmed cases" series during the April 04-11 period, it would be consistent to add all 3 series (confirmed cases, nursing homes confirmed + probable) after Apr 12.
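The arithmetic behind this reading can be checked with a short sketch. It uses only the figures quoted in this comment; the variable names are illustrative, and the "combined" value simply reflects the additive reading proposed here, which is disputed further down the thread.

```python
# Figures quoted in this comment.
confirmed_apr4 = 68605       # "Confirmed cases", Apr 4
nursing_homes_apr4 = 22195   # "Nursing Homes", Apr 4
nh_total_apr11 = 35864       # "Nursing homes", Apr 11
nh_confirmed_apr12 = 11958   # "Nursing homes confirmed cases", Apr 12
nh_probable_apr12 = 25230    # "Nursing homes probable cases", Apr 12

# The two new series together continue the old "Nursing homes" series.
nh_total_apr12 = nh_confirmed_apr12 + nh_probable_apr12
nh_daily_increase = nh_total_apr12 - nh_total_apr11

# Additive reading for Apr 4: both series summed.
combined_apr4 = confirmed_apr4 + nursing_homes_apr4

print(nh_total_apr12, nh_daily_increase, combined_apr4)
```

The split on Apr 12 sums to 37188, a plausible one-day step up from 35864, which is why the two new series look like a finer breakdown of the same population.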

JiPiBi commented 4 years ago

@FrancisWasserman I disagree with your proposal. As J. SALOMON declared on the 14th of March, the confirmed cases in ESMS are already included in the national confirmed count, so don't add them twice. And as I said above, whatever you think about the official communications, you cannot use the word "probable", since in French they said "possible".

MarioGomWiki commented 4 years ago

@FrancisWasserman On Apr 4, a new series about the number of cases was started: "Nursing Homes". No definition in terms of testing was given for this series.

This always partially overlapped with "confirmed cases". It originally included confirmed deaths in EHPAD (included in "Confirmed cases", as you can verify in the official breakdown), confirmed cases in EHPAD (included in confirmed cases total too according to the official source) and probable cases in EHPAD (not included in the total).
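To make the disagreement concrete, here is a sketch of the two ways of totalling the series. Both function names are hypothetical. The second one assumes, as stated in this comment, that confirmed nursing-home cases are already inside the general "Confirmed cases" figure.

```python
def additive_total(general_confirmed: int, nh_confirmed: int, nh_probable: int) -> int:
    """Additive reading: sum all three series."""
    return general_confirmed + nh_confirmed + nh_probable


def overlap_aware_total(general_confirmed: int, nh_confirmed: int, nh_probable: int) -> int:
    """Overlap reading: nh_confirmed is assumed to be already counted
    in general_confirmed, so only nh_probable is added on top."""
    return general_confirmed + nh_probable


# Apr 12 nursing-home figures quoted earlier in the thread.
nh_confirmed, nh_probable = 11958, 25230

# Arbitrary placeholder, NOT a real figure: the gap does not depend on it.
placeholder_general = 100_000

gap = (additive_total(placeholder_general, nh_confirmed, nh_probable)
       - overlap_aware_total(placeholder_general, nh_confirmed, nh_probable))
print(gap)
```

Whatever the general figure is, the two readings differ by exactly the confirmed nursing-home count, so on the Apr 12 figures the double counting would amount to 11958 cases.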

JiPiBi commented 4 years ago

@MarioGomWiki I despair of not having been clear enough to make you understand the difference between probable and possible...

FrancisWasserman commented 4 years ago

@JiPiBi Please see the official site: https://dashboard.covid19.data.gouv.fr/ You will find stats there about "cas probables en EHPAD et EMS". The word "possible" is not used in the definition of these stats. The French word "probable" can be translated as "likely" or "probable" in English. It is true that J. Salomon used the French word "possible" on TV. But "possible" has a weaker meaning than "probable" in French. Something may be "possible" but not "probable". And to be "probable" it has to be "possible".

JiPiBi commented 4 years ago

@FrancisWasserman What do you mean by official? It's a dashboard using data found on data.gouv.fr plus an aggregation of the SPF daily communication.

But the only official voice is J. SALOMON and the only official communication is on the SPF site, like this one: https://solidarites-sante.gouv.fr/IMG/pdf/point_de_situation_du_14_avril_2020.pdf

Being French, I understand the difference between possible and probable in French quite well, so I don't need a detailed explanation :-) but I agree with yours.

Apart from this semantic discussion, please consider that the official confirmed cases in ESMS are already integrated in the confirmed total. Above all, don't be too confident in these confirmed values: they are only a statistical construction and, as you know, since less than 1% of the population has been tested, they are not very important at this moment of the epidemic. Perhaps after the deconfinement, tracking confirmed cases with systematic tests will become very important to master a new outbreak. But for now, as tests are mainly used to confirm symptomatic people who have already arrived in hospitals, IMHO they are not a real KPI. New hospitalizations and new ICU admissions are more important; this site doesn't understand that and carries on with the same initial, and currently ineffective, information. (Worldometers gives more complete information on current ICU values, but new hospitalizations and new ICU admissions are not available there either.) A bit sad.

FrancisWasserman commented 4 years ago

@JiPiBi Anything with a URL ending in gouv.fr is supposed to be official.

Being French, I could not help but explain something about my language ;-)

I agree with you about the reliability and the relative importance of the number of cases as we know it today. In fact, I was dismayed after finding many different values for the number of cases on Apr 16: 108847, 145960, 165027, 106206, 127814... On that day, some data about ESMS had to be corrected and the methodology was modified on some sites (Johns Hopkins, Worldometers...).

Thanks for your answer.
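A quick way to see how far apart these Apr 16 values are is to compute their spread. This sketch only uses the five numbers listed above; which site published which value is not stated here, so the list is left unlabeled.

```python
# The five Apr 16 values listed above, in the order given.
apr16_values = [108847, 145960, 165027, 106206, 127814]

lowest = min(apr16_values)
highest = max(apr16_values)
spread = highest - lowest   # absolute gap between extremes
ratio = highest / lowest    # how many times larger the highest value is

print(f"lowest: {lowest}, highest: {highest}")
print(f"spread: {spread}, ratio: {ratio:.2f}")
```

The highest value exceeds the lowest by 58821 cases, i.e. it is more than one and a half times as large.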

JiPiBi commented 4 years ago

If you go to data.gouv.fr, you get the official values, but you have nothing about ESMS. On that site you will also find a lot of ordinary citizens using these data and publishing their own dashboards, but none of them is official. For EHPAD and ESMS, the only official values are given by J. SALOMON and the SPF dashboard, and I don't know of another official source. Have a good night.

FrancisWasserman commented 4 years ago

@JiPiBi I was too naive. I forgot to check the upper left corner for an official logo...

traut21 commented 4 years ago

see issue https://github.com/CSSEGISandData/COVID-19/issues/2340

traut21 commented 4 years ago

since #2340 was closed, please consider and/or merge with https://github.com/CSSEGISandData/COVID-19/pull/2211

Official sources claim to include data from EHPAD and EMS since 2020-04-01. Thus I now ignore the garbage data from JHU and use https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-geographic-disbtribution-worldwide.xlsx instead.

Check this chart to see how far those numbers differ by now!

[Chart: france]

alfkoehn commented 4 years ago

Sorry for a potentially stupid question, but how does one merge issues?

traut21 commented 4 years ago

I guess it's a job for the admins to close the other issue and redirect it to here