CoVaRR-NET / duotang

Scripts and data for the CoVaRR-Net Pillar 6 notebook
https://covarr-net.github.io/duotang/duotang.html
MIT License

Loss of the Health InfoBase case counts #361

Closed bfjia closed 4 months ago

bfjia commented 5 months ago

I spent some time looking at sources of provincial COVID-19 case counts today given the recent news about the InfoBase, and it's not looking great. I welcome suggestions and links if you know there's another source of this information.

Provinces with data still available*:

Provinces where data is no longer available:

For reference, we currently have data for BC/AB/ON/QC/NS/NB/NL using the Health InfoBase CSV.

If we lose the Health InfoBase CSV, we will only have consistent data from Quebec. Alberta is OK and manageable, and New Brunswick is available with a 1-month delay.

bfjia commented 5 months ago

We could potentially use the outbreak numbers, which are still available via the Health InfoBase, to track overall Canadian trends. The correlation between outbreak numbers and case numbers is very high.

image

bfjia commented 5 months ago

image

Outbreak/Cases plot with last 4 data points highlighted in red

bfjia commented 5 months ago

Another source of data could be the case positivity rate from the respiratory virus reports. https://www.canada.ca/en/public-health/services/surveillance/respiratory-virus-detections-canada/2023-2024/week-23-ending-june-8-2024.html

bfjia commented 5 months ago

Sally: Could you take a look to see which has been most correlated with reported case numbers (and with each other): outbreaks, test positivity, or test positivity * number of tests?

bfjia commented 5 months ago

Plots: CaseCount_vs_Positive_Count, CaseCount_vs_Positive_Rate, CaseCount_vs_Gender_Cases_Count, CaseCount_vs_Outbreaks_Count
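A minimal sketch of how the requested comparison could be computed: Pearson correlation between case counts and each candidate proxy, and between the proxies themselves. The weekly values below are made up for illustration and are not Health InfoBase data; the series names and the helpers `pearson` and `pairwise_correlations` are hypothetical, and positivity is assumed to be expressed as a percentage.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(series):
    """Return {(name_a, name_b): r} for every pair of named series."""
    return {
        (a, b): pearson(series[a], series[b])
        for a, b in combinations(series, 2)
    }


# Made-up weekly values, for illustration only (not Health InfoBase data).
weekly = {
    "cases": [1200, 1500, 2100, 1800, 1300, 900],
    "outbreaks": [30, 41, 55, 49, 33, 25],
    "positivity_pct": [8.0, 9.5, 13.0, 12.0, 9.0, 7.0],
    "tests": [15000, 16000, 16500, 15500, 14500, 13000],
}
# Assumed definition: detections = positivity (as a fraction) * number of tests.
weekly["detections"] = [
    p / 100 * t for p, t in zip(weekly["positivity_pct"], weekly["tests"])
]

correlations = pairwise_correlations(weekly)
# Print the pairs from most to least positively correlated.
for (a, b), r in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{a} vs {b}: r = {r:.3f}")
```

With real data, the series would first need to be aligned on the same reporting weeks, since the sources are updated on different schedules.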

bfjia commented 5 months ago

It doesn't look like the negative correlation with positivity is due to a change in methodology. Recent data (since 2024-01-01) were all uncorrelated.

Plots: CaseCount_vs_Positive_Rate, CaseCount_vs_Positive_Count

bfjia commented 5 months ago

Gender data is still updated weekly. Outbreak data is updated monthly. Positivity data is updated weekly.

bfjia commented 5 months ago

As of July 3rd, the outbreaks CSV seems to have changed. Overall, the counts for the last year look lower than in the previous version.

This is the correlation for the new outbreak CSV:

image

bfjia commented 5 months ago

Link to the outbreak CSV: https://health-infobase.canada.ca/src/data/covidLive/covid19-epiSummary-outbreaks-settings.csv

Link to the positivity CSV: https://health-infobase.canada.ca/src/data/covidLive/covid19-epiSummary-labIndicators2.csv

Link to the gender CSV: https://health-infobase.canada.ca/src/data/covidLive/covid19-epiSummary-ageGender.csv

bfjia commented 5 months ago

Outbreak CSV from June 18th, 2024: https://1drv.ms/u/s!Ahefjh1UXa259qBCV1eFiccSC9yVPQ?e=clJBjd

Outbreak CSV from July 3rd, 2024: https://1drv.ms/u/s!Ahefjh1UXa259qBJUndDLnDhB9EJ1A?e=qn8k6v
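One way to quantify how much the outbreak counts were revised between the two snapshots is to compare them date by date. This is only a sketch: it assumes each snapshot has already been reduced to a mapping from reporting date to outbreak count, the values below are made up, and `compare_snapshots` is a hypothetical helper, not something in the duotang repo.

```python
def compare_snapshots(old, new):
    """Compare two {date: count} snapshots on the dates they share."""
    shared = sorted(set(old) & set(new))
    # Keep only the dates whose count differs between the two snapshots.
    changes = {d: new[d] - old[d] for d in shared if new[d] != old[d]}
    return {
        "shared_dates": len(shared),
        "revised_dates": len(changes),
        "net_change": sum(changes.values()),
        "changes": changes,
    }


# Made-up counts, for illustration only (not the actual CSV contents).
june_18 = {"2024-04-01": 40, "2024-05-01": 35, "2024-06-01": 28}
july_03 = {"2024-04-01": 36, "2024-05-01": 30, "2024-06-01": 28, "2024-07-01": 22}

summary = compare_snapshots(june_18, july_03)
print(summary)
```

A negative `net_change` would match the impression that the newer file reports lower counts for the same dates.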

spotto commented 5 months ago

I think the positivity data are OK. I took a look at test positivity times number of tests, and it correlates well with case numbers (see the attached Excel file). I don't get negative correlations, so maybe there was a bug?

image

CaseCountComparison.xlsx

The little hockey stick at the bottom left is caused by the "number of detections" (=positivity*tests) being more accurate near the present than the case reporting.

spotto commented 5 months ago

Positivity alone correlates less well (but the correlation is still positive):

image

[The plots above use the case counts in covid19-epiSummary-weeklyEpiCur from https://health-infobase.canada.ca/covid-19/current-situation.html ]

spotto commented 5 months ago

Link to the positivity CSV is also still present at the bottom of https://health-infobase.canada.ca/covid-19/ (search for lab indicators)

bfjia commented 5 months ago

It doesn't look like the negative correlation with positivity is due to a change in methodology. Recent data (since 2024-01-01) were all uncorrelated.

The difference here is actually due to the fact that I used the wrong dataset. The Respiratory Virus Detections dashboard didn't exist until this week's update. Previously, the data was reported via HTML tables here: https://web.archive.org/web/20240603032930/https://www.canada.ca/en/public-health/se[…]us-detections-canada/2023-2024/week-18-ending-may-4-2024.html I had been using the "Positive human coronavirus (HCoV) tests" data, which is incorrect and not actually COVID-related.

spotto commented 5 months ago

Aha! Glad that is sorted. I think we can go with detections moving forward. Interestingly, this page makes it sound like the data reported to PHAC is # of tests and # of detections, from which positivity is calculated (so we're just backtracking to get number of detections): https://health-infobase.canada.ca/respiratory-virus-detections/understanding.html
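A minimal sketch of that backtracking, assuming positivity is published as a percentage alongside the number of tests. The function name and the numbers are hypothetical. If the published positivity is rounded, the recovered detections are only approximate.

```python
def detections_from_positivity(positivity_pct, tests):
    """Estimate detections from positivity (in percent) and number of tests."""
    if tests < 0:
        raise ValueError("tests must be non-negative")
    if not 0 <= positivity_pct <= 100:
        raise ValueError("positivity_pct must be between 0 and 100")
    return round(positivity_pct / 100 * tests)


# Made-up week: 1071 detections out of 12000 tests.
true_detections, tests = 1071, 12000

# Positivity as it might be published, rounded to one decimal place.
published_pct = round(true_detections / tests * 100, 1)

# Backtracking from the rounded percentage recovers an approximate count.
estimated = detections_from_positivity(published_pct, tests)
print(published_pct, estimated, estimated - true_detections)
```

The error from rounding to one decimal place is bounded by 0.05% of the number of tests, so it matters more in weeks with many tests and few detections.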