ccodwg / Covid19CanadaArchive

Canadian COVID-19 Data Archive
https://opencovid.ca

Retire some ON datasets #279

Closed jeanpaulrsoucy closed 7 months ago

jeanpaulrsoucy commented 1 year ago

It is time to review and retire some old ON datasets, e.g., see the following: https://data.ontario.ca/dataset/status-of-covid-19-cases-in-ontario-by-public-health-unit-phu

jeanpaulrsoucy commented 1 year ago

Will have to watch the ON confirmed case datasets.

The main dataset has cases with report dates going up to 2022-11-30, whereas the new 2022 dataset has cases with report dates going up to 2022-12-12. This is despite the 2022 dataset being marked as going up to 2022-11-30 in the metadata.
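A rough sketch of how the latest report date in a downloaded file could be checked against the metadata. The column name `report_date` and the sample rows are hypothetical placeholders, not the actual schema of the Ontario files; dates are assumed to be ISO-formatted strings, so they sort correctly as text.

```python
import csv
import io


def latest_report_date(csv_file, column="report_date"):
    """Return the largest non-empty value in `column`, or None if there is none.

    `column` is a hypothetical name; substitute the real header of the dataset.
    """
    reader = csv.DictReader(csv_file)
    dates = [row[column] for row in reader if row.get(column)]
    return max(dates) if dates else None


# Made-up sample standing in for a downloaded case file.
sample = io.StringIO(
    "case_id,report_date\n"
    "1,2022-11-30\n"
    "2,2022-12-12\n"
    "3,\n"
)
latest = latest_report_date(sample)
print(latest)  # 2022-12-12
```

Comparing this value with the end date stated in the dataset metadata would flag the kind of mismatch described above.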

So one or more of these datasets may continue to be updated, or they may all need to be retired in the near future.

jeanpaulrsoucy commented 1 year ago

Looks like two datasets that got retired are being updated again: bfab2b25-1588-4be8-b1c8-2d655db4d18a and bfab2b25-1588-4be8-b1c8-2d655db4d18a. Both pages say they were last updated on December 21, so I may have missed some earlier updates (even though there was a period without updates, which is why I retired them in the first place).

jeanpaulrsoucy commented 1 year ago

I confirmed the above by re-running the archive tool manually for the five retired datasets. The other three (cdf98171-8342-46f6-b4e8-d0ea3e00c734, c40ee593-226d-4ccd-ba56-cd426435f286, 73fffd44-fbad-4de8-8d32-00cc5ae180a6, i.e., two outbreak datasets and the status of cases by PHU dataset) are still dead, with a final unique file date of 2022-12-01, as expected.
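For reference, a minimal sketch of what "final unique file date" could mean in practice, assuming each archived snapshot is recorded as a (date, content hash) pair. The function name and the sample data are illustrative only and are not taken from the archive tool.

```python
def last_unique_file_date(snapshots):
    """Return the date of the most recent snapshot whose content had not been seen before.

    `snapshots` is an iterable of (iso_date, content_hash) pairs; ISO dates
    sort chronologically when compared as strings.
    """
    seen = set()
    last = None
    for date, digest in sorted(snapshots):
        if digest not in seen:
            seen.add(digest)
            last = date
    return last


# Illustrative data: the 2022-12-22 download repeats the 2022-12-01 content.
history = [
    ("2022-11-24", "hash-a"),
    ("2022-12-01", "hash-b"),
    ("2022-12-22", "hash-b"),
]
print(last_unique_file_date(history))  # 2022-12-01
```

A dataset whose last unique file date stops advancing while downloads continue is a candidate for retirement.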

jeanpaulrsoucy commented 1 year ago

After a few weeks, the confirmed cases dataset was finally updated on 2022-12-22, with cases reported up to 2022-12-07.

jeanpaulrsoucy commented 1 year ago

Interestingly, the retired outbreaks files are showing on the Ontario open data portal as being "updated" on 2022-12-22, but their file hashes are identical to the hashes from 2022-12-01.
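A comparison like this can be sketched as follows. SHA-256 is an assumption here (the hash used for the check is not stated), and the file names and contents are throwaway stand-ins for two downloads of the same dataset.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b):
    """True when both files are byte-for-byte identical (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)


# Throwaway files standing in for the 2022-12-01 and 2022-12-22 downloads.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "outbreaks_2022-12-01.csv"
    new = Path(tmp) / "outbreaks_2022-12-22.csv"
    old.write_bytes(b"date,outbreaks\n2022-12-01,5\n")
    new.write_bytes(b"date,outbreaks\n2022-12-01,5\n")
    unchanged = same_content(old, new)

print(unchanged)  # True
```

A portal "last updated" timestamp with an unchanged hash means the file was re-published without new data.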

jeanpaulrsoucy commented 1 year ago

The only two files I am still watching are the vaccine datasets a7f839da-8c36-4569-bd99-1a07adc0700a and 5fe86db2-62c9-4c92-84cb-0ef07069291b, which are listed as still being updated but haven't had new data since April.