Open XuedaShen opened 1 year ago
"Date of last change" and "Number of data revisions" refer to things like semantic changes or processing fixes on datasets, a list of which can be seen at https://github.com/cmu-delphi/delphi-epidata/blob/dev/docs/api/covidcast_changelog.md. These are typically abnormal, unexpected, and rare.
Backfill is not related to these -- it is fairly common and is an artifact of reporting chains and methods.
The documentation you linked to does not make this distinction obvious, and it should probably be clarified.
"Data revisions" is actually a better term than "backfill" for the versioning process, so it'd be nice to maybe rename the current "Data revisions" to something a little different. Backfill sometimes refers to a specific type of revision mechanism where case/hosp/death/etc. reports for events that occurred in the past are reported at various different future dates, gradually filling in from a backlog/pipeline of reports.
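To make the backfill mechanism described above concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the report tuples, the counts, and the helper names `value_as_of` and `versions` are made up for illustration and are not part of the Delphi API. It only shows how the value for one event date can keep changing as late reports arrive, with no processing fix or semantic change involved.

```python
from datetime import date

# Hypothetical reports: (event_date, report_date, new_cases).
# The numbers are invented purely for illustration.
reports = [
    (date(2021, 1, 1), date(2021, 1, 2), 10),  # first report, one day later
    (date(2021, 1, 1), date(2021, 1, 3), 4),   # late report for the same event date
    (date(2021, 1, 1), date(2021, 1, 6), 1),   # even later report from the backlog
    (date(2021, 1, 2), date(2021, 1, 3), 7),
]


def value_as_of(reports, event_date, as_of):
    """Total for event_date using only reports received on or before as_of."""
    return sum(n for ev, rep, n in reports if ev == event_date and rep <= as_of)


def versions(reports, event_date):
    """List (report_date, value) for each date on which event_date's total changed."""
    report_dates = sorted({rep for ev, rep, _ in reports if ev == event_date})
    return [(d, value_as_of(reports, event_date, d)) for d in report_dates]


for issued, value in versions(reports, date(2021, 1, 1)):
    print(issued, value)
```

In this toy data the total for 2021-01-01 is updated on three separate report dates, which is the "gradually filling in" behavior. A changelog entry of the kind counted under "Number of data revisions" would be a separate event and would not show up here at all.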
On a related note, maybe we could consider putting a glossary/explanation of terms in Data Sources and Signals. If it is not too much work, maybe a table of contents in each individual data source page could also highlight the distinction between backfill and "Data revisions".
closely related:
Is there a way to distinguish backfill from "Date of Last Change" and "Number of data revisions" in the documentation? I got confused and thought that a 0 in the latter column indicated that there is no backfill for the signal.