cmu-delphi / delphi-epidata

An open API for epidemiological data.
https://cmu-delphi.github.io/delphi-epidata/
MIT License

Disambiguation between backfill and Date of Last Revision #1252

Open XuedaShen opened 1 year ago

XuedaShen commented 1 year ago

Is there a way to distinguish backfill from Date of Last Change and Number of Data Revisions in the documentation? I mistakenly assumed that a 0 in the latter columns meant that there is no backfill for the signal.

melange396 commented 1 year ago

Date of last change and Number of data revisions refer to things like semantic changes or processing fixes on datasets, a list of which can be seen at https://github.com/cmu-delphi/delphi-epidata/blob/dev/docs/api/covidcast_changelog.md. These are typically abnormal, unexpected, and rare.

Backfill is not related to these -- it is fairly common and is an artifact of reporting chains and methods.

The documentation you linked to does not make this distinction obvious, and it should probably be clarified.

brookslogan commented 1 year ago

"Data revisions" is actually a better term than "backfill" for the versioning process, so it would be nice to rename the current "Data revisions" to something a little different. "Backfill" sometimes refers to a specific type of revision mechanism in which case/hosp/death/etc. reports for events that occurred in the past arrive at various later dates, gradually filling in from a backlog/pipeline of reports.
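
To make the versioning idea concrete, here is a minimal sketch of how a value for one reference date can change as later reports arrive. Everything in it is an assumption for illustration: the record layout, the helper names `as_of` and `reissue_count`, and the numbers themselves are invented and are not taken from the Epidata API or any real signal. Note that this routine re-issuing is the backfill-style versioning discussed here, not what the "Number of data revisions" documentation column counts.

```python
from datetime import date

# Hypothetical toy records: (reference_date, issue_date, value).
# The numbers are invented for illustration only.
reports = [
    (date(2021, 3, 1), date(2021, 3, 2), 10),  # first report for March 1
    (date(2021, 3, 1), date(2021, 3, 5), 14),  # late reports fill in
    (date(2021, 3, 1), date(2021, 3, 9), 15),  # more late reports
    (date(2021, 3, 2), date(2021, 3, 3), 8),   # March 2 was never re-issued
]


def as_of(records, reference_date, as_of_date):
    """Return the most recent value for reference_date that had been
    published on or before as_of_date, or None if nothing was published yet."""
    candidates = [
        (issue, value)
        for ref, issue, value in records
        if ref == reference_date and issue <= as_of_date
    ]
    if not candidates:
        return None
    # Tuples sort by issue date first, so max() picks the latest issue.
    return max(candidates)[1]


def reissue_count(records, reference_date):
    """Count how many times a reference date was re-issued after its first report."""
    issues = [issue for ref, issue, _ in records if ref == reference_date]
    return max(len(issues) - 1, 0)
```

In this toy data, March 1 is re-issued twice through ordinary late reporting, so what someone sees for that date depends on when they look. A signal could behave this way all the time and still show 0 under "Number of data revisions", since that column tracks changelog-level events instead.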

XuedaShen commented 1 year ago

On a related note, maybe we could consider putting a glossary/explanation of terms in Data Sources and Signals. If it is not too much work, a table of contents for each individual data source could also highlight the distinction between backfill and Data revisions.

melange396 commented 1 year ago

closely related: