Repository for Global.health: a data science initiative to enable rapid sharing of trusted and open public health data to advance the response to infectious diseases.
Add support for generating and ingesting source differences for non-UUID sources
Differences ('deltas') are generated between the current fetched source and the last successfully processed source.
To speed up ingestion considerably, deltas are generated between source files upon retrieval, prior to parsing.
Deltas are split into Addition and Deletion files. A Suspected case converting to Confirmed would therefore generate both a Deletion file and an Addition file as source deltas.
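As a rough illustration of the split described above, here is a minimal sketch that compares two source files line by line, before any parsing. The function name `generate_deltas` and the sample rows are hypothetical and not taken from this change. It assumes each record is one raw line and uses a multiset comparison so that duplicate lines, which non-UUID sources cannot tell apart, are counted correctly.

```python
from collections import Counter


def generate_deltas(previous_lines, current_lines):
    """Return (additions, deletions) between two lists of raw source lines.

    Hypothetical helper: lines are compared as opaque strings, so a record
    whose status changes appears once as a deletion and once as an addition.
    """
    prev = Counter(previous_lines)
    curr = Counter(current_lines)
    # Counter subtraction keeps only positive counts, i.e. a multiset difference.
    additions = list((curr - prev).elements())
    deletions = list((prev - curr).elements())
    return additions, deletions


previous = [
    "2021-01-01,Berlin,Suspected",
    "2021-01-02,Paris,Confirmed",
]
current = [
    "2021-01-01,Berlin,Confirmed",
    "2021-01-02,Paris,Confirmed",
    "2021-01-03,Rome,Suspected",
]
additions, deletions = generate_deltas(previous, current)
```

In this example the Berlin case converting from Suspected to Confirmed lands in both lists, which matches the behaviour described for status changes.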
For Deletion deltas, the first matching record is marked for exclusion from the line list. Any cases marked for exclusion at the end of processing are removed from the database. If marking fails, any marked records are reverted prior to pruning.
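The mark, revert, and prune flow can be sketched in memory as follows. This is only an illustration: the `apply_deletions` name, the `fields` and `excluded` keys, and the choice to treat an unmatched deletion as a marking failure are all assumptions, and the real change operates on the database rather than on a Python list.

```python
def apply_deletions(cases, deletions):
    """Mark the first matching case per deletion, then prune marked cases.

    Hypothetical in-memory stand-in for the database: each case is a dict
    with a 'fields' tuple and an 'excluded' flag. Returns (cases, succeeded).
    """
    marked = []
    for deletion in deletions:
        # Only the first not-yet-excluded match is marked, so duplicate
        # records are removed one at a time.
        match = next(
            (c for c in cases if not c["excluded"] and c["fields"] == deletion),
            None,
        )
        if match is None:
            # Assumed failure mode: revert every mark made so far, prune nothing.
            for c in marked:
                c["excluded"] = False
            return cases, False
        match["excluded"] = True
        marked.append(match)
    # Pruning happens only after all deletions were marked successfully.
    return [c for c in cases if not c["excluded"]], True
```

Marking before pruning keeps the operation reversible until the very end, which is what allows a failed run to leave the existing records untouched.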
If a delta ingestion fails (either an Addition or a Deletion), the next scheduled retrieval falls back to a bulk ingest in which all records are replaced. This is required because Addition and Deletion deltas are currently assigned separate upload IDs for processing, which could desynchronise the database if one succeeds without the other.
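The fallback rule can be expressed as a small decision function. The `UploadResult` type and `next_ingestion_mode` function below are hypothetical names used only to illustrate the rule; they assume one result per upload ID from the previous run.

```python
from dataclasses import dataclass


@dataclass
class UploadResult:
    upload_id: str   # hypothetical identifier for one delta upload
    kind: str        # "addition" or "deletion"
    succeeded: bool


def next_ingestion_mode(previous_results):
    """Return 'delta' only if every delta upload of the last run succeeded.

    Addition and Deletion deltas are tracked under separate upload IDs, so a
    single failure means the database may be out of sync and a bulk ingest
    (replacing all records) is the safe choice.
    """
    if all(result.succeeded for result in previous_results):
        return "delta"
    return "bulk"
```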