Safecast / safecastapi

The app that powers api.safecast.org
44 stars 25 forks source link

Add soft delete column to measurements #361

Open Frangible opened 7 years ago

Frangible commented 7 years ago

(NOTE: This is regarding the measurements table on ingest, not api.safecast.org)

Recently an issue occurred where a fair amount of known bad data was inserted into the ingest measurements table, which result in output and visualization errors displayed to the user.

However, there is currently no mechanism for marking rows in measurements as invalidated or deleted, and other alternatives (modifying the original data, hard dropping the row, etc) are less than ideal.

To that end, a column should be added to table measurements, named "invalidated" or "deleted" or something to that effect.

The column could be either a boolean or timestamp, depending upon whether capturing the timestamp of the soft deletion or using less storage / IOPS is desired. (8 bytes vs. 1 byte) I personally lean more towards the boolean and tracking all soft deletes via GitHub issues.

ALTER TABLE measurements ADD COLUMN invalidated BOOLEAN DEFAULT FALSE NOT NULL;
CREATE INDEX idx_measurements_invalidated ON measurements(invalidated);

Or something to that effect should work. Whenever anything queries measurements, including any future Ruby application, this column must be tested for. (eg, AND NOT invalidated)

juhele commented 7 years ago

Not direct solutution within the API but we "clean" the LOG files in QGIS using: https://opengeolabs.github.io/qgis-safecast-plugin/index.html and then upload the LOG file here.

The same if I have one log file containing for example car ride and walk with detector in different height and facing and I need to split it in two LOG files.

matschaffer commented 7 years ago

Thanks @juhele - the "ingest" tag tickets are for the solarcast pipeline (https://github.com/safecast/ingest) so no log files involved, but I'm sure it won't be too long before we implement some sort of "invalidated" flag.

juhele commented 7 years ago

OK, thanks for clarifying this @matschaffer :-)