lincc-frameworks / fibad

FIBAD - Framework for Image-Based Anomaly Detection
MIT License

Verify production dataset does not have malformed manifests #126

Open mtauraso opened 21 hours ago

mtauraso commented 21 hours ago

Through an unknown set of steps, Drew managed to create a set of downloaded files (and a resulting manifest) in which the filenames of the FITS files do not match the metadata columns (object_id, ra, dec, tract, filter).

It is believed that some sequence of downloader invocations, possibly using nonzero offsets, can create this condition. If the condition can also be created with a zero offset, then the production data is definitely affected.

mtauraso commented 21 hours ago

Probably the first step is to write a short script that detects this condition and run it against prod to see whether prod needs healing.

Then we can try to figure out how such a thing could have happened.
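
A rough sketch of what the detection script could look like. This is not the real check: the filename layout `<object_id>_<tract>_<filter>.fits` and the `filename` column name are assumptions made up for illustration, and `FILENAME_RE` and `find_mismatches` are hypothetical names. The pattern would need to be changed to whatever the downloader actually writes, and ra/dec would need a similar comparison if they are encoded in the filename.

```python
import re

# ASSUMPTION: hypothetical filename layout "<object_id>_<tract>_<filter>.fits".
# Replace this pattern with the layout the downloader really uses.
FILENAME_RE = re.compile(
    r"^(?P<object_id>\d+)_(?P<tract>\d+)_(?P<filter>[^_]+)\.fits$"
)

# Metadata columns compared against the filename in this sketch.
CHECKED_COLUMNS = ("object_id", "tract", "filter")


def find_mismatches(rows):
    """Return (row_index, filename, reason) for each disagreement found.

    `rows` is any iterable of mapping-like rows with a "filename" key
    (assumed column name) plus the columns in CHECKED_COLUMNS.
    """
    problems = []
    for i, row in enumerate(rows):
        filename = str(row["filename"]).strip()
        match = FILENAME_RE.match(filename)
        if match is None:
            problems.append((i, filename, "unparseable filename"))
            continue
        for col in CHECKED_COLUMNS:
            from_name = match.group(col)
            from_manifest = str(row[col]).strip()
            if from_name != from_manifest:
                problems.append(
                    (i, filename, f"{col}: filename={from_name!r} manifest={from_manifest!r}")
                )
    return problems


if __name__ == "__main__":
    # Toy rows standing in for the manifest; the second one is deliberately wrong.
    demo = [
        {"filename": "111_9813_HSC-G.fits", "object_id": 111, "tract": 9813, "filter": "HSC-G"},
        {"filename": "222_9813_HSC-R.fits", "object_id": 333, "tract": 9813, "filter": "HSC-R"},
    ]
    for problem in find_mismatches(demo):
        print(problem)
```

The function only needs rows that can be indexed by column name, so the manifest could be loaded with something like astropy's `Table.read` and passed in directly, assuming the column names line up. An empty result on prod would mean no mismatches of the kinds checked here.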

mtauraso commented 21 hours ago

manifest.fits.zip