Open tristpinsm opened 2 months ago
I think Matthew envisioned this jobdb being used for that purpose: https://github.com/simonsobs/sotodlib/pull/851 but I have yet to test and review this PR :/
Oh interesting. Could make sense to migrate to that system then rather than continue using Influx to track this.
When an obs_id fails to process its metrics for whatever reason, it isn't added to InfluxDB, so it shows up again in the list of unprocessed observations on the next run. Over time, these pile up and slow down the flow runs.
A solution could be to add an entry in the Influx log that records the failure.
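A rough sketch of the idea, in case it helps the discussion. It is not tied to the actual Influx schema or to the jobdb from the PR: a plain dict stands in for the log, and `unprocessed`, `run_once`, and the `"ok"`/`"failed"` status values are all hypothetical names made up for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for the Influx log: obs_id -> "ok" or "failed".
Records = Dict[str, str]


def unprocessed(all_obs: Iterable[str], records: Records) -> List[str]:
    """Return obs_ids with no record at all, neither success nor failure."""
    return [obs_id for obs_id in all_obs if obs_id not in records]


def run_once(
    all_obs: Iterable[str],
    records: Records,
    compute_metrics: Callable[[str], None],
) -> Records:
    """Process every obs_id without a record and log the outcome either way."""
    for obs_id in unprocessed(all_obs, records):
        try:
            compute_metrics(obs_id)
        except Exception:
            # Record the failure so this obs_id is not picked up again next run.
            records[obs_id] = "failed"
        else:
            records[obs_id] = "ok"
    return records
```

With something like this, a failed obs_id is only attempted once. Retrying failures later would need an explicit query on the failure entries, which is probably what we want anyway.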