tagbase / tagbase-server

tagbase-server is a data management web service for working with eTUFF and nc-eTAG files.
https://oiip.jpl.nasa.gov/doc/OIIP_Deliverable7.4_TagbasePostgreSQLeTUFF_UserGuide.pdf
Apache License 2.0

Create events_log table #174

Open lewismc opened 1 year ago

lewismc commented 1 year ago

So far we have identified two buckets of anomalies which can occur during ingestions

In both of the above cases, each individual offense would generate a separate Slack alert. This can be noisy and overwhelming at times, so it needs to be improved.

@renato2099 suggested that we create an anomaly table which would, in the instance of an anomaly, generate an entry detailing what the anomaly is. All anomalies for a given submission would be grouped and persisted for archival purposes. This allows for

  1. A single Slack notification containing the HTTP URL of a single aggregated report which lists one or more anomalies, and
  2. The ability for the user to then execute a GET on that URL to access the JSON anomaly report for a given submission
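
The single-notification flow in the two points above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Anomaly` class, the `build_report` and `slack_message` helpers, the field names, and the `https://example.org/api` URL layout are all hypothetical and are not taken from tagbase-server.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Anomaly:
    """One anomaly found during ingestion (hypothetical shape)."""
    submission_id: int
    kind: str          # hypothetical category, e.g. "metadata"
    description: str


def build_report(submission_id, anomalies):
    """Collect every anomaly for one submission into a single JSON-ready dict."""
    items = [asdict(a) for a in anomalies if a.submission_id == submission_id]
    return {"submission_id": submission_id, "count": len(items), "anomalies": items}


def slack_message(report, base_url):
    """Compose one notification that links to the aggregated report."""
    # The URL layout is an assumption made for this sketch only.
    url = f"{base_url}/submissions/{report['submission_id']}/anomalies"
    return f"Submission {report['submission_id']}: {report['count']} anomalies, see {url}"


anomalies = [
    Anomaly(1, "metadata", "hypothetical: unknown attribute name"),
    Anomaly(1, "missing entries", "hypothetical: required attribute absent"),
    Anomaly(2, "metadata", "hypothetical: belongs to another submission"),
]

report = build_report(1, anomalies)
message = slack_message(report, "https://example.org/api")

print(message)                        # the single Slack alert text
print(json.dumps(report, indent=2))   # what a GET on the URL might return
```

The point of the sketch is that the number of Slack messages depends on the number of submissions rather than the number of anomalies.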

This task therefore requires that we

  1. Design the anomaly table
  2. Link the table to the submission.anomaly_report column which will be available post #173
  3. Augment the OpenAPI to facilitate anomaly report access via GET
  4. Implement the logic to generate anomaly reports which cover the metadata and other buckets described above
  5. Integrate report alerts into Slack messaging
  6. Write tests which cover FAILED ingestion scenarios

tagtuna commented 1 year ago

This captures the flow well. I would point out, though, that at least with our current design we aim to use two different Slack channels, metadata_ops and deploy_ops. I wonder whether we should flag the anomaly reporting in similar categories, e.g., in the anomaly table there is a type field with possible values such as "metadata" and "missing entries". This value list will grow as we identify more buckets of anomalies.

lewismc commented 1 year ago

I really like the sound of that, yes. I was also thinking that we could avoid creating a new table and instead add a report column to the submission table. However, getting data out then becomes a bit more tricky because we have to use non-standard/complex data types to represent key-values, e.g. {"metadata", "This is a description of the metadata anomaly"}, rather than explicit rows, which make it really easy to query for all anomalies of a particular type for a given submission. I think we can implement the dedicated anomaly table with the foreign key and types as you suggested. We don't need to make the anomaly type an ENUM right now.
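
To illustrate why explicit rows are easy to query, here is a minimal sketch of a dedicated anomaly table with a foreign key and a plain-text type column. It uses Python's built-in `sqlite3` purely as a stand-in so the example runs without a database server; the real schema would be PostgreSQL, and every column name other than `submission_id` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

# Hypothetical, simplified schema: one row per anomaly, linked to its submission.
conn.executescript("""
CREATE TABLE submission (
    submission_id INTEGER PRIMARY KEY
);
CREATE TABLE anomaly (
    anomaly_id    INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL REFERENCES submission(submission_id),
    type          TEXT NOT NULL,   -- free text for now, not an ENUM
    description   TEXT NOT NULL
);
""")

conn.execute("INSERT INTO submission (submission_id) VALUES (1)")
conn.executemany(
    "INSERT INTO anomaly (submission_id, type, description) VALUES (?, ?, ?)",
    [
        (1, "metadata", "hypothetical: unknown attribute name"),
        (1, "missing entries", "hypothetical: required attribute absent"),
    ],
)

# All anomalies of one type for one submission is a plain WHERE clause.
rows = conn.execute(
    "SELECT description FROM anomaly "
    "WHERE submission_id = ? AND type = ? ORDER BY anomaly_id",
    (1, "metadata"),
).fetchall()
print(rows)
```

With a single report column on submission, the same question would require unpacking a structured value for every row before filtering.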

tagtuna commented 1 year ago

I think an anomaly table is a cleaner way to organize things, and it's easier to use as well, so we don't have to bend over backwards to fit things into submission.

lewismc commented 1 year ago

Agreed. Thanks. I'll implement.

vtsontos commented 1 year ago

Hi guys, thinking a bit more about this, I think it could be good to have an "Events_Log" table that would capture the status of all key database event operations, and whether they succeeded or encountered anomalies, along with whatever descriptive information can be recorded. A standardized event_status code table could be devised. See the attached table proposal with examples. I think this approach allows us to break down and record outcomes for each step in the process in a consistent manner, and should be extensible to allow for additions/changes in future. Let me know what you think.

Tagbase_EventsTableProposal.xlsx

lewismc commented 1 year ago

I like it, @vtsontos. I'll implement that.

lewismc commented 1 year ago

This issue now supersedes #173. Essentially, the parts which can be cherry-picked are:

CREATE TYPE status_enum AS ENUM ('FAILED', 'FINISHED', 'KILLED', 'MIGRATION', 'POSTMIGRATION', 'PREMIGRATION');

ALTER TABLE ONLY event_log
    ADD CONSTRAINT event_log_submission_fkey FOREIGN KEY (submission_id, tag_id) REFERENCES submission(submission_id, tag_id);
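
As a rough illustration of how per-step events might be recorded and summarized against the status values in the `status_enum` above, consider the sketch below. Only the status values and the `submission_id` and `tag_id` keys come from this thread; the `Event` class, its `step` and `notes` fields, and the `summarize` helper are assumptions made for the example, not the eventual event_log design.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Values copied from the status_enum definition above.
    FAILED = "FAILED"
    FINISHED = "FINISHED"
    KILLED = "KILLED"
    MIGRATION = "MIGRATION"
    POSTMIGRATION = "POSTMIGRATION"
    PREMIGRATION = "PREMIGRATION"


@dataclass
class Event:
    """One logged operation for a submission (hypothetical shape)."""
    submission_id: int
    tag_id: int
    step: str           # hypothetical label for the operation being logged
    status: Status
    notes: str = ""     # free-text description of what happened


def summarize(events):
    """Count events per status value."""
    return dict(Counter(e.status.value for e in events))


events = [
    Event(1, 10, "hypothetical step A", Status.FINISHED),
    Event(1, 10, "hypothetical step B", Status.FAILED, "hypothetical: bad row"),
    Event(1, 10, "hypothetical step C", Status.FINISHED),
]

summary = summarize(events)
print(summary)
```

Recording every step with a status, rather than only the anomalies, is what lets successes and failures be reported in the same consistent way.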