tagbase / tagbase-server

tagbase-server is a data management web service for working with eTUFF and nc-eTAG files.
https://oiip.jpl.nasa.gov/doc/OIIP_Deliverable7.4_TagbasePostgreSQLeTUFF_UserGuide.pdf
Apache License 2.0

Create granular checksum logic representing metadata, data, profile and global eTUFF characteristics #270

Closed lewismc closed 1 year ago

lewismc commented 1 year ago

The SHA-256 checksum (a 256-bit digest, 64 hexadecimal characters) calculated for each file and stored in submission.hash_sha256 satisfies initial duplicate detection. After #238, a new use case arose in which updates to the reference track need to be accommodated, e.g. a user ingests an initial file representing the reference track and subsequently ingests another file in which ONLY the flag_as_reference metadata value has changed. This would require updating only the metadata flag_as_reference value and data_position.flag_as_reference for the existing submission, which would save us from (re)ingesting all of the data and the track again.
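To illustrate why a single whole-file hash cannot separate this case from a genuinely new file, here is a minimal sketch using Python's standard hashlib. The two byte strings are hypothetical, heavily simplified stand-ins for eTUFF content (one metadata line, one data row), and sha256_hex is an illustrative helper, not a function from tagbase-server.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of content as a 64-character hex string."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical, simplified file contents: one metadata line and one data row.
# Only the flag_as_reference value differs between the two.
initial = b':flag_as_reference = "0"\n2020-01-01 00:00:00,1,10.5\n'
flagged = b':flag_as_reference = "1"\n2020-01-01 00:00:00,1,10.5\n'

initial_hash = sha256_hex(initial)
flagged_hash = sha256_hex(flagged)

# The digests differ even though the data row is identical, so the
# whole-file hash alone cannot tell us that only metadata changed.
print(initial_hash == flagged_hash)
```

With only the whole-file digest to go on, the second file looks like an entirely new submission.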

Essentially we need to extend the concept of submission.hash_sha256 to something like

This proposal has some subtleties, e.g. we need to determine whether a SHA-256 hash calculated directly from a file would match one calculated from a PostgreSQL response that supposedly represents the same data.
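One possible way to make the two hashes comparable is to hash a canonical rendering of the rows rather than the raw bytes. The sketch below is an assumption, not the tagbase-server implementation: canonical_value and section_sha256 are hypothetical helpers, the comma-separated row layout is simplified, and db_rows stands in for what a database driver might return (datetime and float values in a different order).

```python
import hashlib
from datetime import datetime
from decimal import Decimal, InvalidOperation


def canonical_value(value) -> str:
    """Render a value as text so that '10.50', 10.5 and '10.5' all match."""
    text = value.strip() if isinstance(value, str) else str(value)
    try:
        # Numeric values: drop trailing zeros and avoid exponent notation.
        return format(Decimal(text).normalize(), "f")
    except InvalidOperation:
        # Non-numeric values (timestamps, names) are kept as stripped text.
        return text


def section_sha256(rows) -> str:
    """Hash a collection of rows independently of row order and number formatting."""
    lines = sorted(",".join(canonical_value(v) for v in row) for row in rows)
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Rows as they might be parsed from a file (all strings).
file_text = "2020-01-01 00:00:00,1,10.50\n2020-01-02 00:00:00,1,11\n"
file_rows = [line.split(",") for line in file_text.splitlines()]

# Rows as a database driver might return them (typed values, other order).
db_rows = [
    (datetime(2020, 1, 2), 1, 11.0),
    (datetime(2020, 1, 1), 1, 10.5),
]

print(section_sha256(file_rows) == section_sha256(db_rows))
```

Hashing the raw file bytes and a raw query result would almost certainly disagree because of ordering, whitespace and numeric formatting, so both sides would need to pass through the same canonicalization before hashing. The same helper could be applied separately to metadata rows and data rows to get one checksum per section.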

tagtuna commented 1 year ago

I would add a second use case that is an extension of the one described above.

lewismc commented 1 year ago

Duplicate of #272