tagbase / tagbase-server

tagbase-server is a data management web service for working with eTUFF and nc-eTAG files.
https://oiip.jpl.nasa.gov/doc/OIIP_Deliverable7.4_TagbasePostgreSQLeTUFF_UserGuide.pdf
Apache License 2.0

ISSUE-272 Support upserts and deletes on metadata only #274

Closed renato2099 closed 1 year ago

renato2099 commented 1 year ago

Added some unit tests, @lewismc. I will add more so that SonarCloud passes 👍

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

59.0% Coverage
0.0% Duplication

lewismc commented 1 year ago

We still need to implement logic to accommodate the following scenarios:

  1. not duplicate, metadata changed, content unchanged, same dataset (update existing submission)
  2. not duplicate, metadata changed, content changed, same dataset (new submission)
  3. not duplicate, metadata unchanged, content changed, same dataset (updated submission)

Once we do this we have a first pass at this feature.

tagtuna commented 1 year ago

@lewismc The simplest answer is that an "update" is carried out only when "metadata" alone has changed. Whenever "content" changes, we need a new submission. Please review the following table to see if this makes sense: https://docs.google.com/spreadsheets/d/15yeNzMa7R-TuL6KfWyk21-XzwAi9X9sno5G3Ua0Fc-I/edit?usp=sharing
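
A minimal sketch of that rule, to make the decision explicit. It assumes each stored submission keeps one hash of its metadata lines and one hash of its content lines; the names `Action`, `digest`, and `decide_action` are hypothetical and are not taken from the tagbase-server codebase.

```python
import hashlib
from enum import Enum


class Action(Enum):
    SKIP_DUPLICATE = "skip_duplicate"    # nothing changed
    UPDATE_METADATA = "update_metadata"  # only metadata changed
    NEW_SUBMISSION = "new_submission"    # content changed


def digest(lines):
    """Return a SHA-256 hex digest over an iterable of text lines."""
    h = hashlib.sha256()
    for line in lines:
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def decide_action(stored_metadata_hash, stored_content_hash,
                  metadata_lines, content_lines):
    """Apply the rule: any content change means a new submission;
    a metadata-only change means an update; otherwise it is a duplicate."""
    metadata_changed = digest(metadata_lines) != stored_metadata_hash
    content_changed = digest(content_lines) != stored_content_hash
    if content_changed:
        return Action.NEW_SUBMISSION
    if metadata_changed:
        return Action.UPDATE_METADATA
    return Action.SKIP_DUPLICATE
```

Checking `content_changed` first means the "metadata changed, content changed" case needs no branch of its own.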

lewismc commented 1 year ago

@renato2099 there were quite a few areas where I had to adjust parameters. There were some other bugs as well, which I can explain when we next sync.

If you look at the output of logger.info("len number_global_atttributes_lines: '%s' len lines_length: '%s'", number_global_atttributes_lines, lines_length) you will see that we are only processing one line! This is another bug which we need to fix.

The debugging here was interesting as it had been a while since I looked at this part of the codebase... around a month or so. This was a useful exercise and a refresher.

After we fix these lingering issues, test, and cut a release, we should have a tagup to ensure that our ingestion logic is as clean as it can be, e.g. only opening any given file at most once...
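
As a rough illustration of the "open at most once" idea, here is a sketch that reads a stream a single time and separates metadata lines from content lines. The function name `split_lines_once`, the predicate argument, and the sample input are hypothetical; the assumption that metadata lines start with ":" is only for this example and may not match the real eTUFF parsing code.

```python
import io
import logging

logger = logging.getLogger(__name__)


def split_lines_once(stream, is_metadata_line):
    """Read a text stream once, separating metadata lines from content lines."""
    metadata_lines, content_lines = [], []
    for raw in stream:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if is_metadata_line(line):
            metadata_lines.append(line)
        else:
            content_lines.append(line)
    logger.info(
        "metadata lines: '%s' content lines: '%s'",
        len(metadata_lines),
        len(content_lines),
    )
    return metadata_lines, content_lines


# Usage with an in-memory stream standing in for an opened file.
sample = io.StringIO(":a = 1\n\n1,2,3\n4,5,6\n")
sample_metadata, sample_content = split_lines_once(
    sample, lambda line: line.startswith(":")
)
```

Both lists come out of the same pass, so later steps such as hashing or validation can work from memory without reopening the file.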

renato2099 commented 1 year ago

> If you look at the output of logger.info("len number_global_atttributes_lines: '%s' len lines_length: '%s'", number_global_atttributes_lines, lines_length) you will see that we are only processing one line! This is another bug which we need to fix.

@lewismc I fixed that in the latest commit and added some additional assertions to the unit tests.

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
2 Code Smells (rating A)

40.5% Coverage
0.0% Duplication

lewismc commented 1 year ago

Hi @renato2099, this passed all CI checks and both of our own manual tests. I will open a new ticket for adding more unit tests to the ingestion logic so we can work on that in the next phase.

renato2099 commented 11 months ago

I agree with this, @lewismc. It's better to keep moving and address missing features separately; this PR is already big enough.