NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Ancestry #10

Closed alexdunnjpl closed 1 year ago

alexdunnjpl commented 1 year ago

🗒️ Summary

⚙️ Test Data and/or Report

Ancestry sweeper needs tests as part of I&T suite (as does provenance)

Functional tests implemented/pass for ancestry sweeper
Unit tests added for pds.registrysweepers.utils.productidentifiers

♻️ Related Issues

fixes #14

supports https://github.com/NASA-PDS/registry-api/issues/318
supports https://github.com/NASA-PDS/registry-api/issues/319

Some TODOs/minor-features outstanding:

@jordanpadams if that last one isn't actually important/useful, let me know.

alexdunnjpl commented 1 year ago

converting back to draft to ensure that outstanding minor items in OP aren't overlooked when I come back to this next week

alexdunnjpl commented 1 year ago

@tloubrieu-jpl ready for review

alexdunnjpl commented 1 year ago

@jordanpadams see https://github.com/NASA-PDS/registry-sweepers/pull/10/commits/95489170c96467974272a813e641d3047f04c37a for implementation and test-cases for support of alternative_ids

tloubrieu-jpl commented 1 year ago

@alexdunnjpl I removed the sprint-backlog tag from the PR, since it was bugging me on the ZenHub board.

alexdunnjpl commented 1 year ago

> Mostly not critical comments from me, but sonatype and Sean's comments require some changes.

@tloubrieu-jpl what of Sean's requires further action?

Regarding Sonatype, are you aware of a way to trigger a fresh sweep of a PR? I think possibly the force pushes are confusing it. Or are you just talking about the one issue mentioned in your recent comment?

alexdunnjpl commented 1 year ago

@tloubrieu-jpl regarding the outstanding sonatype error, ~it's a false positive - not picked up by mypy either.~ actually it's stale - that line is correctly annotated since 6c38aa5

I've removed some obsolete mypy ignores, but otherwise it should be good to go.

alexdunnjpl commented 1 year ago

Just realised there's one outstanding question for @jordanpadams. What exactly do you want the metadata keys to be?

Currently it's ops:Provenance/ops:parent_collection_identifiers and ops:Provenance/ops:parent_bundle_identifiers, but the pluralisation, while arguably correct, may be at odds with user expectations given fields like pds:Internal_Reference/pds:lid_reference which are collections but whose field name is singular.
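To make the two naming options concrete, here is a minimal sketch. The LIDVID values are made up, the singular key spellings are an assumption about what a renamed field would look like, and `singularise` and `ancestry_fields` are hypothetical helpers, not functions from this PR. The point is only that the value stays a list whichever name is used.

```python
# Illustrative only: the LIDVIDs are invented, and the singular spellings are
# an assumption, not something taken from the PR.
PLURAL_KEYS = {
    "bundle": "ops:Provenance/ops:parent_bundle_identifiers",
    "collection": "ops:Provenance/ops:parent_collection_identifiers",
}


def singularise(key: str) -> str:
    """Drop a trailing 's' from a metadata key (hypothetical helper)."""
    return key[:-1] if key.endswith("s") else key


SINGULAR_KEYS = {kind: singularise(key) for kind, key in PLURAL_KEYS.items()}


def ancestry_fields(parent_bundles, parent_collections, keys):
    """Build the ancestry part of a document; values are lists under either naming."""
    return {
        keys["bundle"]: sorted(parent_bundles),
        keys["collection"]: sorted(parent_collections),
    }


doc = ancestry_fields(
    {"urn:nasa:pds:example_bundle::1.0"},
    {
        "urn:nasa:pds:example_bundle:example_collection::1.0",
        "urn:nasa:pds:example_bundle:example_collection::2.0",
    },
    SINGULAR_KEYS,
)
```

Under the singular naming, a multi-valued field would read like pds:Internal_Reference/pds:lid_reference does: a singular name holding several values.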

jordanpadams commented 1 year ago

@alexdunnjpl let's go with the singular

tloubrieu-jpl commented 1 year ago

> > Mostly not critical comments from me, but sonatype and Sean's comments require some changes.
>
> @tloubrieu-jpl what of Sean's requires further action?

@alexdunnjpl I think Sean requested a change for this comment https://github.com/NASA-PDS/registry-sweepers/pull/10/files#r1192418906

tloubrieu-jpl commented 1 year ago

No, sorry, Sean said it does not need a change. Let's validate the PR.