NASA-PDS / deep-archive

PDS Open Archival Information System (OAIS) utilities, including Submission Information Package (SIP) and Archive Information Package (AIP) generators
https://nasa-pds.github.io/deep-archive/

pds-deep-registry-archive produces invalid SIPs/AIPs #155

jordanpadams closed this issue 6 months ago

jordanpadams commented 7 months ago

Checked for duplicates

Yes - I've already checked

🐛 Describe the bug

When I ran pds-deep-registry-archive with a LIDVID and then attempted to validate the generated files, they were invalid.

🕵️ Expected behavior

I expected the files to be valid.

📜 To Reproduce

% pds-deep-registry-archive -s PDS_GEO -u https://pds.nasa.gov/api/search/1.0/ urn:nasa:pds:magellan_gxdr::1.0

INFO 👟 PDS Deep Registry-based Archive, version 1.2.0
INFO 📄 Wrote AIP checksum manifest magellan_gxdr_v1.0_20240130_checksum_manifest_v1.0.tab with 109 entries
INFO 📄 Wrote AIP transfer manifest magellan_gxdr_v1.0_20240130_transfer_manifest_v1.0.tab with 109 entries
INFO 📄 Wrote label for them both: magellan_gxdr_v1.0_20240130_aip_v1.0.xml
INFO 📄 Wrote SIP magellan_gxdr_v1.0_20240130_sip_v1.0.tab with 109 entries
INFO 📄 Wrote label for SIP: magellan_gxdr_v1.0_20240130_sip_v1.0.xml
INFO 👋 Thanks for using this program! Bye!
% validate-3.4.1/bin/validate --target *.xml

  FAIL: file:/Users/jpadams/proj/pds/pdsen/workspace/deep-archive/magellan_gxdr_v1.0_20240130_aip_v1.0.xml
      ERROR  [error.label.schematron]   line 27, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
      ERROR  [error.label.schematron]   line 27, 25: The value of the attribute lidvid_reference must start with either: urn:nasa:pds: or urn:esa:psa: or urn:jaxa:darts: or urn:ros:rssa: or urn:isro:isda:
    Begin Content Validation: file:/Users/jpadams/proj/pds/pdsen/workspace/deep-archive/magellan_gxdr_v1.0_20240130_transfer_manifest_v1.0.tab
      ERROR  [error.table.missing_CRLF]   data object transfer manifest or index 1, record 1: Record does not end in carriage-return line feed.
...

  FAIL: file:/Users/jpadams/proj/pds/pdsen/workspace/deep-archive/magellan_gxdr_v1.0_20240130_sip_v1.0.xml
      ERROR  [error.label.schematron]   line 77, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
      ERROR  [error.label.schematron]   line 77, 25: The value of the attribute lidvid_reference must start with either: urn:nasa:pds: or urn:esa:psa: or urn:jaxa:darts: or urn:ros:rssa: or urn:isro:isda:
        2 product validation(s) completed

Summary:

  113 error(s)
  0 warning(s)

  Product Validation Summary:
    0          product(s) passed
    2          product(s) failed
    0          product(s) skipped

  Referential Integrity Check Summary:
    0          check(s) passed
    0          check(s) failed
    0          check(s) skipped

  Message Types:
    109          error.table.missing_CRLF
    4            error.label.schematron
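
The two schematron messages in the output above describe two conditions on `lidvid_reference`: a colon count and an agency prefix. The following is a minimal Python sketch of those two conditions only, reconstructed from the wording of the error messages and not from the schematron rules themselves. It assumes that "(5:7)" means a range of 5 to 7 colons; the function name and the malformed sample value are hypothetical and are not taken from the generated labels.

```python
# Prefixes copied from the schematron error message quoted in this issue.
ALLOWED_PREFIXES = (
    "urn:nasa:pds:",
    "urn:esa:psa:",
    "urn:jaxa:darts:",
    "urn:ros:rssa:",
    "urn:isro:isda:",
)


def lidvid_reference_problems(value):
    """Return a list of problems; an empty list means both checks pass.

    Hypothetical helper: it mirrors the two error messages, nothing more.
    """
    problems = []
    colons = value.count(":")
    # Assumption: "(5:7)" in the message is an inclusive range of 5 to 7 colons.
    if not 5 <= colons <= 7:
        problems.append(f"expected 5 to 7 colons, found {colons}")
    if not value.startswith(ALLOWED_PREFIXES):
        problems.append("does not start with an allowed URN prefix")
    return problems


# The LIDVID passed on the command line in this report.
print(lidvid_reference_problems("urn:nasa:pds:magellan_gxdr::1.0"))

# A hypothetical malformed value with only two colons, for illustration.
print(lidvid_reference_problems("magellan_gxdr::1.0"))
```

Running a check like this over the `lidvid_reference` values in the generated labels before invoking validate could help confirm which values are malformed.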

🖥 Environment Info

macOS (also seen on Windows)

📚 Version of Software Used

v1.1.4

🩺 Test Data / Additional context

No response

🦄 Related requirements

No response

⚙️ Engineering Details

No response

nutjob4life commented 7 months ago

Wow good call, thanks for noticing @gxtchen and @jordanpadams!

nutjob4life commented 6 months ago

Hi @jordanpadams, these errors:

      ERROR  [error.label.schematron]   line 27, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
      ERROR  [error.label.schematron]   line 27, 25: The value of the attribute lidvid_reference must start with either: urn:nasa:pds: or urn:esa:psa: or urn:jaxa:darts: or urn:ros:rssa: or urn:isro:isda:

are easy enough to fix. However, when I run validate I get these errors for the file magellan_gxdr_v1.0_20240201_transfer_manifest_v1.0.tab:

  FAIL: file:/…/deep-archive/magellan_gxdr_v1.0_20240201_aip_v1.0.xml
    Begin Content Validation: file:/…/deep-archive/magellan_gxdr_v1.0_20240201_transfer_manifest_v1.0.tab
      ERROR  [error.table.missing_CRLF]   data object transfer manifest or index 1, record 1: Record does not end in carriage-return line feed.

One for each line of the file (109 lines).

However, notice this:

$ file magellan_gxdr_v1.0_20240201_transfer_manifest_v1.0.tab
magellan_gxdr_v1.0_20240201_transfer_manifest_v1.0.tab: ASCII text, with very long lines (511), with CRLF line terminators

Is Validate incorrect? Can I ignore this kind of CRLF message from Validate? Thanks in advance!
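
As an independent cross-check of the `file` result, the record terminators can also be inspected byte by byte. The sketch below is one possible way to do that in Python; the function name and the sample byte strings are hypothetical. It only tests whether each record ends in CR LF and says nothing about why Validate reports the error.

```python
def records_missing_crlf(data):
    """Return the 1-based numbers of records that do not end in CR LF.

    Hypothetical helper: bytes.splitlines() splits on LF, CR, and CR LF,
    so a record terminated by a bare LF or bare CR is reported here.
    """
    records = data.splitlines(keepends=True)
    return [
        number
        for number, record in enumerate(records, start=1)
        if not record.endswith(b"\r\n")
    ]


# Illustrative in-memory samples, not taken from the real manifest.
all_crlf = b"a\tb\r\nc\td\r\n"
mixed = b"a\tb\nc\td\r\n"

print(records_missing_crlf(all_crlf))
print(records_missing_crlf(mixed))
```

For a file on disk, the bytes can be read with `pathlib.Path(name).read_bytes()` and passed to the same function; an empty result would agree with what `file` reports.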

jordanpadams commented 6 months ago

@nutjob4life Can you file a ticket in validate so it sticks on my radar?

nutjob4life commented 6 months ago

@jordanpadams, done!

jordanpadams commented 6 months ago

Closed per https://github.com/NASA-PDS/deep-archive/issues/158