NASA-PDS / validate

Validates PDS4 product labels, data and PDS3 Volumes
https://nasa-pds.github.io/validate/
Apache License 2.0

Duplicated warning.label.schema messages #946

Open rgdeen opened 3 months ago

rgdeen commented 3 months ago

Checked for duplicates

No - I haven't checked

🐛 Describe the bug

I was (re)validating the m20 release 9 bundle mars2020_cachecam_ops_raw for testing. Due to a bug in the velocity templates, all the products have this error, appropriately tagged as a warning:

        {
          "severity": "WARNING",
          "type": "warning.label.schema",
          "message": "The schema version(s) [/PDS4//MSN/V1/PDS4_MSN_1G00_1300] does/do not match the schematron version(s) [/PDS4/MSN/V1/PDS4_MSN_1G00_1300]."
        },

(Note the double slash between PDS4 and MSN in the schema version.)
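To show that the two identifiers in the message differ only by that doubled slash, here is a minimal Python sketch. The helper name `collapse_slashes` is hypothetical and is not part of validate; the two strings are copied from the warning above.

```python
import re

def collapse_slashes(path):
    """Replace any run of consecutive slashes with a single slash."""
    return re.sub(r"/+", "/", path)

# Strings copied from the warning message in this report
schema_version = "/PDS4//MSN/V1/PDS4_MSN_1G00_1300"
schematron_version = "/PDS4/MSN/V1/PDS4_MSN_1G00_1300"

print(schema_version == schematron_version)                    # False
print(collapse_slashes(schema_version) == schematron_version)  # True
```

So the warning itself is legitimate; the problem reported here is only how many times it is repeated.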

However... I am getting hundreds of these identical messages per file!!

The summary at the end shows the appropriate number of warnings:

  "summary": {
    "totalProducts": 1611,
    "totalErrors": 0,
    "totalWarnings": 804,
    "productValidation": {
      "passed": "1611",
      "failed": "0",
      "skipped": "0",
      "total": "1611"
    },
    "referentialIntegrity": {
      "passed": "1611",
      "failed": "0",
      "skipped": "0",
      "total": "1611"
    },
    "messageTypes": [
      {
        "messageType": "warning.label.schema",
        "total": "804"
      }
    ]

However, grep piped to wc finds more than 2 million lines containing WARNING:

$ grep WARNING ../../testing/mars2020/release9/bundleval/bundle_val_mars2020_cachecam_ops_raw.json | wc
2266075 4532150 74780474
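Since grepping a log this large is awkward, a rough Python sketch like the following could compare the total and unique warning counts in each message list of the JSON report. It assumes only that messages are dicts with `severity`, `type`, and `message` keys, as in the excerpt above; it walks the whole document rather than relying on specific key names, because the full report layout is not shown here. The `products` and `messages` keys in the sample are hypothetical stand-ins.

```python
def message_lists(node):
    """Yield every list, at any depth, whose items are all message dicts."""
    if isinstance(node, dict):
        for value in node.values():
            yield from message_lists(value)
    elif isinstance(node, list):
        if node and all(isinstance(i, dict) and "severity" in i for i in node):
            yield node
        else:
            for item in node:
                yield from message_lists(item)

def duplicate_report(report):
    """Return a (total, unique) pair of WARNING counts per message list."""
    result = []
    for messages in message_lists(report):
        warnings = [
            (m.get("type"), m.get("message"))
            for m in messages
            if m.get("severity") == "WARNING"
        ]
        result.append((len(warnings), len(set(warnings))))
    return result

# Hypothetical miniature report: the same warning repeated 1, 2, 3 times
w = {"severity": "WARNING", "type": "warning.label.schema", "message": "..."}
sample = {
    "products": [
        {"status": "PASS", "messages": [w]},
        {"status": "PASS", "messages": [w, w]},
        {"status": "PASS", "messages": [w, w, w]},
    ]
}
print(duplicate_report(sample))  # [(1, 1), (2, 1), (3, 1)]
```

For a real report, the parsed document would come from `json.load()` on the report file; any pair where the total exceeds the unique count marks a product with duplicated messages.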

Actually, the number of warnings in the summary may or may not be appropriate: there are about half as many warnings (804) as there are products (1611). I don't know whether that means some products did not have the warning; this log is too hard to search!

Command line:

$VALIDATE --target $WORKDIR_BUNDLE/${bundle} --report-file $val_log -R pds4.bundle --skip-content-validation --skip-product-validation -s json

Oddly, if I validate just one file on its own, I get just one warning.

Ooohhhhhhh.... here's a clue. I gave it a directory of 57 files and it did this:

$ egrep "WARNING|PASS" /tmp/x.x
    "severityLevel": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
          "severity": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
      "status": "PASS",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
          "severity": "WARNING",
      "status": "PASS",
...

So the number of warnings it prints for each file is tied to the number of files it has processed so far.
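The pattern in that egrep output can be checked mechanically. The sketch below counts the `"severity"` lines that follow each `"status"` line, assuming (as the ordering above suggests) that a product's status line precedes its messages. The function name is hypothetical, and the excerpt is rebuilt from the lines shown above, leaving out the final truncated product.

```python
def warnings_per_product(lines):
    """Count the "severity" lines that follow each "status" line."""
    counts = []
    for line in lines:
        text = line.strip()
        if text.startswith('"status"'):
            counts.append(0)       # a new product begins
        elif text.startswith('"severity"') and counts:
            counts[-1] += 1        # '"severityLevel"' does not match this prefix
    return counts

# Rebuild the excerpt: one header line, then 1, 2, ... 6 warnings per product
excerpt = ['    "severityLevel": "WARNING",']
for n in range(1, 7):
    excerpt.append('      "status": "PASS",')
    excerpt.extend(['          "severity": "WARNING",'] * n)

print(warnings_per_product(excerpt))  # [1, 2, 3, 4, 5, 6]
```

If the pattern held for every file (an assumption; the output above is truncated), n files would produce n(n+1)/2 warnings in total, which is 1653 for the 57 files in this directory.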

🕵️ Expected behavior

I expected one warning per file :-)

📜 To Reproduce

Discussed above. You can probably download the r9 labels, or contact me privately and I'll set you up with some.

🖥 Environment Info

linux

📚 Version of Software Used

validate 3.5.1

🩺 Test Data / Additional context

No response

🦄 Related requirements

🦄 #xyz

⚙️ Engineering Details

No response

🎉 Integration & Test

No response

rgdeen commented 3 months ago

The 800 vs. 1600 discrepancy is because the error exists only in the image products, not the browse products. So about 800 warnings is the appropriate number.

jordanpadams commented 3 months ago

@rgdeen I will add this to the backlog.