NASA-PDS / validate

Validates PDS4 product labels, data and PDS3 Volumes
https://nasa-pds.github.io/validate/
Apache License 2.0
16 stars 11 forks source link

Validate Table_Character groups and their specified lengths match the specified group_length #63

Closed msbentley closed 4 years ago

msbentley commented 5 years ago

Describe the bug

I am validating a Table_Character using one group with 6 fields, having a total length of 69 bytes and 10 repetitions. The label wrongly specified group_length to be 699 bytes and validation was successful. The python PDS4 tool, however, flags this as invalid and will not open it.

To Reproduce

See the attached sample product - group_length is incorrectly set but validation passes. In fact any value between 690 and 699 (inclusive) results in validate passing the product, not only the correct value of 690.

validate_test_group_len.zip

Expected behavior validate should check the group_length against the total length of the fields declared in the label.

Version of Software Used 1.15.0

Applicable Requirements L5.PRP.VA.18

hhlee445 commented 4 years ago

@msbentley i'm getting a following error, when I try to fix this issue. Begin Schema: http://psa.esa.int/psa/v1/PDS4_PSA_1101.xsd FATAL_ERROR [error.label.unresolvable_resource] Server chose TLSv1, but that protocol version is not enabled or not supported by the client. End Schema: http://psa.esa.int/psa/v1/PDS4_PSA_1101.xsd Begin Schema: http://psa.esa.int/psa/bc/v1/PDS4_PSA_BC_1004.xsd FATAL_ERROR [error.label.unresolvable_resource] Server chose TLSv1, but that protocol version is not enabled or not supported by the client. End Schema: http://psa.esa.int/psa/bc/v1/PDS4_PSA_BC_1004.xsd Begin Schematron: http://psa.esa.int/psa/v1/PDS4_PSA_1101.sch FATAL_ERROR [error.label.schematron] Server chose TLSv1, but that protocol version is not enabled or not supported by the client. End Schematron: http://psa.esa.int/psa/v1/PDS4_PSA_1101.sch Begin Schematron: http://psa.esa.int/psa/bc/v1/PDS4_PSA_BC_1004.sch FATAL_ERROR [error.label.schematron] Server chose TLSv1, but that protocol version is not enabled or not supported by the client. End Schematron: http://psa.esa.int/psa/bc/v1/PDS4_PSA_BC_1004.sch

Do you have an updated test file with 'https'?

msbentley commented 4 years ago

Hi @hhlee445 I'm afraid not - we are currently in the process of upgrading the PSA server to support a higher TLS version and getting the certificate for https. In the meantime, here is a version with the PSA classes removed:

validate_test_group_len_nopsa.zip

Thanks!

jordanpadams commented 4 years ago

@hhlee445 you can specify the schemas and schematrons via the command-line for testing purposes.

hhlee445 commented 4 years ago

Added a warning message before the report generation.

WARNING: GroupFieldCharacter attribute group_length is larger than size of contained fields: 690>680

hhlee445 commented 4 years ago

Replaced a warning message to an error message.

% ./bin/validate -t /Users/hyunlee/dev/pds_en/validate/issue_63/validate_test_group_len_nopsa/isa_raw_sc_nominal_obs_00013_20190801.xml

ERROR: GroupFieldCharacter attribute group_length is larger than size of contained fields: 690>680.

hhlee445 commented 4 years ago

@jordanpadams do you want to see the exception as

java.lang.Exception: ERROR: GroupFieldCharacter attribute group_length is larger than size of contained fields: 690>680.

at gov.nasa.pds.objectAccess.table.TableCharacterAdapter.expandGroupField(TableCharacterAdapter.java:131)
at gov.nasa.pds.objectAccess.table.TableCharacterAdapter.expandFields(TableCharacterAdapter.java:65)
at gov.nasa.pds.objectAccess.table.TableCharacterAdapter.<init>(TableCharacterAdapter.java:56)
at gov.nasa.pds.objectAccess.table.AdapterFactory.getTableAdapter(AdapterFactory.java:57)
at gov.nasa.pds.objectAccess.TableReader.<init>(TableReader.java:105)
at gov.nasa.pds.tools.validate.content.table.RawTableReader.<init>(RawTableReader.java:61)
at gov.nasa.pds.tools.validate.rule.pds4.TableDataContentValidationRule.validateTableDataContents(TableDataContentValidationRule.java:191)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at gov.nasa.pds.tools.validate.rule.AbstractValidationRule.execute(AbstractValidationRule.java:63)
at org.apache.commons.chain.impl.ChainBase.execute(ChainBase.java:191)
at gov.nasa.pds.tools.validate.task.ValidationTask.execute(ValidationTask.java:134)
at gov.nasa.pds.tools.validate.task.BlockingTaskManager.submit(BlockingTaskManager.java:27)
at gov.nasa.pds.tools.label.LocationValidator.validate(LocationValidator.java:163)
at gov.nasa.pds.validate.ValidateLauncher.doValidation(ValidateLauncher.java:1226)
at gov.nasa.pds.validate.ValidateLauncher.processMain(ValidateLauncher.java:1423)
at gov.nasa.pds.validate.ValidateLauncher.main(ValidateLauncher.java:1466)

and exit before writing the report?

jordanpadams commented 4 years ago

@hhlee445 I was thinking the method would throw the error, and then the validate code would catch the exception, and add a new ValidationProblem to the report and continue on with the rest of the validation rules. Similar to other errors caught during validation, we should try to catch as many as we can in one run versus failing on one product and stopping execution.

depending on where this happens in validate, there are couple of examples around the code for creating a new ValidationProblem to work from (maybe more?):

      handler.addProblem(new ValidationProblem(
          new ProblemDefinition(ExceptionType.ERROR,
              ProblemType.MISSING_SCHEMATRON_SPEC,
              "No schematron specification found in the label."), 
          systemIdUrl));

or maybe

          getProblemHandler().addProblem(
              new ValidationProblem(
                  new ProblemDefinition(
                      ExceptionType.ERROR,
                      ProblemType.CATALOG_UNRESOLVABLE_RESOURCE,
                      io.getMessage()), 
                  baseUrl));
rchenatjpl commented 4 years ago

Validate now complains about the attached, presumably because repetitions*field_length=20 while group_length=32. However, both pds4_viewer and PDSView display the data correctly. Are the viewers behaving correctly by skipping the extra 3 bytes between repetitions? Archive.zip

rchenatjpl commented 4 years ago

Actually, I guess it's ok for the viewers to be more lenient. It seems strange that both would be lenient in a way that requires more math in figuring out what the spacing between repetitions is.

jordanpadams commented 4 years ago

@rchenatjpl they both use the same library to read the PDS4 labels, so that is most likely why they are both lenient. As long as the invalidity is valid (that made sense, right?), then I think this is OK?