Closed jordanpadams closed 1 year ago
Status: Needs input from DDWG
@jordanpadams
This has been put into the sprint backlog for work. What is the resolution of the DDWG? Do we do 16#ef# or 0xef like the rest of the world, or both?
bah. ok. let me check
As an affected party, I vote for the 0xef format like the rest of the world. The 16#ef# format was a holdover from the PDS3 labels, and is very easy to convert to a standard format. We've been unable to use any version of validate after 3.1.1 due to this issue.
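For illustration, the conversion mentioned above can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes the PDS3-style values always have the form radix#digits#, and the helper name radix_to_0x is hypothetical, not something validate or any PDS tool provides.

```python
import re

# Matches PDS3-style based integers such as 16#FF7FFFFB# (radix, '#', digits, '#').
_BASED = re.compile(r"^(\d+)#([0-9A-Za-z]+)#$")


def radix_to_0x(value: str) -> str:
    """Rewrite a radix#digits# literal as a 0x-prefixed hex string (hypothetical helper)."""
    m = _BASED.match(value.strip())
    if m is None:
        return value  # not in radix#digits# form, so leave it untouched
    radix, digits = int(m.group(1)), m.group(2)
    number = int(digits, radix)  # raises ValueError if the digits do not fit the radix
    if radix == 16:
        # Keep the original digit count so leading zeros in a bit pattern survive.
        return f"0x{number:0{len(digits)}X}"
    return f"0x{number:X}"


print(radix_to_0x("16#ef#"))
print(radix_to_0x("16#FF7FFFFB#"))
```

Values that are already in 0x form pass through unchanged, so the helper could be run over a mix of old and new labels.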
@kbowley-asu
Do you have a small example that I could use to understand the full extent of what you are requesting? Is it that you can no longer put 16#ef# into the XML, that validate is no longer processing it well, or a bit of both?
Our PDS3 labels have the funky 16#@..# formatted values, so originally we had those in our PDS4 labels, but after validate stopped being able to validate anything, we looked closer and were more than happy to change to the standard format. Here's an example from what we are currently generating:
<Array_2D_Image>
  <local_identifier>Array_2D_Image</local_identifier>
  <offset unit="byte">10560</offset>
  <axes>2</axes>
  <axis_index_order>Last Index Fastest</axis_index_order>
  <Element_Array>
    <data_type>IEEE754LSBSingle</data_type>
    <unit>I/F</unit>
  </Element_Array>
  <Axis_Array>
    <axis_name>Line</axis_name>
    <elements>20748</elements>
    <sequence_number>1</sequence_number>
  </Axis_Array>
  <Axis_Array>
    <axis_name>Sample</axis_name>
    <elements>704</elements>
    <sequence_number>2</sequence_number>
  </Axis_Array>
  <Special_Constants>
    <missing_constant>0xFF7FFFFB</missing_constant>
    <high_instrument_saturation>0xFF7FFFFE</high_instrument_saturation>
    <high_representation_saturation>0xFF7FFFFF</high_representation_saturation>
    <valid_minimum>0xFF7FFFFA</valid_minimum>
    <low_instrument_saturation>0xFF7FFFFD</low_instrument_saturation>
    <low_representation_saturation>0xFF7FFFFC</low_representation_saturation>
  </Special_Constants>
</Array_2D_Image>
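As a rough illustration of reading these values back out, here is a small Python sketch that pulls the 0x-style constants from a Special_Constants fragment. It assumes a namespace-free fragment like the one above (a full PDS4 label declares the PDS4 namespace, which would need handling), and the function name hex_constants is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed, namespace-free fragment based on the label excerpt above.
FRAGMENT = """
<Special_Constants>
  <missing_constant>0xFF7FFFFB</missing_constant>
  <valid_minimum>0xFF7FFFFA</valid_minimum>
</Special_Constants>
"""


def hex_constants(xml_text: str) -> dict[str, int]:
    """Return {element name: integer bit pattern} for every 0x-prefixed child."""
    root = ET.fromstring(xml_text)
    found = {}
    for child in root:
        text = (child.text or "").strip()
        if text.lower().startswith("0x"):
            found[child.tag] = int(text, 16)  # int() accepts the 0x prefix in base 16
    return found


for name, bits in hex_constants(FRAGMENT).items():
    print(name, f"0x{bits:08X}")
```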
@kbowley-asu
If you use a validating XML editor with the PDS4 schema, does this XML with the 0xef style validate? Does it also pass the PDS4 schematron? If you do not understand either of those questions, then a product example along with referenced files would be nice. Keeping it to a few 10s of KB would be nice too but I will take whatever I can get right now.
I understand those references, but have not run any of the labels through any other validation (besides general xml parsing engines like nokogiri), and I use vim (with syntax highlighting) for editing. The image for the label I pulled that example out of is only 56M, so relatively very small. https://pds.lroc.asu.edu/data/LRO-L-LROC-3-CDR-V1.0/LROLRC_1054C/DATA/ESM5/2023047/WAC/M1431146123CC.xml and https://pds.lroc.asu.edu/data/LRO-L-LROC-3-CDR-V1.0/LROLRC_1054C/DATA/ESM5/2023047/WAC/M1431146123CC.IMG
@jordanpadams @kbowley-asu
I see. Without validate.sh content validation it passes, which means that the syntax is legal from the schema and schematron standpoint. Since we are not supposed to check what the schema/schematron already checks, I am going to add the processing for 0x now; if it becomes illegal through schema/schematron changes, then what to do will become clear.
Yes... I guess I should have clarified that validate is perfectly happy if I use the --skip-content-validation option.
@al-niessner sounds good! Thanks!
I have a working fix for this example. It can be expanded quickly if not all primitive types are covered.
@jordanpadams @kbowley-asu
Okay, my mistake. The fix is horrible. It is totally susceptible to bit rounding, but not in the way we think: I am getting two numbers that seem the same but are not. The minimum is 0xFF7FFFFA, and after reading the data in from the image it is showing as 0xFF7FFFFB, which, as it turns out, is less than the limit. So bit patterns may not be all that helpful. There are lots of such points in the file you referenced earlier today. @kbowley-asu can you verify (I like independent verification) that there are or are not floats in the file that are 0xFF7FFFFB?
0xFF7FFFFB is listed as the missing_constant in the Special_Constants. My understanding is that anything below valid_minimum is not part of the valid data and should be defined by one of the Special_Constants.
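To make the comparison concrete, here is a short Python sketch that decodes the two bit patterns as IEEE 754 single-precision floats. It is not how validate handles this; bits_to_float32 is a hypothetical helper, and the patterns are treated as most-significant-byte-first hex literals, independent of the LSB byte order of the data file.

```python
import struct


def bits_to_float32(hex_literal: str) -> float:
    """Interpret a 0x-prefixed 32-bit pattern as an IEEE 754 single-precision value."""
    raw = int(hex_literal, 16).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]


valid_minimum = bits_to_float32("0xFF7FFFFA")
missing_constant = bits_to_float32("0xFF7FFFFB")

# Printed with only a few significant digits the two values look identical...
print(f"{valid_minimum:.6g}", f"{missing_constant:.6g}")
# ...but they are distinct floats, one unit in the last place apart.
print(valid_minimum == missing_constant)
# The sign bit is set, so the larger bit pattern is the more negative float.
print(missing_constant < valid_minimum)
```

As unsigned integers, 0xFF7FFFFB is greater than 0xFF7FFFFA, but as floats the order is reversed because both values are negative. That is consistent with the missing constant falling below valid_minimum.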
@kbowley-asu thank you for confirming existence. Was able to fix the problem and your images once again pass along with our regression tests that check other aspects of the special constants.
Checked for duplicates
No - I haven't checked
🧑🔬 User Persona(s)
Data Engineer, Data Provider
💪 Motivation
...so that I can use bit patterns in most (if not all) Special_Constants attributes
⚙️ Engineering Details
Related to discussion starting here: https://github.com/NASA-PDS/validate/issues/611#issuecomment-1555457236