pds-data-dictionaries / PDS4-LDD-Issue-Repo

Issue repository for tracking all PDS4 Discipline Dictionary-related issues, new feature requests, and releases.
Apache License 2.0
2 stars 1 forks source link

[ldd-spectral] Consider units when comparing magnitudes of `sp:first_center_*` and `sp:last_center_*`? #275

Closed cgobat closed 5 months ago

cgobat commented 10 months ago

Describe the issue identified (if applicable) There is a schematron rule that mandates

In the Uniformly_Sampled classes, the first_center value must be less than the last_center value.

However, consider the following case:

<sp:Bin_Description>
    <sp:bin_profile_description>The spectral bins span wavelengths 451.30 nm to approximately 1000 nm.</sp:bin_profile_description>
    <sp:Uniformly_Sampled_Wavelength>
        <sp:axis_name>Spectral axis</sp:axis_name>
        <sp:sampling_interval_wavelength unit="nm">2.0</sp:sampling_interval_wavelength>
        <sp:sampling_scale>Linear</sp:sampling_scale>
        <sp:bin_width_wavelength unit="Angstrom">9.6</sp:bin_width_wavelength>
        <sp:first_center_wavelength unit="Angstrom">4513.0</sp:first_center_wavelength>
        <sp:last_center_wavelength unit="micrometer">1</sp:last_center_wavelength>
    </sp:Uniformly_Sampled_Wavelength>
</sp:Bin_Description>

Physically speaking, this should satisfy the rule (4513 Å < 1 μm), but when units are not ignored, obviously 4513 ≮ 1, so this fails validation.

The same applies to the energy, frequency, and wavenumber versions of these classes/attributes too.

Describe the solution you'd like I recognize that implementing a fully-fledged unit conversion engine within a schematron rule may not be practical. At the same time, however, only a finite number of unit names are valid for these attributes, and within each sp:Uniformly_Sampled_* type, all possible units are simply metric multiples of one another, so it may not be too insurmountable after all.

Another functional alternative that would likely be much simpler would be to "ignore" this check if the units of the two fields differ at all, and maybe emit a warning to make sure users are aware of the situation and know what they're doing.

Describe alternatives you've considered This could remain unchanged, in which case the units of the two fields must match in order for this rule to work properly.

LDD Dictionary Version 1.3.1.1 (latest)

PDS4 IM Version N/A (all)

matthewtiscareno commented 10 months ago

This issue only arises if first_center_wavelength and last_center_wavelength have different units, correct?

Is it allowed (and/or good practice) to do that?

cgobat commented 10 months ago

Yep, that's right. That's the point of this issue.

Differing units are allowed, at least in the sense that there doesn't seem to be any rule in the schema/schematron against it. The message I quoted at the top of the issue is all that validate has to say when you give it something like the example snippet above.

It may or may not be good practice, but that's a different conversation.

matthewtiscareno commented 10 months ago

So an obvious solution is for the data provider to choose one unit for a given quantity and stick with it. Changing units as you suggest may not be forbidden, but EN might legitimately balk at expending effort to make it work in Validate — unless there is a strong statement of need from a data provider. Do you have such?

acraugh commented 8 months ago

Practically speaking, the effort required to code the test with differing units in Schematron is ridiculously expensive, and the result is likely to mislead both human users and programs. So the only reasonable alternative is to code a test to require that corresponding min/max values have the same unit of measure, which is straight-up character comparison that is relatively cheap to execute. I'll add this to my "Dictionary Clean-up" list.

acraugh commented 5 months ago

The proposed fix was implements in the 1.3.2.0 version of the LDD release with IM 1.22.0.0. I forget to note it and then did not successfully guess how to add it, (https://github.com/pds-data-dictionaries/ldd-spectral/pull/37)