NASA-IMPACT / pyQuARC

The pyQuARC tool reads and evaluates metadata records with a focus on the consistency and robustness of the metadata. pyQuARC flags opportunities to improve or add to contextual metadata information in order to help the user connect to relevant data products. pyQuARC also ensures that information common to both the data product and the file-level metadata is consistent and compatible. pyQuARC frees up human evaluators to make more sophisticated assessments, such as whether an abstract accurately describes the data and provides the correct contextual information. The base pyQuARC package assesses descriptive metadata used to catalog Earth observation data products and files. As open source software, pyQuARC can be adapted and customized by data providers to allow for quality checks that evolve with their needs, including checking metadata not covered in the base package.
Apache License 2.0

DIF10: Access_Constraints and Use_Constraints throw whitespace error #171

Status: Closed (closed by jenny-m-wood 11 months ago)

jenny-m-wood commented 2 years ago

Describe the bug
pyQuARC produces an error for DIF10/Access_Constraints and DIF10/Use_Constraints when they contain anything other than whitespace.

To Reproduce
Steps to reproduce the behavior:

  1. Run the following command: python3 main.py --format dif10 --concept_ids C1236350906-GES_DISC
  2. See error: "Error: Character content other than whitespace is not allowed because the content type is 'element-only'." for DIF10/Access_Constraints and DIF10/Use_Constraints

Expected behavior
No error was expected.

Additional context
Similar to a bug from ECHO10 checks: https://github.com/NASA-IMPACT/pyQuARC/issues/71

xhagrg commented 2 years ago

@jenny-m-wood what version of the package are you using?

jenny-m-wood commented 2 years ago

@xhagrg Version 1.1.4. This bug was found after integrating pyQuARC into the dashboard. I also encountered this issue when I ran version 1.1.4 on my local machine using the dif10 test file.

Jeanne-le-Roux commented 2 years ago

One reason this could be happening for Use_Constraints: the DIF10.2 schema has been updated to include a number of sub-elements under 'Use_Constraints': https://git.earthdata.nasa.gov/projects/EMFD/repos/dif-schemas/browse/10.x/dif_v10.2.xsd#1552
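
To check whether a given record is affected, here is a minimal sketch that flags Access_Constraints or Use_Constraints elements holding text directly instead of sub-elements. It uses only the Python standard library and is not part of pyQuARC; the names `ELEMENT_ONLY`, `local_name`, and `find_bare_text` are hypothetical, and the sample record is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the two elements reported in this issue.
ELEMENT_ONLY = {"Access_Constraints", "Use_Constraints"}


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_bare_text(xml_text: str) -> list:
    """Return local names of the listed elements that hold direct text."""
    root = ET.fromstring(xml_text)
    flagged = []
    for elem in root.iter():
        if local_name(elem.tag) not in ELEMENT_ONLY:
            continue
        # elem.text is the text before the first child;
        # each child's tail is the text that follows that child.
        chunks = [elem.text or ""] + [child.tail or "" for child in elem]
        if any(chunk.strip() for chunk in chunks):
            flagged.append(local_name(elem.tag))
    return flagged


# Made-up record: the first element uses a bare text block,
# the second uses a Description sub-element.
old_style = """<DIF>
  <Access_Constraints>None</Access_Constraints>
  <Use_Constraints><Description>Free to use.</Description></Use_Constraints>
</DIF>"""

print(find_bare_text(old_style))  # prints ['Access_Constraints']
```

This only detects the condition named in the error message. It does not validate the record against the DIF schema.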

jenny-m-wood commented 1 year ago

@slesaad @xhagrg Update on this - this is still an issue. Our team is not sure how to resolve it.

slesaad commented 1 year ago

It looks like it's the correct behavior.

The dif10 schema file has this:

<xs:complexType name="AccessConstraintsType">
        <xs:annotation>
            <xs:appinfo><details></details></xs:appinfo>
            <xs:appinfo><action>none</action></xs:appinfo>            
            <xs:documentation>

            | DIF 9               | ECHO 10 | UMM               | DIF 10             | Notes          |
            | ------------------- | ------- | ------------------| ------------------ | -------------  |
            | Access_Constraints  |    -    | AccessConstraints | Access_Constraints | No change      |

            </xs:documentation>
            <xs:appinfo><action>changed</action>
                <src>CMRSCI-2700</src>
                <since>10.3</since>
                <note>Added fields to Support Access Control</note>
                <xs:documentation>
                    * = New DIF Field 
                    x = DIF field renamed

                    | DIF 10.2                      | DIF-10.3                                               | UMM                                | Notes       |
                    | ----------------------------- | -------------------------------------------------------|--------------------------------------------------| 
                    | Access_Constraints (0..1)     | Access_Constraints (0..1)                              | AccessConstraints (0..1)           |             |
                    | Access_Constraints/Text Block | * Access_Constraints/Description (1)                   | AccessConstraints/Description (1)  |             |
                    | N/A                           | * Access_Constraints/Access_Control (0..1)             | AccessConstraints/Value (0..1)     |             |          
                    | N/A                           | * Access_Constraints/Access_Control_Description (0..1) | N/A                                |             |
                </xs:documentation>
            </xs:appinfo>
        </xs:annotation>
        <xs:sequence>
        <xs:choice>
            <xs:element name="Description" minOccurs="0">
                <xs:annotation>
                    <xs:documentation>This sub-element is a free-text description that details access constraints of this collection.</xs:documentation>
                </xs:annotation>
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:minLength value="1" />
                        <xs:maxLength value="4000" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="Access_Control" minOccurs="0">
                <xs:simpleType> 
                    <xs:restriction base="xs:integer"> 
                        <xs:minInclusive value="0"/> 
                        <xs:maxInclusive value="255"/> 
                    </xs:restriction> 
                </xs:simpleType>
            </xs:element>    
        </xs:choice>
        <xs:element name="Access_Control_Description" minOccurs="0">
            <xs:annotation>
                   <xs:documentation>This sub-element is a free-text description that details the Access Control.</xs:documentation>
               </xs:annotation>
               <xs:simpleType>
                   <xs:restriction base="xs:string">
                         <xs:minLength value="1" />
                         <xs:maxLength value="4000" />
                   </xs:restriction>
               </xs:simpleType>
           </xs:element>
          </xs:sequence>
    </xs:complexType>

    <!-- *********************************************************** -->
    <!-- #mark Use_Constraints -->

    <xs:complexType name="UseConstraintsType">
        <xs:annotation>
            <xs:appinfo><details></details></xs:appinfo>
            <xs:appinfo><action>changed</action><src>UMM (DIF)</src></xs:appinfo>
            <xs:documentation>

                | ECHO 10 | UMM               | DIF 10             | Notes          |
                | ------- | ------------------| ------------------ | -------------  |
                |    -    | UseConstraints    | Use_Constraints    | No change      |

            </xs:documentation>
            <xs:appinfo><action>changed</action>
                <src>CMRSCI-2700, ECSE-171</src>
                <since>10.3</since>
                <note>Added fields to Support licensing elements in UMM Models</note>
                <xs:documentation>
                    * = New DIF Field 
                    x = DIF field renamed

                    | DIF 10.2                   | DIF-10.3                                               | UMM                                                  | Notes       |
                    | ---------------------------| -------------------------------------------------------|------------------------------------------------------| ------------|
                    | Use_Constraints (0..1)     | Use_Constraints (0..1)                                 | UseConstraints (0..1)                                |             |
                    | Use_Constraints/Text Block | * Use_Constraints/Description (0..1)                   | UseConstraints/Description (0..1)                    |             |
                    | N/A                        | * Use_Constraints/License_Text (0..1)                  | UseConstraints/LicenseText (0..1)                    |             |
                    | N/A                        | * Use_Constraints/License_URL (0..1)                   | UseConstraints/LicenseURL (0..1)                     |             |
                    | N/A                        | * Use_Constraints/License_URL/URL (1..*)               | UseConstraints/LicenseURL/Linkage (1)                |             |
                    | N/A                        | * Use_Constraints/License_URL/Protocol (0..1)          | UseConstraints/LicenseURL/Protocol (0..1)            |             |
                    | N/A                        | * Use_Constraints/LicenseURL/ApplicationProfile (0..1) | UseConstraints/LicenseURL/ApplicationProfile (0..1)  |             |
                    | N/A                        | * Use_Constraints/License_URL/Title (0..1)             | UseConstraints/LicenseURL/Name (0..1)                |             |
                    | N/A                        | * Use_Constraints/License_URL/Description (0..1)       | UseConstraints/Description (0..1)                    |             |
                    | N/A                        | * Use_Constraints/Function (0..1)                      | UseConstraints/Function (0..1)                       |             |
                    | N/A                        | * Use_Constraints/License_URL/Mime_Type (0..*)         | UseConstraints/MimeType (0..*)                       |             |
                </xs:documentation>
            </xs:appinfo>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="Description" minOccurs="0">
                <xs:annotation>
                    <xs:documentation>This sub-element either contains a license summary or free-text description that details the permitted use or limitation of this collection.</xs:documentation>
                </xs:annotation>
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:minLength value="1" />
                        <xs:maxLength value="4000" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:choice>
                <xs:element name="License_URL" type="RelatedURLType"/>
                <xs:element name="License_Text" minOccurs="0">
                    <xs:annotation>
                        <xs:documentation>This element holds the actual license text. If this element is used the LicenseUrl element cannot be used.</xs:documentation>
                    </xs:annotation>
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:minLength value="1" />
                            <xs:maxLength value="20000" />
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
            </xs:choice>
        </xs:sequence>
        <xs:attribute type="DisplayableTextEnum" name="mime_type" default="text/markdown"/>
    </xs:complexType>

It seems both of these fields have child fields such as Description. So it makes sense that the element itself doesn't accept a text value; it should either be empty or contain child fields with their own values.

@jenny-m-wood no?
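
If records still carry the older text-block form, one possible fix on the metadata side is to move the text into a Description sub-element, following the "Text Block" to "Description" mapping in the schema's change tables above. The following is a rough sketch, not pyQuARC code; `wrap_text_in_description` is a hypothetical helper and the sample record is invented.

```python
import xml.etree.ElementTree as ET


def wrap_text_in_description(elem: ET.Element) -> bool:
    """Move direct text of `elem` into a leading <Description> child.

    Returns True if the element was changed, False if it had no direct text.
    """
    text = (elem.text or "").strip()
    if not text:
        return False
    # Reuse the parent's '{namespace}' prefix (empty if there is none)
    # so the new child lives in the same namespace.
    prefix = elem.tag[: elem.tag.rfind("}") + 1]
    description = ET.Element(prefix + "Description")
    description.text = text
    elem.text = None
    elem.insert(0, description)
    return True


# Invented record using the older text-block form.
record = ET.fromstring(
    "<DIF><Use_Constraints>Cite the data set.</Use_Constraints></DIF>"
)
wrap_text_in_description(record.find("Use_Constraints"))
print(ET.tostring(record, encoding="unicode"))
```

The sketch does not check for an existing Description child, for text placed after child elements, or for the 4000-character limit the schema sets on Description, so the result should still be validated against the schema.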