khaeru / sdmx

SDMX information model and client in Python
https://sdmx1.readthedocs.io
Apache License 2.0

Some metadata missing on codelists parsed from SDMX-ML #142

Closed goatsweater closed 8 months ago

goatsweater commented 9 months ago

When I inspect codelists parsed from SDMX-ML messages, the annotations and the valid-from and valid-to dates are all empty. The services I'm working with are internal, so to demonstrate I've created a small sample message and code. Normally these messages originate from an internal deployment of iStat or .Stat.

>>> import sdmx

>>> codelist_msg = sdmx.read_sdmx("cl_sample.xml")
>>> codelist_msg
<sdmx.StructureMessage>
  <Header>
    id: 'IDREF652'
    prepared: '2023-06-13T13:16:08.604607'
    receiver: <Agency unknown>
    sender: <Agency TEST>
    source: 
    test: False
  Codelist (1): CL_NAICS
>>> cl = codelist_msg.codelist[0]
>>> cl
<Codelist TEST:CL_NAICS(1.0) (1 items): North American Product Classification System>
>>> cl.annotations
[]
>>> cl.valid_from  # None value
>>> cl.valid_to  # None value

The code above reads the message below, which has three annotations plus validFrom and validTo dates on the codelist:

<?xml version='1.0' encoding='utf-8'?>
<mes:Structure xmlns:com="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common"
    xmlns:data="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/structurespecific"
    xmlns:str="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure"
    xmlns:mes="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:gen="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"
    xmlns:footer="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message/footer"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <mes:Header>
        <mes:ID>IDREF652</mes:ID>
        <mes:Test>false</mes:Test>
        <mes:Prepared>2023-06-13T13:16:08.604607</mes:Prepared>
        <mes:Sender id="TEST"/>
        <mes:Receiver id="unknown"/>
    </mes:Header>
    <mes:Structures>
        <str:Codelists>
            <str:Codelist version="1.0" isExternalReference="false" isFinal="false" validFrom="2021-01-24T08:00:00" validTo="2021-09-24T08:00:00" agencyID="TEST" id="CL_NAICS" urn="urn:sdmx:org.sdmx.infomodel.codelist.Codelist=TEST:CL_NAICS(1.0)">
                <com:Annotations>
                    <com:Annotation id="status">
                        <com:AnnotationText xml:lang="en">Released</com:AnnotationText>
                    </com:Annotation>
                    <com:Annotation id="audience">
                        <com:AnnotationText xml:lang="en">Standardized</com:AnnotationText>
                    </com:Annotation>
                    <com:Annotation id="abbreviation">
                        <com:AnnotationText xml:lang="en">NAICS</com:AnnotationText>
                    </com:Annotation>
                </com:Annotations>
                <com:Name xml:lang="en">North American Product Classification System</com:Name>
                <str:Code id="1">
                    <com:Name xml:lang="en">Test code</com:Name>
                    <com:Description xml:lang="en">A sample code.</com:Description>
                </str:Code>
            </str:Codelist>
        </str:Codelists>
    </mes:Structures>
</mes:Structure>

I was expecting all of these attributes to contain values as defined in the message.
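To make the expected values concrete, here is a minimal sketch that reads the codelist-level metadata straight from the XML using only the standard library, independent of sdmx. The helper name `expected_codelist_metadata` and the trimmed `SAMPLE` string are illustrative assumptions, not part of the sdmx package; the sample is a shortened copy of the message above.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the sample message.
NS = {
    "com": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common",
    "str": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure",
    "mes": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message",
}

# Trimmed copy of the specimen (hypothetical stand-in for cl_sample.xml).
SAMPLE = """
<mes:Structure xmlns:com="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common"
    xmlns:str="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure"
    xmlns:mes="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message">
  <mes:Structures>
    <str:Codelists>
      <str:Codelist version="1.0" validFrom="2021-01-24T08:00:00"
          validTo="2021-09-24T08:00:00" agencyID="TEST" id="CL_NAICS">
        <com:Annotations>
          <com:Annotation id="status">
            <com:AnnotationText xml:lang="en">Released</com:AnnotationText>
          </com:Annotation>
          <com:Annotation id="audience">
            <com:AnnotationText xml:lang="en">Standardized</com:AnnotationText>
          </com:Annotation>
          <com:Annotation id="abbreviation">
            <com:AnnotationText xml:lang="en">NAICS</com:AnnotationText>
          </com:Annotation>
        </com:Annotations>
        <com:Name xml:lang="en">North American Product Classification System</com:Name>
        <str:Code id="1">
          <com:Name xml:lang="en">Test code</com:Name>
        </str:Code>
      </str:Codelist>
    </str:Codelists>
  </mes:Structures>
</mes:Structure>
"""


def expected_codelist_metadata(root):
    """Collect validFrom, validTo, and codelist-level annotations from the XML."""
    cl = root.find(".//str:Codelist", NS)
    # Only direct children of the Codelist element, not annotations on codes.
    annotations = {
        a.get("id"): a.findtext("com:AnnotationText", namespaces=NS)
        for a in cl.findall("com:Annotations/com:Annotation", NS)
    }
    return {
        "valid_from": cl.get("validFrom"),
        "valid_to": cl.get("validTo"),
        "annotations": annotations,
    }


expected = expected_codelist_metadata(ET.fromstring(SAMPLE))
print(expected)
```

For the full file on disk, `ET.parse("cl_sample.xml").getroot()` could be passed instead of the parsed string. These are the values I would expect to find on `cl.valid_from`, `cl.valid_to`, and `cl.annotations`.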

khaeru commented 9 months ago

Thanks for reporting this, especially for producing a specimen.

I would guess 2 different causes:

I'll add a test and fix.

goatsweater commented 9 months ago

I never thought to check the first code. I just ran a full codelist and checked the first code for the top-level annotations, and you are correct: the annotations are ending up on the first code in the codelist.
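For anyone who wants to repeat that check, here is a rough sketch. The helper `locate_annotations` is hypothetical; it assumes only that the codelist exposes `.annotations` (a list of objects with an `id`) and `.items` (a mapping of code ID to code), which is how I understand sdmx codelists to behave. The objects at the bottom are plain stand-ins built to mimic the misplaced annotations, not real sdmx classes.

```python
from types import SimpleNamespace


def locate_annotations(codelist):
    """Report annotation IDs found on the codelist itself and on its first code."""
    own = [a.id for a in codelist.annotations]
    codes = list(codelist.items.values())
    first = [a.id for a in codes[0].annotations] if codes else []
    return {"codelist": own, "first_code": first}


# Stand-in objects imitating the behaviour described in this issue:
# the codelist has no annotations, and the first code carries all three.
misplaced = [SimpleNamespace(id=i) for i in ("status", "audience", "abbreviation")]
first_code = SimpleNamespace(id="1", annotations=misplaced)
fake_codelist = SimpleNamespace(annotations=[], items={"1": first_code})

report = locate_annotations(fake_codelist)
print(report)
```

With a real message, the same function could be called as `locate_annotations(codelist_msg.codelist[0])`. An empty `"codelist"` entry alongside a populated `"first_code"` entry would match what I saw.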

khaeru commented 8 months ago

v2.12.0 contains the fix. Thanks again, and please report any other bugs you may discover.