khaeru / sdmx

SDMX information model and client in Python
https://sdmx1.readthedocs.io
Apache License 2.0

Implement SDMX-REST API v2.1.0 (~SDMX 3.0) #158

Closed khaeru closed 7 months ago

khaeru commented 8 months ago

This will allow queries against data sources that support only the SDMX 3.0.0 API, or both the 2.1 and 3.0.0 APIs.
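As a rough illustration of what differs between the two API generations, the sketch below builds a structure query URL in each style. It is not the `sdmx` package's API: the function name `structure_url` and the `example.org` base URL are hypothetical, and the path layouts are assumptions based on the published SDMX-REST specifications (resource-first paths with a `latest` keyword in the 2.1-era API, a `structure/` prefix and the `+` version wildcard in the 3.0-era API).

```python
from urllib.parse import quote


def structure_url(base, resource, agency, resource_id, version=None, api="3.0"):
    """Build a structure query URL in the style of one API generation.

    Hypothetical helper for illustration only; not part of the sdmx package.
    """
    if api == "3.0":
        # Assumed 3.0-era layout: .../structure/{type}/{agency}/{id}/{version}
        parts = ["structure", resource, agency, resource_id, version or "+"]
    else:
        # Assumed 2.1-era layout: .../{type}/{agency}/{id}/{version}
        parts = [resource, agency, resource_id, version or "latest"]
    # Keep "+" literal; percent-encode anything else that is unsafe in a path.
    path = "/".join(quote(part, safe="+") for part in parts)
    return base.rstrip("/") + "/" + path


new_style = structure_url("https://example.org/sdmx/3.0", "codelist", "ESTAT", "FREQ")
old_style = structure_url(
    "https://example.org/sdmx/2.1", "codelist", "ESTAT", "FREQ", api="2.1"
)
print(new_style)
print(old_style)
```

A client that supports both generations needs to know, per data source, which of these layouts to emit.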

Partly addresses #87.

Housekeeping:

PR checklist

codecov[bot] commented 8 months ago

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (31a7ba1) 98.73% compared to head (58913f7) 97.81%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #158      +/-   ##
==========================================
- Coverage   98.73%   97.81%   -0.93%
==========================================
  Files          90       94       +4
  Lines        7128     7415     +287
==========================================
+ Hits         7038     7253     +215
- Misses         90      162      +72
```

| [Files](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto) | Coverage Δ | |
|---|---|---|
| [sdmx/format/xml/common.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9mb3JtYXQveG1sL2NvbW1vbi5weQ==) | `100.00% <ø> (ø)` | |
| [sdmx/reader/xml/v21.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9yZWFkZXIveG1sL3YyMS5weQ==) | `99.16% <100.00%> (+0.11%)` | :arrow_up: |
| [sdmx/reader/xml/v30.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9yZWFkZXIveG1sL3YzMC5weQ==) | `95.19% <100.00%> (-0.05%)` | :arrow_down: |
| [sdmx/rest/\_\_init\_\_.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9yZXN0L19faW5pdF9fLnB5) | `100.00% <100.00%> (ø)` | |
| [sdmx/rest/common.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9yZXN0L2NvbW1vbi5weQ==) | `100.00% <100.00%> (ø)` | |
| [sdmx/rest/v21.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9yZXN0L3YyMS5weQ==) | `100.00% <100.00%> (ø)` | |
| [sdmx/source/\_\_init\_\_.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9zb3VyY2UvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` | |
| [sdmx/source/estat.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9zb3VyY2UvZXN0YXQucHk=) | `93.44% <100.00%> (ø)` | |
| [sdmx/testing/\_\_init\_\_.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC90ZXN0aW5nL19faW5pdF9fLnB5) | `99.34% <100.00%> (+0.06%)` | :arrow_up: |
| [sdmx/tests/reader/test\_reader\_xml.py](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC90ZXN0cy9yZWFkZXIvdGVzdF9yZWFkZXJfeG1sLnB5) | `100.00% <100.00%> (ø)` | |
| ... and [12 more](https://app.codecov.io/gh/khaeru/sdmx/pull/158?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto) | | |

... and [12 files with indirect coverage changes](https://app.codecov.io/gh/khaeru/sdmx/pull/158/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto)

khaeru commented 7 months ago

As part of this, I noticed that ESTAT claims to provide a 3.0 REST entry-point and filed the following ticket with their support system. (This is unfortunately closed and opaque, so I cannot link to it; I copy it here instead.)

I am trying to retrieve and process data in SDMX-ML 3.0 format from the API documented at https://wikis.ec.europa.eu/display/EUROSTATHELP/API+-+Getting+started+with+SDMX3.0+API

I use one of the example queries, as follows:

curl "https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/structure/codelist/ESTAT/FREQ/+?compress=false" -o response.xml

…this results in a local file response.xml. I show a portion of the file here:

<s:Code id="A" urn="urn:sdmx:org.sdmx.infomodel.codelist.Code=ESTAT:FREQ(3.2).A">
  <c:Annotations>Y</c:Annotations>
  <c:AnnotationType>IS_STANDARD_CODE</c:AnnotationType>
  <c:AnnotationText xml:lang="en">Standard code, follows Eurostat Standard Code List Guidelines</c:AnnotationText>
  <c:Name xml:lang="en">Annual</c:Name>
  <c:Name xml:lang="de">Jährlich</c:Name>
  <c:Name xml:lang="fr">Annuel</c:Name>
</s:Code>

This appears to be invalid SDMX-ML. Among other problems:

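  • The `<c:Annotations>` tag should contain only zero or more `<c:Annotation>` tags; here it contains none.
  • The `<c:Annotations>` tag should not contain text; here it contains the text "Y".
  • The `<c:AnnotationType>` and `<c:AnnotationText>` tags should appear only inside `<c:Annotation>`; here they appear directly inside `<s:Code>`.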
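
The problems listed above can be checked mechanically. The following is a minimal sketch using only the Python standard library; it is not part of the original ticket or of the `sdmx` package. The function names and the `urn:example:…` namespace URIs are placeholders that only let the abbreviated fragments parse, and the checks compare local tag names rather than validating against the SDMX-ML schemas.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace declarations (assumption): real SDMX-ML messages
# declare their own URIs for the "s" and "c" prefixes.
NS = 'xmlns:s="urn:example:structure" xmlns:c="urn:example:common"'

MALFORMED = f"""
<s:Code {NS} id="A">
  <c:Annotations>Y</c:Annotations>
  <c:AnnotationType>IS_STANDARD_CODE</c:AnnotationType>
  <c:AnnotationText xml:lang="en">Standard code</c:AnnotationText>
  <c:Name xml:lang="en">Annual</c:Name>
</s:Code>
"""

WELL_FORMED = f"""
<s:Codelist {NS} id="FREQ">
  <c:Annotations>
    <c:Annotation>
      <c:AnnotationTitle>Y</c:AnnotationTitle>
      <c:AnnotationType>IS_STANDARD_CODE_LIST</c:AnnotationType>
    </c:Annotation>
  </c:Annotations>
</s:Codelist>
"""


def local_name(elem):
    """Return the tag name without its '{namespace}' prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def annotation_problems(xml_text):
    """List structural problems with annotations in an XML fragment."""
    root = ET.fromstring(xml_text.strip())
    # ElementTree elements have no parent pointer, so build a lookup.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    problems = []
    for elem in root.iter():
        name = local_name(elem)
        if name == "Annotations":
            if (elem.text or "").strip():
                problems.append("Annotations contains text")
            if any(local_name(child) != "Annotation" for child in elem):
                problems.append("Annotations has a non-Annotation child")
        elif name in ("AnnotationTitle", "AnnotationType", "AnnotationText"):
            parent = parent_of.get(elem)
            if parent is None or local_name(parent) != "Annotation":
                problems.append(f"{name} outside Annotation")
    return problems


print(annotation_problems(MALFORMED))
print(annotation_problems(WELL_FORMED))
```

Run against the first fragment, this reports the stray text and the two misplaced tags; run against a fragment shaped like a correctly nested annotation block, it reports nothing.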

Earlier in the same message there are some correctly formatted annotations (attached to the Codelist object, rather than the Code object):

<s:Codelist agencyID="ESTAT" id="FREQ" urn="urn:sdmx:org.sdmx.infomodel.codelist.Codelist=ESTAT:FREQ(3.2)" version="3.2">
  <c:Annotations>
    <c:Annotation>
      <c:AnnotationTitle>2023-10-12T23:00:00+0200</c:AnnotationTitle>
      <c:AnnotationType>LAST_UPDATED</c:AnnotationType>
    </c:Annotation>
    <c:Annotation>
      <c:AnnotationTitle>Y</c:AnnotationTitle>
      <c:AnnotationType>IS_STANDARD_CODE_LIST</c:AnnotationType>
      <c:AnnotationText xml:lang="en">Standard code list</c:AnnotationText>
    </c:Annotation>
     …
  </c:Annotations>
  …
</s:Codelist>

…so it appears the malformed XML is not everywhere; only in some places. My specific questions are:

  1. Is this issue already known?
  2. Where are currently known issues of the ESTAT SDMX 3.0 REST web service, including this or any others, documented?
  3. Which content or queries are affected (or unaffected) by this issue?
  4. Is there a rough time frame when it will be fixed?

The only reply I received, after 3 weeks, was:

The team have responded to state that the issue has now been resolved.

It appears per (2) that there is no public quirks document (again, unfortunate), but the resulting SDMX-ML does now appear valid, so it can be used for testing this PR.

khaeru commented 7 months ago

The codecov/patch check failure here is expected and acceptable: some modified lines are in test_sources, but these tests are not run for ordinary PRs, only on 'push' and 'schedule' triggers.