khaeru / sdmx

SDMX information model and client in Python
https://sdmx1.readthedocs.io
Apache License 2.0

Validate SDMX-ML v2.1 and v3.0 messages #154

Closed goatsweater closed 5 months ago

goatsweater commented 5 months ago

Add support for validating SDMX-ML messages via sdmx.validate_xml() (Closes https://github.com/khaeru/sdmx/issues/51).

This replaces #153.

PR checklist

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (f137e51) 98.61% compared to head (b3a0921) 96.42%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #154      +/-   ##
==========================================
- Coverage   98.61%   96.42%   -2.19%
==========================================
  Files          87       87
  Lines        6916     7052     +136
==========================================
- Hits         6820     6800      -20
- Misses         96      252     +156
```

| [Files](https://app.codecov.io/gh/khaeru/sdmx/pull/154?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto) | Coverage Δ | |
|---|---|---|
| [sdmx/\_\_init\_\_.py](https://app.codecov.io/gh/khaeru/sdmx/pull/154?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9fX2luaXRfXy5weQ==) | `100.00% <100.00%> (ø)` | |
| [sdmx/format/xml/common.py](https://app.codecov.io/gh/khaeru/sdmx/pull/154?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC9mb3JtYXQveG1sL2NvbW1vbi5weQ==) | `100.00% <100.00%> (ø)` | |
| [sdmx/tests/format/test\_format\_xml.py](https://app.codecov.io/gh/khaeru/sdmx/pull/154?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC90ZXN0cy9mb3JtYXQvdGVzdF9mb3JtYXRfeG1sLnB5) | `100.00% <100.00%> (ø)` | |
| [sdmx/writer/xml.py](https://app.codecov.io/gh/khaeru/sdmx/pull/154?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto#diff-c2RteC93cml0ZXIveG1sLnB5) | `97.68% <100.00%> (+0.01%)` | :arrow_up: |

... and [18 files with indirect coverage changes](https://app.codecov.io/gh/khaeru/sdmx/pull/154/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Paul+Natsuo+Kishimoto)

goatsweater commented 5 months ago

@khaeru I could use your help with generating XML for an unsupported message type. I have a test for an invalid message - I need to generate something that is not a message and I'm actually not sure how to do that. I tried just creating a DataSet and sending it through to_xml(), but that throws an exception and I'm not entirely sure why. I tried to copy/reuse code from test_dataset_bare.py.

Just for reference here is what I tried (inside a test function):

```python
# Imports at the top of the test module
import sdmx
from sdmx.model import v21

# Inside the test function; tmp_path is the pytest fixture
msg_path = tmp_path / "invalid.xml"

# Build a bare data set with two observations
ds = v21.DataSet()
key = v21.Key(
    FREQ="D",
    CURRENCY="NZD",
    CURRENCY_DENOM="EUR",
    EXR_TYPE="SP00",
    EXR_SUFFIX="A",
    TIME_PERIOD="2013-01-18",
)
obs_status = v21.DataAttribute(id="OBS_STATUS")
attr = {"OBS_STATUS": v21.AttributeValue(value_for=obs_status, value="A")}
ds.obs.append(v21.Observation(dimension=key, value=1.5931, attached_attribute=attr))
key = key.copy(TIME_PERIOD="2013-01-21")
ds.obs.append(v21.Observation(dimension=key, value=1.5925, attached_attribute=attr))

# This is the call that raises
msg_path.write_bytes(sdmx.to_xml(ds))
```

It raises this exception:

```
sdmx/writer/xml.py:50: in to_xml
    return etree.tostring(writer.recurse(obj), **kwargs)
sdmx/writer/base.py:53: in recurse
    return dispatcher(obj, *args, **kwargs)
/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/functools.py:889: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
sdmx/writer/xml.py:717: in _ds
    elem.append(writer.recurse(obs, struct_spec=struct_spec))
sdmx/writer/base.py:53: in recurse
    return dispatcher(obj, *args, **kwargs)
/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/functools.py:889: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
sdmx/writer/xml.py:667: in _obs
    elem.append(_kv("gen:ObsKey", obj.dimension))
```

khaeru commented 5 months ago

There are multiple ways to approach that:
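
As an illustration only, and not necessarily one of the approaches meant above: a test can skip `to_xml()` entirely and write a well-formed XML file whose root element is not an SDMX-ML message. The sketch below uses only the standard library; the helper name `write_non_message_xml` and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_non_message_xml(path: Path) -> Path:
    """Write well-formed XML whose root is not an SDMX-ML message element."""
    # Hypothetical element names; nothing here uses an SDMX namespace
    root = ET.Element("NotAMessage")
    ET.SubElement(root, "Child").text = "not SDMX"
    # Serialize to bytes and write them, mirroring msg_path.write_bytes(...)
    path.write_bytes(ET.tostring(root, encoding="utf-8", xml_declaration=True))
    return path
```

In a test, `write_non_message_xml(tmp_path / "invalid.xml")` would give a file to hand to the validation function. How the rejection is reported depends on the implementation in this PR.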

khaeru commented 5 months ago

I see here that the first run of the "pytest" workflow fails on main with:

requests.exceptions.MissingSchema: Invalid URL 'None': No scheme supplied. Perhaps you meant https://None?

This occurs for only one of the jobs. I think I also saw it at one point during work on the PR, and the job succeeded when re-run. So these tests can be "flaky": they rely on a certain expected response from the GitHub API, and if that isn't given for any reason (e.g. a momentary network interruption), then this line tries to use zipball_url=None: https://github.com/khaeru/sdmx/blob/e76b5201e917e5c9e46f58a56d3472c7d2a546a5/sdmx/format/xml/common.py#L207-L208
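
The failure mode described above can be sketched as follows. This is a minimal illustration, not the code in `sdmx/format/xml/common.py`: the function name `get_zipball_url` and the shape of the `release` mapping are assumptions, and only the key name `zipball_url` comes from the discussion.

```python
from typing import Any, Mapping


def get_zipball_url(release: Mapping[str, Any]) -> str:
    """Return the 'zipball_url' entry of a release description, or fail clearly."""
    url = release.get("zipball_url")
    if not url:
        # Without this check, None would be handed to the HTTP client, which
        # then reports the less helpful "Invalid URL 'None'" error
        raise RuntimeError("Response did not include a 'zipball_url'")
    return url
```

A check like this does not make the test less flaky, but it turns the confusing `MissingSchema` error into one that names the actual cause.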

I will mitigate this by applying a pytest mark to re-run these tests as necessary until a correct response is received.
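
The re-run idea can be illustrated without any pytest plugin. The following is a plain-Python sketch of the same retry behaviour, not the pytest mark referred to above; the decorator name `rerun` and its parameters are hypothetical.

```python
import functools
import time
from typing import Callable, Tuple, Type


def rerun(
    times: int = 3,
    delay: float = 0.0,
    on: Tuple[Type[BaseException], ...] = (Exception,),
) -> Callable:
    """Call the wrapped function up to `times` times, re-raising the last error."""
    if times < 1:
        raise ValueError("times must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except on:
                    if attempt == times:
                        raise  # Out of attempts: surface the failure
                    time.sleep(delay)  # Wait before the next attempt

        return wrapper

    return decorator
```

Restricting `on` to the specific exception seen in the failing job would keep genuine test failures from being retried and masked.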