dr-leo / pandaSDMX

Python interface to SDMX
Apache License 2.0

DSD descriptors empty #236

Open maxkoe opened 2 years ago

maxkoe commented 2 years ago

Hello,

I am trying to follow the walkthrough documentation, but the DSD descriptors come out unexpectedly empty:

import pandasdmx as sdmx

ecb = sdmx.Request('ECB')
ecb_flow = ecb.dataflow()
#display(ecb_flow.dataflow)
_dataflows = sdmx.to_pandas(ecb_flow.dataflow)
display(_dataflows[_dataflows.str.contains('exchange', case=False)])

exr = ecb_flow.dataflow.EXR
display(exr)
dsd = exr.structure
display(dsd.__dict__)
display(dsd.dimensions.__dict__)

produces the following output:

EXR                              Exchange Rates
FXI                 Foreign Exchange Statistics
SEE    Securities exchange - Trading Statistics
dtype: object
<DataflowDefinition ECB:EXR(1.0): Exchange Rates>
{'annotations': [],
 'id': 'ECB_EXR1',
 'uri': None,
 'urn': 'urn:sdmx:org.sdmx.infomodel.datastructure.DataStructureDefinition=ECB:ECB_EXR1(1.0)',
 'urn_group': {},
 'name': ,
 'description': ,
 'version': '1.0',
 'valid_from': None,
 'valid_to': None,
 'is_final': False,
 'is_external_reference': True,
 'service_url': None,
 'structure_url': None,
 'maintainer': <Agency ECB>,
 'grouping': None,
 'attributes': <AttributeDescriptor: >,
 'dimensions': <DimensionDescriptor: >,
 'measures': <MeasureDescriptor: >,
 'group_dimensions': {}}
{'annotations': [],
 'id': '',
 'uri': None,
 'urn': None,
 'urn_group': {},
 'components': [],
 'auto_order': 1}

In the last output the components are empty; I would expect to find something there.

I am working with pandaSDMX version 1.9.0.

dr-leo commented 2 years ago

This behavior is correct. If you want to see the full data structure definition, request a specific data flow or the data flow definition directly. This is described in the walk-through as well. Don’t hesitate to get back to me if this doesn’t work.
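
A quick way to tell the two cases apart might look like the sketch below. It relies only on the attributes visible in the output above (`dimensions.components`, and `is_external_reference`, which was `True` for the empty DSD). The helper name `looks_unresolved` and the `SimpleNamespace` stand-ins are hypothetical and not part of pandaSDMX; with the real library you would pass in `exr.structure` instead.

```python
from types import SimpleNamespace


def looks_unresolved(dsd):
    """Return True if the DSD carries no dimensions, i.e. it is probably
    just a reference and not the full definition."""
    return len(dsd.dimensions.components) == 0


# Stand-ins that mimic the attributes printed above (NOT real pandaSDMX objects).
stub_dsd = SimpleNamespace(
    id="ECB_EXR1",
    is_external_reference=True,
    dimensions=SimpleNamespace(components=[]),
)
full_dsd = SimpleNamespace(
    id="ECB_EXR1",
    is_external_reference=False,  # assumed value for a resolved DSD
    dimensions=SimpleNamespace(components=["FREQ", "CURRENCY"]),
)

print(looks_unresolved(stub_dsd))  # True
print(looks_unresolved(full_dsd))  # False
```

If the check returns `True`, requesting the specific data flow (or the data structure definition itself) should give the full metadata.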

maxkoe commented 2 years ago

Thank you for coming back to me. After going through the walkthrough line by line again, I arrived at the following code, which works.

import pandasdmx as sdmx

ecb = sdmx.Request('ECB')
ecb_flow = ecb.dataflow()

_dataflows = sdmx.to_pandas(ecb_flow.dataflow)
display(_dataflows[_dataflows.str.contains('exchange', case=False)])

exr = ecb.dataflow('EXR')
display(exr)
exr_flow = exr.dataflow.EXR
dsd = exr_flow.structure
display(dsd.__dict__)
display(dsd.dimensions.components)

To be honest, I do not understand the reasoning behind the order of operations at all. In particular the lines

exr = ecb.dataflow('EXR')
exr_flow = exr.dataflow.EXR
dsd = exr_flow.structure

are befuddling to me.

maxkoe commented 2 years ago

Using the BBK-API also does not seem to work in the same way:

import pandasdmx as sdmx

ecb = sdmx.Request('ECB')
exr = ecb.dataflow('EXR')
exr_flow = exr.dataflow.EXR
dsd = exr_flow.structure
display(dsd.dimensions.components)

bbk = sdmx.Request('BBK')
bbex3 = bbk.dataflow('BBEX3')
bbex3_flow = bbex3.dataflow.BBEX3
dsd = bbex3_flow.structure
display(dsd.dimensions.components)

produces the following unsatisfactory output:

[<Dimension FREQ>,
 <Dimension CURRENCY>,
 <Dimension CURRENCY_DENOM>,
 <Dimension EXR_TYPE>,
 <Dimension EXR_SUFFIX>,
 <TimeDimension TIME_PERIOD>]
[]

maxkoe commented 2 years ago

OK, for Bundesbank, the correct modus operandi is:

import pandasdmx as sdmx

bbk = sdmx.Request('BBK')
bbk_erx = bbk.datastructure('BBK_ERX')
dsd = bbk_erx.structure.BBK_ERX
display(dsd.dimensions.components)

This works for now. The result is:

[<Dimension BBK_STD_FREQ>,
 <Dimension BBK_STD_CURRENCY>,
 <Dimension BBK_ERX_PARTNER_CURRENCY>,
 <Dimension BBK_ERX_SERIES_TYPE>,
 <Dimension BBK_ERX_RATE_TYPE>,
 <Dimension BBK_ERX_SUFFIX>,
 <TimeDimension TIME_PERIOD>]

dr-leo commented 2 years ago

Briefly, on your last three lines of code: the first line requests a structure message from the ECB web service. It simply makes an HTTP request and, if all goes well, you receive an XML file, which is then turned into a structure message object. The second line gives you the data flow object. You can see this quickly by inspecting the attributes of the structure message and the keys of its dataflow attribute. Finally, the third line gives you the data structure definition referenced by the data flow.

Note that the data flow object references the full data structure definition through its structure attribute. The latter, in turn, references all the dimensions, attributes and a number of other things such as code lists. As you saw before when inspecting the structure message attributes, the code lists are contained in the message as well. So this is the complete structural metadata of the exchange rate data flow.

I think you may want to read more of the documentation than just the walkthrough. The SDMX primer contained in the documentation may be a good starting point for understanding the relationship between data flow and data structure: how a data flow references a data structure definition, and that a data structure definition object may be either an unresolved reference or contain the complete metadata.
