dr-leo / pandaSDMX

Python interface to SDMX
Apache License 2.0

Walkthrough not working with oecd #186

Closed Didou09 closed 4 years ago

Didou09 commented 4 years ago

Hi,

First, thanks for this very good idea of a library and for making it accessible.

I'm just starting with it, and it seems the walkthrough example you provide in the docs does not work for the data source I am interested in: OECD.

In [1]:  import pandasdmx as sdmx
In [2]:  oecd = sdmx.Request('OECD')

When I try to get the data flows definitions (because I don't know them), I get a NotImplementedError:

In [3]:  flow_msg = oecd.dataflow('all')                                                                                                   
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-17-59bac2cb21cf> in <module>
----> 1 flow_msg = oecd.dataflow('all')

~/anaconda3/lib/python3.7/site-packages/pandasdmx/api.py in get(self, resource_type, resource_id, tofile, use_cache, dry_run, **kwargs)
    394         else:
    395             kwargs.update(dict(resource_type=resource_type, resource_id=resource_id))
--> 396             req = self._request_from_args(kwargs)
    397 
    398         req = self.session.prepare_request(req)

~/anaconda3/lib/python3.7/site-packages/pandasdmx/api.py in _request_from_args(self, kwargs)
    193         if not (force or self.source.supports[resource_type]):
    194             raise NotImplementedError(
--> 195                 f"{self.source.id} does not support the"
    196                 f"{resource_type!r} API endpoint; "
    197                 "override using force=True"

NotImplementedError: OECD does not support the<Resource.dataflow: 'dataflow'> API endpoint; override using force=True

Does this have to do with the SDMX-JSON format that the OECD may be using? Is there any way to work around this? (I don't know the data identifiers yet; I wanted to explore the data from my IPython console.)

dr-leo commented 4 years ago

Hi, you are right. The OECD API only supports data requests. I'm afraid you will have to retrieve the dataflow IDs manually from the OECD's website.
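
For readers wondering what the error means: the traceback above shows the library checking `self.source.supports[resource_type]` before it builds a request. Below is a minimal standalone sketch of that guard. The `Source` dataclass, the two-member `Resource` enum, and the `check_supported` function are simplified stand-ins written for illustration; they are not pandaSDMX code, and the `supports` values are assumed from the error message.

```python
from dataclasses import dataclass, field
from enum import Enum


class Resource(Enum):
    """Stand-in for the library's resource enum; only two members shown."""
    data = "data"
    dataflow = "dataflow"


@dataclass
class Source:
    """Hypothetical, simplified description of a data source."""
    id: str
    supports: dict = field(default_factory=dict)


def check_supported(source, resource_type, force=False):
    """Raise NotImplementedError unless the endpoint is supported or forced."""
    if not (force or source.supports.get(resource_type, False)):
        raise NotImplementedError(
            f"{source.id} does not support the "
            f"{resource_type!r} API endpoint; override using force=True"
        )
    return True


# Assumed configuration: data queries allowed, dataflow queries not.
oecd = Source("OECD", {Resource.data: True, Resource.dataflow: False})
```

In this sketch `force=True` only skips the local check. It does not change what the remote service is able to answer, so a forced request may still fail on the server side.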

dannyvw-on-github commented 4 years ago

I am still learning. While looking up how to retrieve data and metadata from the OECD, I found this web page: https://data.oecd.org/api/sdmx-ml-documentation/. While trying things out, I also stumbled on https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/MEI, which appears to contain code lists. I was wondering how I could retrieve and use this automatically with pandasdmx. In the pandasdmx docs I read: "A key difference is between sources offering SDMX-ML and SDMX-JSON APIs. SDMX-JSON APIs do not support metadata, or structure queries; only data queries". The pages above suggest to me that we can now use the SDMX-ML protocol with the OECD and hence retrieve at least some metadata (whatever is possible) via SDMX-ML.
So my question is: how far can we go in making it easy to retrieve data, and to some extent metadata, from the OECD with pandasdmx via SDMX-ML? Could editing the OECD entry in sources.json help in some way, for example?
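
As an illustration of what reading that structure URL outside pandasdmx could look like, here is a rough standard-library sketch. It assumes the response is an XML structure message in which code lists appear as `CodeList` elements with an `id` attribute, each holding `Code` children with a `value` (or `id`) attribute and a `Description` (or `Name`) child. That layout is an assumption and has not been checked against the live endpoint. The function names are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

# URL quoted in the comment above.
MEI_STRUCTURE_URL = "https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/MEI"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_code_lists(xml_text):
    """Return {code_list_id: {code: label}} from a structure message.

    Assumes CodeList/Code/Description element names (see the note above).
    """
    root = ET.fromstring(xml_text)
    result = {}
    for elem in root.iter():
        # Case-insensitive match so both 'CodeList' and 'Codelist' are found.
        if local_name(elem.tag).lower() != "codelist":
            continue
        codes = {}
        for child in elem:
            if local_name(child.tag) != "Code":
                continue
            code = child.get("value") or child.get("id")
            # Use the first Description or Name child as the label, if any.
            label = next(
                (c.text or "" for c in child
                 if local_name(c.tag) in ("Description", "Name")),
                "",
            )
            codes[code] = label.strip()
        result[elem.get("id")] = codes
    return result


def fetch_code_lists(url=MEI_STRUCTURE_URL, timeout=30):
    """Download a structure message and parse its code lists (needs network)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_code_lists(response.read())
```

Calling `fetch_code_lists()` would return a plain dictionary, which could then be turned into a pandas object by hand. This bypasses pandasdmx's information model entirely, so it is a workaround for browsing codes rather than a replacement for proper structure-message support.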

dr-leo commented 4 years ago

Hi, thank you for getting in touch. It would certainly be great if the OECD supported SDMX-ML 2.1 metadata queries. However, a quick look at the OECD documentation at the link you sent suggests that it describes an SDMX 1.0 compliant API. It might also be outdated. Prove me wrong! That said, the JSON implementation of SDMX 2.1 has been extended to cover structure messages. So there is a chance that future versions of the OECD API will support JSON-based structure messages as well, if it doesn't do so already on an experimental basis; this remains to be checked. And hopefully pandaSDMX will support JSON structure messages as well someday.

Note that JSON-based data messages do contain a lot of information on the underlying data structure, such as code descriptions. I think pandaSDMX exposes these as Codelists in the DataMessage object as well. That said, there is no way to avoid using the OECD's website to retrieve dataflows and the complete structural metadata. Once you have found a dataflow of interest, you can use pandaSDMX to query suitable datasets. Sorry for this somewhat unsatisfactory answer. Further comments welcome.

Thanks again for your question, and best regards, Leo
