dr-leo / intake_sdmx

Intake plugin based on pandaSDMX
Apache License 2.0

List of SDMX sources incomplete? #7

jfix closed this issue 2 years ago

jfix commented 2 years ago

I have just started playing around with intake_sdmx and pandasdmx, so please forgive me if I'm getting things wrong.

I expected the list of data sources to be the same for both libraries (but that's maybe a wrong assumption!). Following the steps in the user guide for intake_sdmx, I get this list:

['ABS_XML',
 'Australian Bureau of Statistics - XML',
 'BBK',
 'Deutsche Bundesbank (German Central Bank)',
 'BIS',
 'Bank for International Settlements',
 'ECB',
 'European Central Bank',
 'ESTAT',
 'Eurostat',
 'ILO',
 'International Labor Organization',
 'IMF',
 'International Monetary Fund',
 'INEGI',
 'Instituto Nacional de Estadística y Geografía (MX)',
 'INSEE',
 'Institut national de la statistique et des études économiques (FR)',
 'ISTAT',
 'Instituto Nationale di Statistica (IT)',
 'NB',
 'Norges Bank (NO)',
 'SGR',
 'SDMX Global Registry',
 'UNICEF',
 "UN International Children's Emergency Fund",
 'CD2030',
 'COUNTDOWN 2030',
 'SPC',
 'Pacific Data Hub',
 'UNSD',
 'United Nations Statistics Division',
 'WB',
 'World Bank World Integrated Trade Solution',
 'WB_WDI',
 'World Bank World Development Indicators',
 'LSD',
 'Statistics Lithuania']

When calling sdmx.list_sources() from pandasdmx, I get this result:

['ABS',
 'ABS_XML',
 'BBK',
 'BIS',
 'CD2030',
 'ECB',
 'ESTAT',
 'ILO',
 'IMF',
 'INEGI',
 'INSEE',
 'ISTAT',
 'LSD',
 'NB',
 'NBB',
 'OECD',
 'SGR',
 'SPC',
 'STAT_EE',
 'UNICEF',
 'UNSD',
 'WB',
 'WB_WDI']

The pandasdmx list of data sources seems to be more complete than the intake_sdmx one. Is this by design? In particular, I would like to use the OECD data source. Does that mean I should use only pandasdmx?
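To make the gap between the two listings explicit, here is a minimal sketch that diffs the outputs pasted above. It assumes the intake_sdmx listing alternates source IDs with display names, as it appears to in the output; the variable names are my own and not part of either library.

```python
# Listing as printed by intake_sdmx above: IDs alternate with display names.
intake_listing = [
    'ABS_XML', 'Australian Bureau of Statistics - XML',
    'BBK', 'Deutsche Bundesbank (German Central Bank)',
    'BIS', 'Bank for International Settlements',
    'ECB', 'European Central Bank',
    'ESTAT', 'Eurostat',
    'ILO', 'International Labor Organization',
    'IMF', 'International Monetary Fund',
    'INEGI', 'Instituto Nacional de Estadística y Geografía (MX)',
    'INSEE', 'Institut national de la statistique et des études économiques (FR)',
    'ISTAT', 'Instituto Nationale di Statistica (IT)',
    'NB', 'Norges Bank (NO)',
    'SGR', 'SDMX Global Registry',
    'UNICEF', "UN International Children's Emergency Fund",
    'CD2030', 'COUNTDOWN 2030',
    'SPC', 'Pacific Data Hub',
    'UNSD', 'United Nations Statistics Division',
    'WB', 'World Bank World Integrated Trade Solution',
    'WB_WDI', 'World Bank World Development Indicators',
    'LSD', 'Statistics Lithuania',
]

# IDs as returned by sdmx.list_sources() above.
pandasdmx_ids = [
    'ABS', 'ABS_XML', 'BBK', 'BIS', 'CD2030', 'ECB', 'ESTAT', 'ILO', 'IMF',
    'INEGI', 'INSEE', 'ISTAT', 'LSD', 'NB', 'NBB', 'OECD', 'SGR', 'SPC',
    'STAT_EE', 'UNICEF', 'UNSD', 'WB', 'WB_WDI',
]

# Assumption: every even-indexed entry is an ID, every odd-indexed one a name.
intake_ids = intake_listing[::2]

# Sources known to pandasdmx but absent from the intake_sdmx listing.
missing_from_intake = sorted(set(pandasdmx_ids) - set(intake_ids))
# Sources listed by intake_sdmx but unknown to pandasdmx.
extra_in_intake = sorted(set(intake_ids) - set(pandasdmx_ids))

print(missing_from_intake)  # ['ABS', 'NBB', 'OECD', 'STAT_EE']
print(extra_in_intake)      # []
```

So, for the two outputs above, the intake_sdmx listing is a strict subset of the pandasdmx one, with four sources missing.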

Thank you for your consideration.

dr-leo commented 2 years ago

Hi, thanks for your interest in this tiny project. OECD and some other data sources supported by pandaSDMX are missing here because they do not support the full set of structural metadata required to browse code lists etc. So it was a deliberate decision to exclude these agencies. You could prove me wrong by establishing that some of the excluded data sources do support these features. There is hope, though, that more agencies will support the full standard in the future, notably as SDMX 3.0 comprises a JSON-based implementation for structural metadata.

In reply to Jakob Fix's message of 26.05.2022, 17:56.

jfix commented 2 years ago

Thanks for your reply, @dr-leo. That makes sense.

Could that lack of structural metadata also be the reason why I have trouble reproducing the Walkthrough you provided for pandasdmx for the OECD data source? (I can ask the question in the pandasdmx project if you prefer.) The following seems to indicate that the dataflow() method is not implemented on OECD's side.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Input In [79], in <cell line: 2>()
      1 oecd = sdmx.Request('OECD')
----> 2 flow_msg = oecd.dataflow()
      3 flow_msg

dr-leo commented 2 years ago

That's right. The walkthrough example uses the ECB's API as it is richer than the OECD's. See the docs on data sources for details.