dr-leo / pandaSDMX

Python interface to SDMX
Apache License 2.0

Can't read from files #158

Closed Drac0666 closed 4 years ago

Drac0666 commented 4 years ago

I cannot open a locally saved XML file, and I can't even open it from the URL:

https://websvcgatewayx2.frbny.org/autorates_fedfunds_external/services/v1_0/fedfunds/xml/retrieve?typ=RATE&f=03012016&t=04032020

path =  'D:\\retrieve.xml'
import pandasdmx as sdmx
r = sdmx.Request()
b = r.get(fromfile=path)

`---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)
D:\ProgramData\Anaconda3\lib\site-packages\pandasdmx\reader\sdmxml.py in initialize(self, source)
     71                                                                   'references': None},
---> 72                                                               memcache=cache_id).datastructure[dsd_id]
     73                     except Exception:

D:\ProgramData\Anaconda3\lib\site-packages\pandasdmx\api.py in get(self, resource_type, resource_id, agency, version, key, params, headers, fromfile, tofile, url, get_footer_url, memcache, writer, dsd, series_keys)
    367                 raise ValueError(
--> 368                     'If `` url`` is not kurwa specified, either agency or fromfile must be given.')
    369 

ValueError: If `` url`` is not kurwa specified, either agency or fromfile must be given.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-11-3bb7d33dbba1> in <module>
      3 import pandasdmx as sdmx
      4 r = sdmx.Request()
----> 5 b = r.get(fromfile=path)

D:\ProgramData\Anaconda3\lib\site-packages\pandasdmx\api.py in get(self, resource_type, resource_id, agency, version, key, params, headers, fromfile, tofile, url, get_footer_url, memcache, writer, dsd, series_keys)
    401                 reader_module = import_module('pandasdmx.reader.sdmxml')
    402             reader_cls = reader_module.Reader
--> 403             msg = reader_cls(self, dsd).initialize(source)
    404         # Check for URL in a footer and get the real data if so configured
    405         if get_footer_url and hasattr(msg, 'footer'):

D:\ProgramData\Anaconda3\lib\site-packages\pandasdmx\reader\sdmxml.py in initialize(self, source)
     79                                                               params={
     80                                                                   'references': None},
---> 81                                                               memcache=cache_id).datastructure[dsd_id]
     82 
     83                 # extract dimension and attribute IDs from the DSD for later

D:\ProgramData\Anaconda3\lib\site-packages\pandasdmx\api.py in get(self, resource_type, resource_id, agency, version, key, params, headers, fromfile, tofile, url, get_footer_url, memcache, writer, dsd, series_keys)
    366             else:
    367                 raise ValueError(
--> 368                     'If `` url`` is not kurwa specified, either agency or fromfile must be given.')
    369 
    370         # Now get the SDMX message either via http or as local file

ValueError: If `` url`` is not kurwa specified, either agency or fromfile must be given.
`
khaeru commented 4 years ago

Hi @Drac0666, thanks for opening this issue. I didn't know the FRBNY provided SDMX data, so thanks for bringing that to our attention.

A few points of clarification:

Follow-up points for @dr-leo or me:

khaeru commented 4 years ago

On further investigation, this is "structure-specific" data.
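
One way to check this for a local file is to look at the name of the XML root element. The following is a minimal sketch using only the standard library, not pandaSDMX itself; it assumes SDMX-ML 2.1 naming, where structure-specific messages have a root element such as `StructureSpecificData` and generic ones `GenericData`. The helper names `sdmx_root_name` and `is_structure_specific` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def sdmx_root_name(path):
    """Return the local (namespace-free) name of the root element of an XML file."""
    with open(path, "rb") as f:
        # The first "start" event is always the root element, so the rest of
        # the file never has to be parsed.
        for _event, element in ET.iterparse(f, events=("start",)):
            # ElementTree tags look like "{namespace}LocalName".
            return element.tag.rpartition("}")[2]
    raise ValueError("no XML elements found in %r" % (path,))


def is_structure_specific(path):
    """True if the root element name starts with 'StructureSpecific'.

    Assumption: SDMX-ML 2.1 naming (StructureSpecificData,
    StructureSpecificTimeSeriesData) as opposed to GenericData.
    """
    return sdmx_root_name(path).startswith("StructureSpecific")
```

If `is_structure_specific("message.xml")` returns `True`, the message most likely cannot be interpreted without its data structure definition.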

The URL: https://www.newyorkfed.org/resources/sdmxml/schemas/V2_1/fundRateStructure.xml (also in the PDF link) returns an HTML page indicating the correct URL is now: https://apps.newyorkfed.org/~/media/XML/Schemas/fundRateStructure.xml

Get the structure and the data:

$ curl -O https://apps.newyorkfed.org/~/media/XML/Schemas/fundRateStructure.xml
$ curl -o message.xml "https://websvcgatewayx2.frbny.org/autorates_fedfunds_external/services/v1_0/fedfunds/xml/retrieve?typ=RATE&f=03012016&t=04032020"
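
If curl is not available, the same two downloads can be sketched with the Python standard library. This is only an illustration: the `download` helper and the two constant names are made up here, and the URLs are copied from the commands above.

```python
import shutil
import urllib.request

# URLs copied from the curl commands; the constant names are illustrative.
STRUCTURE_URL = "https://apps.newyorkfed.org/~/media/XML/Schemas/fundRateStructure.xml"
DATA_URL = (
    "https://websvcgatewayx2.frbny.org/autorates_fedfunds_external/services/"
    "v1_0/fedfunds/xml/retrieve?typ=RATE&f=03012016&t=04032020"
)


def download(url, filename):
    """Stream the response body for `url` into `filename`; return `filename`."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        # Copy in chunks so a large message is not held in memory at once.
        shutil.copyfileobj(response, out)
    return filename
```

Calling `download(STRUCTURE_URL, "fundRateStructure.xml")` and `download(DATA_URL, "message.xml")` should leave the same two files on disk as the curl commands.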

Then in Python:

>>> import pandasdmx as sdmx
>>> structure_msg = sdmx.read_sdmx('fundRateStructure.xml')
>>> structure_msg
<pandasdmx.StructureMessage>
  <Header>
    id: 'FUNDRATE_STRUCTURE'
    prepared: '2015-11-18'
    sender: 'FRBNY'
  Codelist (10): CL_CONF_STATUS CL_CURRENCY CL_DECIMALS CL_FREQ CL_FUND...
  ConceptScheme (2): CROSS_DOMAIN_CONCEPTS FRBNYB_CONCEPTS
  DataStructureDefinition (2): FRBNY_FUNDRATE_RATE_STRUCTURE FRBNY_FUND...
>>> dsd = structure_msg.structure['FRBNY_FUNDRATE_RATE_STRUCTURE']
>>> data_msg = sdmx.read_sdmx('message.xml', dsd=dsd)
>>> sdmx.to_pandas(data_msg.data[0])
FREQ  {http://www.w3.org/2001/XMLSchema-instance}type  FUNDRATE_OBS_POINT  FUNDRATE_TYPE  TIME_PERIOD
B     ns13:ObsType                                     1%                  EFFR           2016-03-01     0.34
                                                                                          2016-03-02     0.33
                                                                                          2016-03-03     0.34
                                                                                          2016-03-04     0.34
                                                                                          2016-03-07     0.34
                                                                                                         ... 
                                                       TARGET_LOW          EFFR           2020-03-27     0.00
                                                                                          2020-03-30     0.00
                                                                                          2020-03-31     0.00
                                                                                          2020-04-01     0.00
                                                                                          2020-04-02     0.00
Name: value, Length: 7203, dtype: float64

Something odd is still going on with that 'type' index level, but in the meantime you can drop it:

>>> sdmx.to_pandas(data_msg.data[0]) \
        .droplevel('{http://www.w3.org/2001/XMLSchema-instance}type')
FREQ  FUNDRATE_OBS_POINT  FUNDRATE_TYPE  TIME_PERIOD
B     1%                  EFFR           2016-03-01     0.34
                                         2016-03-02     0.33
                                         2016-03-03     0.34
                                         2016-03-04     0.34
                                         2016-03-07     0.34
                                                        ... 
      TARGET_LOW          EFFR           2020-03-27     0.00
                                         2020-03-30     0.00
                                         2020-03-31     0.00
                                         2020-04-01     0.00
                                         2020-04-02     0.00
Name: value, Length: 7203, dtype: float64
dr-leo commented 4 years ago

This seems to be stale. Please re-open if any further support is needed.

chian-007 commented 4 years ago

I tried to repeat the ten lines of code above, but got the following message at step 6: pandasdmx.reader.sdmxml - WARNING: Ambiguous: dsd=… argument for non–structure-specific message.

Thanks,

Qian

dr-leo commented 4 years ago

Thanks. The warning is a nuisance and a fall-out from merging code from khaeru's fork. It is erroneously raised when you request a dataset providing a dict-typed key. To construct the URL including the key string, pandaSDMX requests the DSD on the fly. I think it should then request a structure-specific dataset, rather than generic. You can safely ignore the warning. I hope to fix it shortly. Any PR would be welcome though.
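
Until that is fixed, the message can be hidden with the standard `logging` module. A minimal sketch, assuming the warning is emitted through a logger named `pandasdmx.reader.sdmxml` (inferred from the prefix of the message, not checked against the source); the filter class name is made up for illustration.

```python
import logging

# Assumption: logger name inferred from the "pandasdmx.reader.sdmxml - WARNING"
# prefix of the reported message.
SDMXML_LOGGER = "pandasdmx.reader.sdmxml"


class DropAmbiguousDsdWarning(logging.Filter):
    """Drop only the 'Ambiguous: dsd=' record; let everything else through."""

    def filter(self, record):
        return not record.getMessage().startswith("Ambiguous: dsd=")


# A filter attached to a logger applies to records logged on that logger.
logging.getLogger(SDMXML_LOGGER).addFilter(DropAmbiguousDsdWarning())
```

Filtering on the message text rather than raising the logger's level keeps other warnings from the reader visible.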
