Closed albertame closed 3 years ago
Would it be possible to make very=false as default, or adding an additional parameter for that?
Do you mean `verify=False`? This is already explicitly collected in `Client.get()` and passed on:
https://github.com/khaeru/sdmx/blob/589928c4238cb911796b3051b7e2262a5c0603db/sdmx/client.py#L402-L405
Have you tried it? What was the output?
P.S. I edited the issue description to add triple backticks (```) around your code snippet. This makes reading much easier. See the Markdown help linked from the little icon in the bottom-right of the text box: https://guides.github.com/features/mastering-markdown/
Yes, I meant `verify=False`. I have never used the `get()` function; the parameters to be passed are different from those of `data()`. But looking at the URL passed, I think for ECB, dataflow should be replaced by data.
Thank you also for the advice on the triple backticks.
> I have never used the get() function, the parameters to be passed are different than data().

`data(…)` is nothing more than a shortcut for `get(resource_type="data", …)`. Extra keyword arguments like `verify` should be handled the same way by each.
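As a rough illustration of that relationship, here is a toy stand-in (not the library's implementation; `ToyClient` and the dict it returns are made up for this sketch) in which `data()` simply forwards to `get()`, so an extra keyword argument such as `verify` arrives unchanged either way:

```python
class ToyClient:
    """Hypothetical stand-in for a client; it only records what it was asked for."""

    def get(self, resource_type, resource_id, **kwargs):
        # A real client would build and send an HTTP request here.
        return {"resource_type": resource_type, "resource_id": resource_id, **kwargs}

    def data(self, resource_id, **kwargs):
        # Shortcut for get(resource_type="data", ...); kwargs pass through untouched.
        return self.get("data", resource_id, **kwargs)


client = ToyClient()
via_shortcut = client.data("YC", verify=False)
via_get = client.get(resource_type="data", resource_id="YC", verify=False)
print(via_shortcut == via_get)
```

Both calls describe the same request, including `verify=False`.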
> But looking at the URL passed, I think for ECB dataflow should be replaced by data.

No, this is as expected. See the documentation for `get()`, around:

> …the key argument is validated against the relevant DataStructureDefinition, either given with the dsd keyword argument, or retrieved from the web service before the main query.

The failure you're seeing occurs during this step, before the “main query”.
ecb.get(resource_type = 'data',
        resource_id = 'YC',
        dsd = 'ECB_FMD2',
        key={'FREQ': 'B',
             'REF_AREA': 'U2',
             'CURRENCY': 'EUR',
             'PROVIDER_FM': '4F',
             'INSTRUMENT_FM': 'G_N_A',
             'PROVIDER_FM_ID': 'SV_C_YM',
             'DATA_TYPE_FM': 'BETA0+BETA1+BETA2+BETA3+TAU1+TAU2'},
        params = {'startPeriod': '2007-01-01'},
        verify = False)
Traceback (most recent call last):
  File "<ipython-input-86-c823d8e1c2d4>", line 1, in <module>
    ecb.get(resource_type = 'data',
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 411, in get
    req = self._request_from_args(kwargs)
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 233, in _request_from_args
    key, dsd = self._make_key(resource_type, resource_id, key, dsd)
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 157, in _make_key
    cc = dsd.make_constraint(key)
AttributeError: 'str' object has no attribute 'make_constraint'
ecb.get(resource_type = 'data',
        resource_id = 'YC',
        key={'FREQ': 'B',
             'REF_AREA': 'U2',
             'CURRENCY': 'EUR',
             'PROVIDER_FM': '4F',
             'INSTRUMENT_FM': 'G_N_A',
             'PROVIDER_FM_ID': 'SV_C_YM',
             'DATA_TYPE_FM': 'BETA0+BETA1+BETA2+BETA3+TAU1+TAU2'},
        params = {'startPeriod': '2007-01-01'},
        verify = False)
Traceback (most recent call last):
  File "<ipython-input-87-0be03fbe7e40>", line 1, in <module>
    ecb.get(resource_type = 'data',
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 411, in get
    req = self._request_from_args(kwargs)
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 233, in _request_from_args
    key, dsd = self._make_key(resource_type, resource_id, key, dsd)
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 141, in _make_key
    self.dataflow(
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 446, in get
    raise e from None
  File "C:\Users\al005366\AppData\Roaming\Python\Python39\site-packages\sdmx\client.py", line 443, in get
    response = self.session.send(req, **send_kwargs)
  File "c:\program files\python39\lib\site-packages\requests\sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "c:\program files\python39\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
SSLError: HTTPSConnectionPool(host='sdw-wsrest.ecb.europa.eu', port=443): Max retries exceeded with url: /service/dataflow/ECB/YC/latest?references=all (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)')))
OK. I have tried many different combinations and still cannot get the data from ECB. Could you please provide me with a functioning example?
Thanks for providing these examples. I think we're getting closer to diagnosing the issue.
`dsd = 'ECB_FMD2'` is incorrect. The documentation says that this argument should be a DataStructureDefinition object, not a string. However, this could be made more user-friendly, i.e. the code should raise an informative TypeError if you give an argument of the wrong type. I'll make a note to add this in the future!
In [3]: ecb.data(
   ...:     "YC",
   ...:     key={
   ...:         'FREQ': 'B',
   ...:         'REF_AREA': 'U2',
   ...:         'CURRENCY': 'EUR',
   ...:         'PROVIDER_FM': '4F',
   ...:         'INSTRUMENT_FM': 'G_N_A',
   ...:         'PROVIDER_FM_ID': 'SV_C_YM',
   ...:         'DATA_TYPE_FM': 'BETA0+BETA1+BETA2+BETA3+TAU1+TAU2'
   ...:     },
   ...:     params = {'startPeriod': '2007-01-01'}
   ...: )
Out[3]:
<sdmx.DataMessage>
<Header>
id: '9692e614-509b-4ce7-8e4f-92e3949239b5'
prepared: '2021-05-27T15:19:21.441000+02:00'
sender: <Agency ECB>
source:
test: False
response: <Response [200]>
DataSet (1)
dataflow: <DataflowDefinition (missing id)>
observation_dimension: <TimeDimension TIME_PERIOD>
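For the friendlier error mentioned above when `dsd` is given as a string, a check along these lines might do. This is only a sketch: `check_dsd` is a hypothetical helper, and the `DataStructureDefinition` class here is an empty stand-in so the snippet runs on its own; it is not what the package does today.

```python
class DataStructureDefinition:
    """Empty stand-in for the real class, so this sketch is self-contained."""


def check_dsd(dsd):
    """Hypothetical helper: reject a dsd argument of the wrong type early."""
    if dsd is not None and not isinstance(dsd, DataStructureDefinition):
        raise TypeError(
            f"dsd must be a DataStructureDefinition object, not {type(dsd).__name__}"
        )
    return dsd


try:
    check_dsd("ECB_FMD2")
except TypeError as exc:
    message = str(exc)
    print(message)
```

That would replace the `AttributeError: 'str' object has no attribute 'make_constraint'` seen earlier with a message that names the argument and the expected type.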
Can you say:
* the output of `pip show sdmx1`?
* what happens if you use `curl` on the command-line? If this fails, it may indicate a network error, or that the server (for some reason) is giving you and me different responses. That would be unrelated to the code of sdmx1 per se.
Name: sdmx1
Version: 2.4.1
Summary: Statistical Data and Metadata eXchange (SDMX)
Home-page: https://github.com/khaeru/sdmx
Author: SDMX Python developers
Author-email: mail@paul.kishimoto.name
License: UNKNOWN
Location: c:\users\al005366\appdata\roaming\python\python39\site-packages
Requires: lxml, pandas, pydantic, setuptools, python-dateutil, requests
Required-by:
* it downloads a file called **latest**
"It" meaning a browser, or curl
?
If a browser works, but the Python code (run at the same moment—did you try to re-run it just now?)¹ still does not, then this indicates that the way your browser connects to the server (sdw-wsrest.ecb.europa.eu) is somehow different than the way Python does.
That could have many causes, e.g. using a proxy which is properly configured in your browser, but you are not giving the same proxy settings to Python `requests` via `sdmx1`. Or (possibly) the "self-signed certificate" per the error message was directly installed in your browser and is recognized, whereas it's not recognized by `requests`.
Again, these would not be due to the code in this package, so I can only provide very limited help. I'd suggest maybe you look at the snippet at the top of the `requests` docs: https://docs.python-requests.org/en/master/ and try to run similar code, like:
import requests
r = requests.get("https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/YC/latest?references=all")
If this fails, then the problem is not in `sdmx1`.
¹ To emphasize: because a web server's status may change from moment to moment, then when we're comparing its response to 2+ different requests (browser vs. Python code) we have to run them at the same time. "Request A worked yesterday, Request B works today" is not enough to diagnose.
Thanks a lot for the help!
"It" means a browser.
Just to conclude, this works:
requests.get("https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/YC/latest?references=all", verify = False)
This does not work:
requests.get("https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/YC/latest?references=all")
Thanks for this confirmation. I've done some testing (details below the line), and the failure appears to be here:

> …the key argument is validated against the relevant DataStructureDefinition, either given with the dsd keyword argument, or retrieved from the web service before the main query.
>
> The failure you're seeing occurs during this step, before the “main query”.

Specifically verify=False is not being applied to this preliminary query, so that fails—even though the main query does get the right setting ("Case 0" below). So we've found a bug! Thanks for the help in diagnosing it. I'll fix when possible.
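To make the shape of the bug concrete, here is a toy model of it. It is not the actual `sdmx` code: `SketchClient`, `_send` and the `forward_kwargs` switch are invented for illustration, and the "fixed" branch is just one plausible way such a bug could be addressed.

```python
class SketchClient:
    """Hypothetical model of the bug; it records requests instead of sending them."""

    def __init__(self, forward_kwargs):
        self.forward_kwargs = forward_kwargs  # True models a possible fix
        self.sent = []  # one (resource_type, verify) entry per request

    def _send(self, resource_type, verify=True):
        # Record the setting this request would be sent with (default: verify=True).
        self.sent.append((resource_type, verify))

    def data(self, resource_id, key, verify=True):
        if isinstance(key, dict):
            # Preliminary structure query, needed to validate a dict key.
            if self.forward_kwargs:
                self._send("dataflow", verify=verify)
            else:
                self._send("dataflow")  # bug: the caller's verify is dropped
        self._send("data", verify=verify)  # main query


buggy = SketchClient(forward_kwargs=False)
buggy.data("YC", key={"FREQ": "B"}, verify=False)
print(buggy.sent)

fixed = SketchClient(forward_kwargs=True)
fixed.data("YC", key={"FREQ": "B"}, verify=False)
print(fixed.sent)
```

In the buggy variant the preliminary query goes out with `verify=True` while the main query gets `verify=False`, which mirrors the two `send_kwargs` lines printed for Case 0.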
In the meantime, one way to work around this ("Case 2" below):
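The snippet for this workaround is not shown at this point in the thread. A sketch reconstructed from the Case 2 parameters (string key, `verify=False`, `validate=False`) could look like the following; the helper names `make_str_key` and `fetch_without_validation` are made up here, and the string key is only correct if the dict lists the dimensions in the order the data structure defines them.

```python
def make_str_key(dict_key):
    """Join the dimension values with '.', in the order the dict lists them."""
    return ".".join(dict_key.values())


def fetch_without_validation(client, resource_id, dict_key, **kwargs):
    # validate=False skips key validation, so no preliminary structure
    # query is made; the key must then already be a string.
    return client.data(
        resource_id,
        key=make_str_key(dict_key),
        verify=False,
        validate=False,
        **kwargs,
    )


# Dimensions are in order
dict_key = {
    "FREQ": "B",
    "REF_AREA": "U2",
    "CURRENCY": "EUR",
    "PROVIDER_FM": "4F",
    "INSTRUMENT_FM": "G_N_A",
    "PROVIDER_FM_ID": "SV_C_YM",
    "DATA_TYPE_FM": "BETA0+BETA1+BETA2+BETA3+TAU1+TAU2",
}
str_key = make_str_key(dict_key)
print(str_key)

# Usage, assuming sdmx1 is installed and the ECB service is reachable:
#   import sdmx
#   ECB = sdmx.Client("ECB")
#   msg = fetch_without_validation(
#       ECB, "YC", dict_key, params={"startPeriod": "2007-01-01"}
#   )
```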
This also has the advantage of being faster, as the data structure query is big & slow.
You could also use 2 separate queries:
# Retrieve the DataflowDefinition and all related structures
structure_msg = ECB.dataflow("YC", verify=False)
# Get the associated DataStructureDefinition
dsd = structure_msg.dataflow["YC"].structure
# Use the already-retrieved DSD to convert `dict_key` to a string;
# no preliminary query is performed; avoids the bug
data_msg = ECB.data("YC", key=dict_key, dsd=dsd, verify=False)
To confirm: insert `print("send_kwargs:", send_kwargs)` before this line:
https://github.com/khaeru/sdmx/blob/589928c4238cb911796b3051b7e2262a5c0603db/sdmx/client.py#L443
Then run:
import sdmx

ECB = sdmx.Client("ECB")

# Dimensions are in order
dict_key = {
    "FREQ": "B",
    "REF_AREA": "U2",
    "CURRENCY": "EUR",
    "PROVIDER_FM": "4F",
    "INSTRUMENT_FM": "G_N_A",
    "PROVIDER_FM_ID": "SV_C_YM",
    "DATA_TYPE_FM": "BETA0+BETA1+BETA2+BETA3+TAU1+TAU2",
}
str_key = ".".join(dict_key.values())

for case, (verify, validate, key) in enumerate((
    (False, True, dict_key),
    (True, True, dict_key),
    (False, False, str_key),
    (False, False, dict_key),
)):
    print("Case", case)
    print("verify:", verify)
    print("validate:", validate)
    print("key type:", type(key))

    ECB.data(
        "YC",
        key=key,
        params={'startPeriod': '2007-01-01'},
        verify=verify,
        validate=validate,
    )
Output:
Case 0
verify: False
validate: True
key type: <class 'dict'>
send_kwargs: {'verify': True, 'proxies': OrderedDict(), 'stream': False, 'cert': None}
send_kwargs: {'verify': False, 'proxies': OrderedDict(), 'stream': False, 'cert': None}
Case 1
verify: True
validate: True
key type: <class 'dict'>
send_kwargs: {'verify': True, 'proxies': OrderedDict(), 'stream': False, 'cert': None}
Case 2
verify: False
validate: False
key type: <class 'str'>
send_kwargs: {'verify': False, 'proxies': OrderedDict(), 'stream': False, 'cert': None}
Case 3
verify: False
validate: False
key type: <class 'dict'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-eb72a280b29d> in <module>
28 print("key type:", type(key))
29
---> 30 ECB.data(
31 "YC",
32 key=key,
~/vc/sdmx/sdmx/client.py in get(self, resource_type, resource_id, tofile, use_cache, dry_run, **kwargs)
409 req = self._request_from_url(kwargs)
410 else:
--> 411 req = self._request_from_args(kwargs)
412
413 req = self.session.prepare_request(req)
~/vc/sdmx/sdmx/client.py in _request_from_args(self, kwargs)
237
238 # Assemble final URL
--> 239 url = "/".join(filter(None, url_parts))
240
241 # Parameters: set 'references' to sensible defaults
TypeError: sequence item 3: expected str instance, dict found
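The Case 3 failure can be reproduced in isolation: with `validate=False` the dict key is never converted to a string, so it ends up in the list of URL parts that the traceback shows being joined. The `url_parts` list below is hypothetical (the real one is assembled inside the client), but the `join` behaviour is the same.

```python
# Hypothetical URL parts; item 3 is the key, still a dict.
url_parts = ["https://example.org/service", "data", "YC", {"FREQ": "B"}]

try:
    "/".join(filter(None, url_parts))
except TypeError as exc:
    error = str(exc)
    print(error)

# With the key already converted to a string, the join succeeds.
url_parts[3] = "B"
url = "/".join(filter(None, url_parts))
print(url)
```

This is why skipping validation only works with a string key, as in Case 2.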
Thanks for the help!
Closed in #80 and will be released in the next version after 2.4.1.
Would it be possible to make very=false as default, or adding an additional parameter for that? It worked fine until last week.
The problem seems to be specific for the ECB (SDW) data portal. I don't have the same problem for EUROSTAT for example.
Many thanks in advance. Kind regards, Alberto