OCHA-DAP / hdx-python-api

Python API for interacting with the HDX Data Portal
http://data.humdata.org
MIT License

Getting "Failed when trying to read: q=*:*! (POST)" when retrieving datasets #53

Closed dividor closed 1 year ago

dividor commented 1 year ago

Hi!

I have some code that had been running fine in various environments but has now stopped working and returns an error (please see below).

Is there something I'm doing wrong please?

Thanks!

Environment:

Code:

from hdx.utilities.easy_logging import setup_logging
from hdx.api.configuration import Configuration
from hdx.data.dataset import Dataset

def setup_hdx_connection(agent_name):
    # Configuration.create fails if a configuration already exists in this session
    try:
        Configuration.create(hdx_site="prod", user_agent=agent_name, hdx_read_only=True)
    except Exception:
        print("Configuration already created, continuing ...")

setup_hdx_connection("AgentName")

datasets = Dataset.search_in_hdx()

Error:

---------------------------------------------------------------------------
CKANAPIError                              Traceback (most recent call last)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/hdx/data/hdxobject.py:115, in HDXObject._read_from_hdx(self, object_type, value, fieldname, action, **kwargs)
    114 try:
--> 115     result = self.configuration.call_remoteckan(action, data)
    116     return True, result

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/hdx/api/configuration.py:374, in Configuration.call_remoteckan(self, *args, **kwargs)
    373 kwargs["apikey"] = apikey
--> 374 return self.remoteckan().call_action(*args, **kwargs)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/ckanapi/remoteckan.py:97, in RemoteCKAN.call_action(self, action, data_dict, context, apikey, files, requests_kwargs)
     96     status, response = self._request_fn(url, data, headers, files, requests_kwargs)
---> 97 return reverse_apicontroller_action(url, status, response)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/ckanapi/common.py:134, in reverse_apicontroller_action(url, status, response)
    133 # don't recognize the error
--> 134 raise CKANAPIError(repr([url, status, response]))

CKANAPIError: ['https://data.humdata.org/api/action/package_search', 403, '<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body>\r\n<center><h1>403 Forbidden</h1></center>\r\n</body>\r\n</html>\r\n']

The above exception was the direct cause of the following exception:

HDXError                                  Traceback (most recent call last)
File <command-2615425375951422>:2
      1 print(f"Searching for ALL datasets in HDX to get datasets")
----> 2 datasets = Dataset.search_in_hdx()

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/hdx/data/dataset.py:1098, in Dataset.search_in_hdx(cls, query, configuration, page_size, **kwargs)
   1096 rows = min(rows_left, page_size)
   1097 kwargs["rows"] = rows
-> 1098 _, result = dataset._read_from_hdx(
   1099     "dataset",
   1100     query,
   1101     "q",
   1102     Dataset.actions()["search"],
   1103     **kwargs,
   1104 )
   1105 datasets = list()
   1106 if result:

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/hdx/data/hdxobject.py:120, in HDXObject._read_from_hdx(self, object_type, value, fieldname, action, **kwargs)
    118     return False, f"{fieldname}={value}: not found!"
    119 except Exception as e:
--> 120     raise HDXError(
    121         f"Failed when trying to read: {fieldname}={value}! (POST)"
    122     ) from e

HDXError: Failed when trying to read: q=*:*! (POST)

Note that when running on my desktop I get the same HDXError, preceded by the CKANAPIError shown below. In a browser, however, I can access https://data.humdata.org/api/action/package_search just fine ...

---------------------------------------------------------------------------
CKANAPIError                              Traceback (most recent call last)
File ~/opt/miniconda3/envs/ddenv/lib/python3.8/site-packages/hdx/data/hdxobject.py:115, in HDXObject._read_from_hdx(self, object_type, value, fieldname, action, **kwargs)
    114 try:
--> 115     result = self.configuration.call_remoteckan(action, data)
    116     return True, result

File ~/opt/miniconda3/envs/ddenv/lib/python3.8/site-packages/hdx/api/configuration.py:374, in Configuration.call_remoteckan(self, *args, **kwargs)
    373 kwargs["apikey"] = apikey
--> 374 return self.remoteckan().call_action(*args, **kwargs)

File ~/opt/miniconda3/envs/ddenv/lib/python3.8/site-packages/ckanapi/remoteckan.py:97, in RemoteCKAN.call_action(self, action, data_dict, context, apikey, files, requests_kwargs)
     96     status, response = self._request_fn(url, data, headers, files, requests_kwargs)
---> 97 return reverse_apicontroller_action(url, status, response)

File ~/opt/miniconda3/envs/ddenv/lib/python3.8/site-packages/ckanapi/common.py:134, in reverse_apicontroller_action(url, status, response)
    133 # don't recognize the error
--> 134 raise CKANAPIError(repr([url, status, response]))

CKANAPIError: ['https://data.humdata.org/api/action/package_search', 403, '\r\n\r\n\r\n403 Forbidden\r\n\r\n\r\n']

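
A note on reading the tracebacks above: the HDXError is raised with `raise ... from e`, so the original CKANAPIError (whose message is `repr([url, status, response])`) stays reachable through `__cause__`. Below is a minimal sketch of pulling the HTTP status out of that chain. The two exception classes are stand-ins so the snippet runs without `hdx` or `ckanapi` installed, and `http_status_from_chain` and `simulate_failure` are hypothetical helper names, not part of either library.

```python
import ast


class CKANAPIError(Exception):
    """Stand-in for ckanapi's CKANAPIError (assumption: message is repr([url, status, response]))."""


class HDXError(Exception):
    """Stand-in for the HDX library's HDXError."""


def http_status_from_chain(error):
    """Walk the __cause__ chain and return the HTTP status of the first CKANAPIError, else None."""
    current = error
    while current is not None:
        if isinstance(current, CKANAPIError):
            try:
                # The message is a Python literal list: [url, status, response_body]
                _url, status, _body = ast.literal_eval(str(current))
            except (ValueError, SyntaxError, TypeError):
                return None
            return status
        current = current.__cause__
    return None


def simulate_failure():
    """Reproduce the shape of the error chain from the traceback, without any network access."""
    try:
        raise CKANAPIError(
            repr(["https://data.humdata.org/api/action/package_search", 403, "<html>403 Forbidden</html>"])
        )
    except Exception as e:
        raise HDXError("Failed when trying to read: q=*:*! (POST)") from e


try:
    simulate_failure()
except HDXError as error:
    status = http_status_from_chain(error)
    print(f"HTTP status behind the HDXError: {status}")
```

With the real libraries, the same function could be called in an `except` block around `Dataset.search_in_hdx()`, provided the stand-in classes are replaced by the real imports.
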
mcarans commented 1 year ago

Please can you tell me what user agent you are passing to the Configuration create call.

dividor commented 1 year ago

Hi Mike, In the above example, I sent 'UserAgent' as I was testing, but usually, it's 'DK_UserAgent'.

dividor commented 1 year ago

It works today! Closing the ticket.

dividor commented 1 year ago

Thanks if something was adjusted on your side. Otherwise, if I was doing something incorrectly on my side, please let me know and I'll adjust accordingly.

Either way, love HDX and the Python API!

cafuego commented 1 year ago

Hey @dividor, I did make an adjustment on our side, but I was not able to replicate the problem with a user-agent of DK_UserAgent, nor could I find any entries in our web application firewall logs showing that it had been blocked. So I'm not certain that the web application firewall did the blocking :-/

If it should happen again, can you re-open the ticket and include the full set of response headers? That would help me narrow down the cause.
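
In case it helps anyone collecting that information, here is a minimal sketch of capturing the status code and the full set of response headers for a raw POST to the same endpoint, using only the standard library. The helper names (`fetch_status_and_headers`, `format_report`) are hypothetical, and the JSON payload is an assumption about what the search sends. The HDX library may build a different full User-Agent string from the name you pass it, so this raw request is not guaranteed to match what the library sends.

```python
import json
import urllib.error
import urllib.request


def fetch_status_and_headers(url, user_agent, payload=None, timeout=30):
    """POST a JSON payload and return (status, [(header, value), ...]), even for HTTP errors."""
    data = json.dumps(payload or {}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"User-Agent": user_agent, "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, list(response.headers.items())
    except urllib.error.HTTPError as error:
        # A 4xx/5xx reply (such as the 403 above) still carries a status and headers
        return error.code, list(error.headers.items())


def format_report(status, headers):
    """Render the status and headers as plain text suitable for pasting into a ticket."""
    lines = [f"HTTP status: {status}"]
    lines += [f"{name}: {value}" for name, value in headers]
    return "\n".join(lines)


# Example call (performs a real network request, so it is left commented out):
# status, headers = fetch_status_and_headers(
#     "https://data.humdata.org/api/action/package_search",
#     user_agent="DK_UserAgent",
#     payload={"q": "*:*", "rows": 1},
# )
# print(format_report(status, headers))
```

Headers are kept as a list of pairs rather than a dict so that repeated headers are not collapsed.
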

dividor commented 1 year ago

Thanks a lot for investigating. I'll be sure to include the requested information if the issue reoccurs.