opensearch-project / opensearch-py

Python Client for OpenSearch
https://opensearch.org/docs/latest/clients/python/
Apache License 2.0

[FEATURE] Add the ability to send REST requests via the client #498

Closed Jon-AtAWS closed 11 months ago

Jon-AtAWS commented 1 year ago

Is your feature request related to a problem?

I'm trying to list loaded ML models via the Python client (there is a separate feature request for that). The way to do that is to run a match_all query against _plugins/_ml/models. opensearchpy.search doesn't support setting the path like that, so I need a clean way to send the request. More generally, I want something like the "requests" library that supports GET, PUT, POST, etc. by URL and body, but that sends the request to the cluster using the auth and other settings already configured on the client.

What alternatives have you considered?

Even if the below works, it would be nice to have opensearchpy.search support setting the path and body.

I tried:

resp = os_client.transport.perform_request(
    "GET",
    "_plugins/_ml/models/_search",
    body='{"query": {"match_all":{}}}'
    )

But received

Traceback (most recent call last):
  File "/Users/handler/code/wiki/testrest.py", line 24, in <module>
    resp = os_client.transport.perform_request("GET",
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/transport.py", line 409, in perform_request
    raise e
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/connection/http_urllib3.py", line 266, in perform_request
    self._raise_error(
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/connection/base.py", line 301, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.RequestError: RequestError(400, 'no handler found for uri [/_plugins/_ml/models/_search] and method [GET]', 'no handler found for uri [/_plugins/_ml/models/_search] and method [GET]')
(.venv) [12:01:47] handler wiki: python testrest.py
Traceback (most recent call last):
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/connection/http_urllib3.py", line 240, in perform_request
    response = self.pool.urlopen(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 656, in urlopen
    raise HostChangedError(self, url, retries)
urllib3.exceptions.HostChangedError: HTTPSConnectionPool(host='localhost', port=9200): Tried to open a foreign host with url: _plugins/_ml/models/_search

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/handler/code/wiki/testrest.py", line 24, in <module>
    resp = os_client.transport.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/transport.py", line 407, in perform_request
    raise e
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/handler/code/wiki/.venv/lib/python3.11/site-packages/opensearchpy/connection/http_urllib3.py", line 255, in perform_request
    raise ConnectionError("N/A", str(e), e)
opensearchpy.exceptions.ConnectionError: ConnectionError(HTTPSConnectionPool(host='localhost', port=9200): Tried to open a foreign host with url: _plugins/_ml/models/_search) caused by: HostChangedError(HTTPSConnectionPool(host='localhost', port=9200): Tried to open a foreign host with url: _plugins/_ml/models/_search)

The equivalent call via the requests library works fine:

resp = requests.get('https://localhost:9200/_plugins/_ml/models/_search',
                    data='{"query": {"match_all":{}}}',
                    auth=XXXXXXXX,
                    verify=False,
                    headers={"Content-Type": "application/json"})

If this is the preferred method, then that needs to be in the documentation somewhere.

dblock commented 1 year ago

This happens because, without a leading slash, the URL is treated as pointing at a foreign host rather than at a path on the configured one, which is the HostChangedError in the traceback. Prefix the path with a /.

resp = client.transport.perform_request(
    "GET", "/_plugins/_ml/models/_search", body='{"query": {"match_all":{}}}'
)
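A minimal sketch of guarding against the missing slash before the path reaches the transport. The helper name `normalize_path` is hypothetical and not part of opensearch-py; it only illustrates the fix described above.

```python
def normalize_path(path: str) -> str:
    """Return `path` with exactly one leading slash.

    Hypothetical helper for illustration; not part of opensearch-py.
    """
    # Strip any existing leading slashes, then add a single one back.
    return "/" + path.lstrip("/")


# The path from the failing call, without a leading slash.
print(normalize_path("_plugins/_ml/models/_search"))
```

The result can then be passed as the second argument to client.transport.perform_request.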

I made a sample that creates an index and indexes a document at https://github.com/dblock/opensearch-python-client-demo/blob/main/sync/transport.py; that should get you started. Feel free to add an ml.py sample to that project, and I can help if you run into more problems.

Leaving this issue open. I think we want to expose transport.perform_request as get, put, post and delete.

dtaivpp commented 1 year ago

@Jon-AtAWS you should check out the github.com/opensearch-project/opensearch-py-ml client; it has support for these APIs. Generally, I agree, though. We should provide a generic helper that lets users call API endpoints the client does not implement.

@dblock have you heard any discussion of rolling the ML client into this one? I know there was some talk about allowing different parts to be installed optionally with pip, but I can't find that issue. If that were implemented, it should make the move more straightforward.

Jon-AtAWS commented 1 year ago

@dtaivpp - The py-ml client is in a state of flux right now and doesn't support many of the operations I needed for my implementation. I ended up falling back to the requests library to handle those manually. Once I complete my example, I'll publish it somewhere and add a couple of improvement ideas to the py-ml client.

Having said that, this request is more about a general way to send REST requests (per @dblock above as well). The Dev Tools tab from Dashboards makes it simple to send these requests. I'm looking for something like that in code.

dblock commented 1 year ago

@dblock have you heard any discussion of rolling the ML client into this one?

I have not. Maybe you want to open/find an existing issue? This client has been adding support for various plugins for a while.

navneet1v commented 1 year ago

I think we want to expose transport.perform_request as get, put, post and delete.

I am able to reuse client.transport.perform_request to send requests to k-NN APIs that are not part of the client, like this: client.transport.perform_request('GET', f'/_plugins/_knn/warmup/{vector_index_name}')
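A small sketch of building that warmup path for a single index name. The helper `knn_warmup_path` is hypothetical; percent-encoding the name with the standard library is an added precaution not mentioned above, so that unusual characters cannot change the shape of the path.

```python
from urllib.parse import quote


def knn_warmup_path(index_name: str) -> str:
    """Build the k-NN warmup path for one index.

    Hypothetical helper for illustration; not part of opensearch-py.
    """
    # safe="" percent-encodes every reserved character, including "/".
    return "/_plugins/_knn/warmup/" + quote(index_name, safe="")


print(knn_warmup_path("my-vectors"))
```

The returned string is what would be passed as the URL argument of client.transport.perform_request('GET', ...).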

@dblock I think the feature is already there.

dblock commented 1 year ago

@dblock I think the feature is already there.

I think we still need a DSL, e.g. client.get or client.http_get or something like that, that is documented and tested explicitly. Right now perform_request is undocumented and internal.
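To make the idea concrete, here is a rough sketch of a thin verb-style wrapper over a transport's perform_request. Everything in it is an assumption for illustration: the class name `RestVerbs`, its method names, and the `RecordingTransport` stand-in are hypothetical, and this is not the API the client ships. It also assumes the wrapped perform_request accepts params, body, and headers as keyword arguments.

```python
from typing import Any, Optional


class RestVerbs:
    """Hypothetical verb-style wrapper; not part of opensearch-py."""

    def __init__(self, transport: Any) -> None:
        # `transport` is any object exposing perform_request(method, url, ...),
        # for example the `transport` attribute of a client.
        self._transport = transport

    def _send(self, method: str, path: str, body: Any = None,
              params: Optional[dict] = None,
              headers: Optional[dict] = None) -> Any:
        # Ensure a single leading slash, the pitfall discussed in this thread.
        url = "/" + path.lstrip("/")
        return self._transport.perform_request(
            method, url, params=params, body=body, headers=headers
        )

    def get(self, path: str, **kwargs: Any) -> Any:
        return self._send("GET", path, **kwargs)

    def put(self, path: str, **kwargs: Any) -> Any:
        return self._send("PUT", path, **kwargs)

    def post(self, path: str, **kwargs: Any) -> Any:
        return self._send("POST", path, **kwargs)

    def delete(self, path: str, **kwargs: Any) -> Any:
        return self._send("DELETE", path, **kwargs)


class RecordingTransport:
    """Stand-in transport so the sketch runs without a cluster."""

    def __init__(self) -> None:
        self.calls = []

    def perform_request(self, method, url, params=None, body=None, headers=None):
        # Record what would have been sent instead of contacting a server.
        self.calls.append((method, url, body))
        return {"acknowledged": True}


transport = RecordingTransport()
http = RestVerbs(transport)
http.get("_plugins/_ml/models/_search", body={"query": {"match_all": {}}})
print(transport.calls)
```

With a real client, the wrapper would be constructed as RestVerbs(client.transport), so auth and connection settings come from the client rather than being repeated per request.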

navneet1v commented 1 year ago

Oh yes, documentation is required. I'm not sure what DSL means here, though.

domain specific language - I mean I'd like to write client.get rather than client.transport.perform_request.

@dblock can we do this for all the clients in the different languages? Every time a customer asks about this, I have to dig into the code to find out how to make such requests.

Absolutely.

dblock commented 1 year ago

I've opened an umbrella issue to collect this and similar requests across clients, https://github.com/opensearch-project/opensearch-clients/issues/62.

dblock commented 12 months ago

PR in https://github.com/opensearch-project/opensearch-py/pull/544