Closed davehouser1 closed 1 month ago
Hey @davehouser1, sorry that this is giving you trouble.
I tried reading the documentation for the EnrichClient and Elasticsearch classes, however it's not clear to me what a lot of these parameters really do. I tried reading the source code but I didn't see many comments for what the parameters actually do.
The parameters are documented in the Elasticsearch docs, which is linked from the client docs. For execute_policy, the link is https://www.elastic.co/guide/en/elasticsearch/reference/8.14/execute-enrich-policy-api.html, which should explain what the parameters do.
I forgot to mention. I do not see this behavior when using curl, requests library, or the dev console. The response is an async task number, which I can check status on.
I believe the difference is that in Python you're setting wait_for_completion to True, which will wait for the policy to execute, and can easily timeout. Can you please try setting wait_for_completion to False instead?
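A minimal sketch of that suggestion, not taken from the reporter's code. The helper name `start_enrich_policy` is hypothetical, `client` is expected to be an `Elasticsearch` instance, and reading the task identifier from a `"task"` key is an assumption about the response body.

```python
def start_enrich_policy(client, policy_name):
    """Start an enrich policy without blocking until it finishes.

    `client` is expected to be an elasticsearch.Elasticsearch instance
    (hypothetical helper, not part of the client library).
    """
    resp = client.enrich.execute_policy(
        name=policy_name,
        wait_for_completion=False,  # return immediately instead of waiting
    )
    # Assumption: the response body exposes the task identifier under "task".
    return resp["task"]
```

With `wait_for_completion=False` the HTTP request returns as soon as the task is created, so the client-side request timeout no longer has to cover the whole policy execution.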
I tried setting the request_timeout in Elasticsearch() instance, but that causes a different problem where I see nothing but the following when sending requests
I think this is because the timeout was not high enough, meaning that the connections were discarded for not working? I'm not sure here, honestly, and would welcome a script that reproduces this, as we're considering disabling timeouts in future versions of the client.
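If waiting for completion is still wanted, one option is to raise the timeout for that single call rather than for the whole client. The sketch below assumes an 8.x client, where `options()` returns a copy of the client with per-request settings; the helper name and the 600-second default are illustrative only.

```python
def execute_policy_blocking(client, policy_name, timeout_seconds=600):
    """Run an enrich policy and wait for it, with a longer per-request timeout.

    `client` is expected to be an elasticsearch.Elasticsearch instance (8.x).
    The 600 second default is an arbitrary illustrative value.
    """
    # options() leaves the original client untouched and applies the
    # timeout only to calls made through the returned copy.
    patient_client = client.options(request_timeout=timeout_seconds)
    return patient_client.enrich.execute_policy(
        name=policy_name,
        wait_for_completion=True,
    )
```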
Thanks for the response @pquentin.
The parameters are documented in the Elasticsearch docs, which is linked from the client docs. For execute_policy, the link is https://www.elastic.co/guide/en/elasticsearch/reference/8.14/execute-enrich-policy-api.html, which should explain what the parameters do.
I checked out the link. The link only details one parameter wait_for_completion. It does not detail what all the other parameters are (error_trace, filter_path, human, pretty). Also what would the link be for the details on all the parameters for Elasticsearch Class?
I believe the difference is that in Python you're setting wait_for_completion to True, which will wait for the policy to execute, and can easily timeout. Can you please try setting wait_for_completion to False instead?
I set wait_for_completion=False, and disabled request_timeout in the Elasticsearch instance. This did the trick. Now I get a task ID. Very good. So it seems that the session was timing out because the client was waiting for completion. When it is set to not wait, the call goes into an async mode and I have to check the status of a task. You are good to close this issue. However, if you could, please read below before doing so.
Kind of unrelated problem, see this post https://github.com/elastic/elasticsearch/issues/70554.
There does not seem to be a good way to check the status of an enrichment task using asyncio, as elastic does not show the status of a task after enrichment is complete. So async will always return a 404 after waiting for status.
Do you know a way around this using the AsyncElasticsearch?
I found a way to use requests and gather the .enrichment index value, then check if the _count has increased. This seems to be the only workaround I can find. Thoughts?
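The workaround described above could be sketched with the client's `count` API instead of requests. This is only an illustration of the polling idea: the helper name, the index pattern passed in, and the timing defaults are all hypothetical, and the caller has to supply whatever index name it was already querying.

```python
import time


def wait_for_count_increase(client, index_pattern, baseline,
                            timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll the document count until it exceeds `baseline`.

    `client` is expected to be an elasticsearch.Elasticsearch instance.
    `index_pattern` is whatever index the caller already checks with _count.
    Raises TimeoutError if the count has not grown within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = client.count(index=index_pattern)["count"]
        if current > baseline:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"count for {index_pattern!r} stayed at {current} "
                f"(baseline {baseline})"
            )
        sleep(interval)  # injectable so the loop can be tested without waiting
```

A growing count is only a heuristic for "the policy has finished", so it should be treated as a hint rather than a guarantee.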
I checked out the link. The link only details one parameter wait_for_completion. It does not detail what all the other parameters are (error_trace, filter_path, human, pretty). Also what would the link be for the details on all the parameters for Elasticsearch Class?
The other parameters work for every single API, so they're not documented for every page. They're documented there: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html.
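As a rough illustration of those common options on this particular call (a sketch, not an excerpt from the docs): the helper name is hypothetical, and the `"task"` value passed to `filter_path` assumes the task identifier is returned under that key.

```python
def execute_policy_with_common_options(client, policy_name):
    """Call execute_policy with some of the common options shared by all APIs.

    `client` is expected to be an elasticsearch.Elasticsearch instance.
    """
    return client.enrich.execute_policy(
        name=policy_name,
        wait_for_completion=False,
        error_trace=True,    # include a stack trace in error responses
        human=True,          # human-readable values where the API supports them
        filter_path="task",  # assumption: keep only the "task" field of the body
    )
```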
Regarding the Elasticsearch class, most parameters are defined in https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html and https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html. Yes, having the docs in two places is annoying, and a short description should be added in the reference docs too. Is there anything specific you're missing?
Kind of unrelated problem, see this post elastic/elasticsearch#70554.
There does not seem to be a good way to check status of a enrichment task using asyncio as elastic does not show status of a task after enrichment is complete. So async will always return a 404 after waiting for status.
Do you know a way around this using the AsyncElasticsearch?
I found a way to use requests and gather the .enrichment index value, then check if the _count has increased. This seems to be the only way I can find a work around. Thoughts?
Correct me if I'm wrong, but there seems to be some confusion between AsyncElasticsearch (which is a way to use Python's asyncio module with the Python client) and async APIs in Elasticsearch (which only relates to the Elasticsearch server, independently of the client). Calling async Elasticsearch APIs can be done with both Elasticsearch and AsyncElasticsearch.
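To make the distinction concrete, here is a hedged sketch that checks a server-side task from asyncio code. `client` is expected to be an `AsyncElasticsearch` instance; the same `tasks.get` call without `await` should work on the synchronous `Elasticsearch` class. The helper name is hypothetical, and detecting a 404 through `exc.meta.status` is an assumption about how the client's API errors expose the HTTP status.

```python
import asyncio


async def get_task_or_none(client, task_id):
    """Return the task document, or None if the server answers 404.

    `client` is expected to be an elasticsearch.AsyncElasticsearch instance.
    """
    try:
        return await client.tasks.get(task_id=task_id)
    except Exception as exc:
        # Assumption: API errors carry the HTTP status as `exc.meta.status`.
        status = getattr(getattr(exc, "meta", None), "status", None)
        if status == 404:
            return None  # task unknown, possibly because it already finished
        raise
```

Usage from synchronous code would look like `asyncio.run(get_task_or_none(client, task_id))`. A `None` result cannot tell a finished task apart from one that never existed, which matches the limitation discussed in elastic/elasticsearch#70554.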
Can you please show me your requests code? I can help you translate it to the equivalent using the client, be it AsyncElasticsearch or Elasticsearch.
Closing as I haven't heard back from you. I will reopen if there are additional questions. Thank you!
I am trying to use EnrichClient to send execute_policy. I am seeing urllib3 timeout errors when trying to do this for some of my policies.
Here is my code base
Here is what I am seeing on the output
I tried setting the request_timeout in Elasticsearch() instance, but that causes a different problem where I see nothing but the following when sending requests
I tried reading the documentation for the EnrichClient and Elasticsearch classes, however it's not clear to me what a lot of these parameters really do. I tried reading the source code but I didn't see many comments for what the parameters actually do. Am I doing something wrong here? Why is this failing?
I forgot to mention. I do not see this behavior when using curl, requests library, or the dev console. The response is an async task number, which I can check status on.