Closed by prashant-elastic 1 month ago
We also checked this on the 8.8 branch on GitHub and faced the same issue.
cc: @vidok
We don't handle this sort of throttling when uploading to Elasticsearch - the error says that Elasticsearch has used all the memory it can spend on ingesting data and cannot accept more for now, so we need to wait.
We need to add this error handling into framework.
For now, if you need to continue your testing, just increase the memory available to Elasticsearch to double your current value (I see you're allocating 1GB RAM to Elasticsearch, which is too little).
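For self-managed or local setups (as opposed to Elastic Cloud), one way to raise the heap is via JVM options. A sketch assuming a single-node Docker deployment; the image tag and 2 GB heap value are illustrative:

```shell
# Raise the Elasticsearch JVM heap to 2 GB.
# ES_JAVA_OPTS overrides the heap settings from jvm.options.
docker run -d --name elasticsearch \
  -e "ES_JAVA_OPTS=-Xms2g -Xmx2g" \
  -e "discovery.type=single-node" \
  -p 9200:9200 \
  docker.elastic.co/elasticsearch/elasticsearch:8.8.0
```

Elasticsearch recommends setting `-Xms` and `-Xmx` to the same value so the heap is not resized at runtime.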
@danajuratoni the problem is not SharePoint-specific either; it's a framework issue.
Hey @artem-shelkovnikov Please find the screenshot attached which shows the configuration for the Elasticsearch cloud deployment that we used for testing. Do you recommend having an instance with any other configuration?
Hi @prashant-elastic, indeed - you can see that the master node is 1GB. You need to choose a configuration with a bigger master node if you want the error to go away while we're addressing the problem.
Hi @artem-shelkovnikov Sure, I will try configuring an instance with a bigger master node.
Hey @artem-shelkovnikov We tried looking for a way to configure an instance with a bigger master node but had no luck. Can you please let us know where to configure this?
Try using 3 zones with 2G size per zone.
I thought we have some form of retry policy in place for backpressure from Elasticsearch on bulk indexing. Based on the attached log file, is the bug here that the general retry mechanism is not working at the framework level?
I think we don't have one at all or it's broken
Seems like it retries 3 times with no delay on everything except for conflict errors and gives up?
It retries 3 times only for conflict errors - only ConflictError is caught in the except block; all other errors are raised immediately.
Here's an example retry policy for Elasticsearch bulk requests to consider:
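A minimal sketch of such a policy: retry only on HTTP 429 (throttling), back off exponentially with jitter, and re-raise everything else immediately. The names `TransientBulkError` and `retry_bulk` are illustrative, not part of the framework; a real implementation would inspect the status on the client's `ApiError` instead.

```python
import random
import time


class TransientBulkError(Exception):
    """Stand-in for a client exception that exposes an HTTP status code."""

    def __init__(self, status_code, message=""):
        super().__init__(message)
        self.status_code = status_code


def retry_bulk(operation, max_retries=5, initial_backoff=1.0, max_backoff=60.0):
    """Call `operation`, retrying on 429 with exponential backoff and jitter.

    Any non-429 error, or a 429 after the last attempt, is re-raised so the
    caller still sees genuine failures.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TransientBulkError as exc:
            if exc.status_code != 429 or attempt == max_retries:
                raise
            delay = min(initial_backoff * (2 ** attempt), max_backoff)
            time.sleep(delay + random.uniform(0, delay / 10))
```

For what it's worth, the official elasticsearch-py bulk helpers already support something similar via their `max_retries`, `initial_backoff`, and `max_backoff` parameters, which back off on 429 responses; adopting those may be simpler than a hand-rolled loop.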
Closing as we've updated the backpressure logic to retry transparently.
Bug Description
Getting a 429 Too Many Requests error while indexing SharePoint documents (large data set, 2.5M) to Elasticsearch.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
All records should be properly indexed in Elasticsearch
Actual behavior
The user gets a 429 Too Many Requests error as below, and the sync status is Sync failure:
ApiError(429, "{'_shards': {'total': 2, 'successful': 0, 'failed': 2, 'failures': [{'shard': 0, 'index': '.elastic-connectors-v1', 'status': 'TOO_MANY_REQUESTS', 'reason': {'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [indices:admin/refresh[s]] would be [1051923900/1003.1mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1051923680/1003.1mb], new bytes reserved: [220/220b], usages [model_inference=0/0b, inflight_requests=32840454/31.3mb, request=0/0b, fielddata=247/247b, eql_sequence=0/0b]', 'bytes_wanted': 1051923900, 'bytes_limit': 1020054732, 'durability': 'TRANSIENT'}}]}}")
Only 138440 docs got indexed out of 250000 records
Screenshots
Note - This test was executed on SharePoint Server. Attaching log files for more reference.