Open ngc4579 opened 1 week ago
@ngc4579 - Would it be possible to provide additional details in the To Reproduce section?
@rajiv-kv Thanks for getting back. I've added index configuration information. As to cluster configuration - what exactly do you need? Would the output of GET /_cluster/settings?pretty be enough?
Regarding sample bulk payload - I'm unsure if I can provide that as that's handled by the connector feeding data into OpenSearch.
Thanks @ngc4579 for getting back.
The output of GET /_cluster/health would suffice.
If you cannot share the bulk payload, could you share the field mappings of the index, so that we can create a test payload? The output of GET /<index>?pretty would do.
@rajiv-kv Output of GET /_cluster/health:
{
  "cluster_name": "opensearch",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "discovered_master": true,
  "discovered_cluster_manager": true,
  "active_primary_shards": 54,
  "active_shards": 107,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
The index field mappings span almost 1000 lines - is that even necessary? I'd rather attach a shortened version, if that's of any help...
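One possible way to shorten the mappings without dropping any field is to flatten them into a list of field paths and types. The following is only a sketch, assuming the usual shape of the GET /<index>?pretty response (index name, then "mappings", then "properties"); the names `flatten_mappings` and `summarize` and the sample index are hypothetical and not part of OpenSearch:

```python
import json


def flatten_mappings(properties, prefix=""):
    """Return {dotted.field.path: type} for a mapping "properties" dict."""
    fields = {}
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        # Object fields usually omit "type"; report them as "object".
        fields[path] = spec.get("type", "object")
        # Recurse into sub-fields of object/nested fields and into multi-fields.
        fields.update(flatten_mappings(spec.get("properties", {}), path + "."))
        fields.update(flatten_mappings(spec.get("fields", {}), path + "."))
    return fields


def summarize(index_response):
    """Flatten the mappings of every index in a GET /<index> response."""
    return {
        index: flatten_mappings(body.get("mappings", {}).get("properties", {}))
        for index, body in index_response.items()
    }


# Hypothetical stand-in for a saved GET /<index>?pretty response.
sample = {
    "my-index": {
        "mappings": {
            "properties": {
                "title": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
                "user": {"properties": {"id": {"type": "long"}}},
            }
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(summarize(sample), indent=2, sort_keys=True))
```

The flattened output keeps one line per field, which may be enough to build a test payload while staying far shorter than the full mapping.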
@ngc4579 Do you have a consistent repro, i.e. everything runs fine when you disable this setting, and you start to see the NPE once you enable it again?
cluster.allocator.existing_shards_allocator.batch_enabled: true
Also, please enable debug logs for the OpenSearch process.
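Debug logging can be raised at runtime through the cluster settings API, using a dynamic `logger.<name>` setting. A minimal sketch of the request follows; the logger name `org.opensearch.gateway` is an assumption about where the relevant allocator code logs, the host is a placeholder, and the helper names are hypothetical:

```python
import json


def debug_logging_settings(logger_name="org.opensearch.gateway", level="DEBUG"):
    """Build a PUT /_cluster/settings body that sets one logger's level.

    The default logger name is an assumption; adjust it to the package
    that should be traced. Passing level=None resets the logger.
    """
    return {"transient": {f"logger.{logger_name}": level}}


def to_curl(payload, host="http://localhost:9200"):
    """Render the payload as a curl command (host is a placeholder)."""
    body = json.dumps(payload)
    return (
        f"curl -X PUT '{host}/_cluster/settings' "
        f"-H 'Content-Type: application/json' -d '{body}'"
    )


if __name__ == "__main__":
    print(to_curl(debug_logging_settings()))
    # Reset the logger to its default level afterwards.
    print(to_curl(debug_logging_settings(level=None)))
```

Using a transient setting keeps the change from surviving a full cluster restart, which seems reasonable for a temporary debugging session.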
Describe the bug
We've just updated OpenSearch to version 2.15.0 and enabled batching for the ExistingShardsAllocator:
After that, every now and then one of our clients feeding documents into OpenSearch started throwing exceptions with this root cause reported:
Unfortunately, this is the only log line I got - OpenSearch itself does not log any information. The line above gets reported by an OpenSearch Connector pushing data from Kafka topics into OpenSearch.
The entire stack trace from the client (OpenSearch Connector for Kafka) looks like this:
Related component
Other
To Reproduce
I'm actually unsure how to reproduce this - the issue appeared after enabling cluster.allocator.existing_shards_allocator.batch_enabled.
Index configuration:
Cluster configuration:
Expected behavior
No exception thrown.
Additional Details
No response