elastic / elasticsearch-java

Official Elasticsearch Java Client
Apache License 2.0

Cluster allocation explain request failed with Missing required property 'NodeAllocationExplanation.deciders' #453

Open slovdahl opened 1 year ago

slovdahl commented 1 year ago

Java API client version

7.17.6

Java version

1.8.0_345

Elasticsearch Version

7.17.6

Problem description

Unfortunately I don't have the raw JSON response; I only saw the exception below in CI.

co.elastic.clients.json.JsonpMappingException: Error deserializing co.elastic.clients.elasticsearch.cluster.AllocationExplainResponse: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NodeAllocationExplanation.deciders' (JSON path: node_allocation_decisions[0]) (line no=1, column no=1353, offset=-1)
    at co.elastic.clients.json.JsonpMappingException.from0(JsonpMappingException.java:134)
    at co.elastic.clients.json.JsonpMappingException.from(JsonpMappingException.java:125)
    at co.elastic.clients.json.JsonpDeserializerBase$ArrayDeserializer.deserialize(JsonpDeserializerBase.java:320)
    at co.elastic.clients.json.JsonpDeserializerBase$ArrayDeserializer.deserialize(JsonpDeserializerBase.java:280)
    at co.elastic.clients.json.JsonpDeserializer.deserialize(JsonpDeserializer.java:75)
    at co.elastic.clients.json.ObjectDeserializer$FieldObjectDeserializer.deserialize(ObjectDeserializer.java:71)
    at co.elastic.clients.json.ObjectDeserializer.deserialize(ObjectDeserializer.java:180)
    at co.elastic.clients.json.ObjectDeserializer.deserialize(ObjectDeserializer.java:136)
    at co.elastic.clients.json.JsonpDeserializer.deserialize(JsonpDeserializer.java:75)
    at co.elastic.clients.json.ObjectBuilderDeserializer.deserialize(ObjectBuilderDeserializer.java:79)
    at co.elastic.clients.json.DelegatingDeserializer$SameType.deserialize(DelegatingDeserializer.java:43)
    at co.elastic.clients.transport.rest_client.RestClientTransport.decodeResponse(RestClientTransport.java:328)
    at co.elastic.clients.transport.rest_client.RestClientTransport.getHighLevelResponse(RestClientTransport.java:294)
    at co.elastic.clients.transport.rest_client.RestClientTransport.access$200(RestClientTransport.java:63)
    at co.elastic.clients.transport.rest_client.RestClientTransport$1.onSuccess(RestClientTransport.java:168)
    at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onSuccess(RestClient.java:678)
    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:399)
    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:393)
    at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
    at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(Thread.java:750)
    Caused by: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NodeAllocationExplanation.deciders'
        at co.elastic.clients.util.ApiTypeHelper.requireNonNull(ApiTypeHelper.java:76)
        at co.elastic.clients.util.ApiTypeHelper.unmodifiableRequired(ApiTypeHelper.java:141)
        at co.elastic.clients.elasticsearch.cluster.allocation_explain.NodeAllocationExplanation.<init>(NodeAllocationExplanation.java:76)
        at co.elastic.clients.elasticsearch.cluster.allocation_explain.NodeAllocationExplanation.<init>(NodeAllocationExplanation.java:54)
        at co.elastic.clients.elasticsearch.cluster.allocation_explain.NodeAllocationExplanation$Builder.build(NodeAllocationExplanation.java:350)
        at co.elastic.clients.elasticsearch.cluster.allocation_explain.NodeAllocationExplanation$Builder.build(NodeAllocationExplanation.java:212)
        at co.elastic.clients.json.ObjectBuilderDeserializer.deserialize(ObjectBuilderDeserializer.java:86)
        at co.elastic.clients.json.DelegatingDeserializer$SameType.deserialize(DelegatingDeserializer.java:48)
        at co.elastic.clients.json.JsonpDeserializerBase$ArrayDeserializer.deserialize(JsonpDeserializerBase.java:316)
        ... 30 more

I skimmed through the changes between 7.17.6 and 7.17.7, and AFAICT this specific problem has not been fixed in 7.17.7. In our integration tests we send a client.cluster().allocationExplain() request and output the result if index creation fails or if the cluster health is not green. I think shard allocation had been throttled when this happened. However, right after this exception, the same CI job successfully parsed an allocation explain response for a throttled allocation:

AllocationExplainResponse: {"allocate_explanation":"allocation temporarily throttled","can_allocate":"throttled","current_state":"unassigned","index":"foo-data-rhuxrz","node_allocation_decisions":[{"deciders":[{"decider":"throttling","decision":"THROTTLE","explanation":"reached the limit of ongoing initial primary recoveries [4], cluster setting [cluster.routing.allocation.node_initial_primaries_recoveries=4]"}],"node_attributes":{"ml.machine_memory":"33726533632","xpack.installed":"true","transform.node":"true","ml.max_open_jobs":"512","ml.max_jvm_size":"536870912"},"node_decision":"throttled","node_id":"tXOHokkbSuG5UhgyQ8Be_w","node_name":"7a2795f847d4","transport_address":"172.19.0.2:9300","weight_ranking":1}],"primary":true,"shard":2,"unassigned_info":{"at":"2022-11-28T13:01:22.818Z","last_allocation_status":"throttled","reason":"INDEX_CREATED"},"note":"No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API."}
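
For readers unfamiliar with this failure mode: the trace shows the exception being raised from the NodeAllocationExplanation constructor via Builder.build(), which suggests the entry at node_allocation_decisions[0] carried no deciders field while the generated type treats it as required. Below is a minimal, self-contained sketch of that mechanism. The classes RequiredPropertyDemo, NodeExplanation and MissingPropertyException are hypothetical stand-ins written for illustration; they are not the client's actual code, and the real check lives in ApiTypeHelper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of a generated API type; NOT the real client code.
class RequiredPropertyDemo {

    /** Stand-in for the client's MissingRequiredPropertyException. */
    static class MissingPropertyException extends RuntimeException {
        MissingPropertyException(String type, String property) {
            super("Missing required property '" + type + "." + property + "'");
        }
    }

    /** Stand-in for NodeAllocationExplanation, reduced to two fields. */
    static class NodeExplanation {
        final String nodeId;
        final List<String> deciders;

        private NodeExplanation(Builder b) {
            this.nodeId = b.nodeId;
            // A required list that never appeared in the JSON is still null here.
            if (b.deciders == null) {
                throw new MissingPropertyException("NodeAllocationExplanation", "deciders");
            }
            this.deciders = Collections.unmodifiableList(new ArrayList<>(b.deciders));
        }
    }

    /** The deserializer fills a builder field by field, then calls build(). */
    static class Builder {
        private String nodeId;
        private List<String> deciders;

        Builder nodeId(String value) {
            this.nodeId = value;
            return this;
        }

        Builder deciders(List<String> value) {
            this.deciders = value;
            return this;
        }

        NodeExplanation build() {
            return new NodeExplanation(this);
        }
    }

    /** Builds one node entry and reports whether the required check passed. */
    static String tryBuild(List<String> deciders) {
        try {
            NodeExplanation n = new Builder()
                    .nodeId("tXOHokkbSuG5UhgyQ8Be_w")
                    .deciders(deciders)
                    .build();
            return "ok: " + n.deciders.size() + " decider(s)";
        } catch (MissingPropertyException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Mirrors the successfully parsed response above: one "throttling" decider.
        System.out.println(tryBuild(Collections.singletonList("throttling")));
        // Mirrors the failing response: no deciders field at all.
        System.out.println(tryBuild(null));
    }
}
```

In this sketch the first call succeeds and the second fails with the same message text as in the trace, which is consistent with the server sometimes omitting deciders for a node entry. Whether that is what actually happened here cannot be confirmed without the raw JSON.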

l-trotta commented 8 months ago

Hello, thanks for the report! I'll have to investigate this a bit more: NodeAllocationExplanation.deciders seems to be required server side, so I'm not sure how it can be missing. Also, while trying to replicate it I got another error saying that NodeAllocationExplanation.weightRanking is missing, which should be required as well. In any case it looks like an API specification bug; once that is solved, the Java client will be regenerated to fix the issue.