Closed anilkumaryadavalli closed 4 weeks ago
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
Affected Version
We are running Druid 26 with ZooKeeper-less discovery (without ZooKeeper) in a GKE Kubernetes cluster.
Description
The index_kafka supervisor removes one middle manager for each index_kafka task, producing the middle manager logs below. Tasks that are killed manually also remove one middle manager.
Configurations in use
Steps to reproduce the problem
1. In GKE (Kubernetes) we have 300 pods in total; the middle manager count shows 292 (we had already run 8 index_kafka tasks, which removed 5 middle managers).
2. Run an index_kafka task; a peon is created to run the task.
3. Once the task is killed manually or completes, a middle manager disappears.
Note: the GKE Kubernetes middle manager pod count does not decrease.
The error message or stack traces encountered. Providing more context, such as nearby log messages or even entire logs, can be helpful.
2023-11-07T02:14:22,936 DEBUG [HttpClient-Netty-Worker-18] org.apache.druid.java.util.http.client.NettyHttpClient - [POST http://x.x.x.x:8100/druid/worker/v1/chat/index_kafka_test_8d68cb20ddbdbc0_bilipega/offsets/end?finish=true] Got chunk: 0B, last=true
2023-11-07T02:14:22,936 DEBUG [ServiceClientFactory-2] org.apache.druid.rpc.ServiceClientImpl - Service [index_kafka_test_8d68cb20ddbdbc0_bilipega] request [POST http://x.x.x.x:8100/druid/worker/v1/chat/index_kafka_test_8d68cb20ddbdbc0_bilipega/offsets/end?finish=true] completed.
2023-11-07T02:14:23,015 INFO [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatchermiddleManager] org.apache.druid.discovery.BaseNodeRoleWatcher - Node [http://x.x.x.x:8088/] of role [middleManager] went offline.
2023-11-07T02:14:23,015 INFO [K8sDruidNodeDiscoveryProvider-ListenerExecutor] org.apache.druid.indexing.overlord.hrtr.HttpRemoteTaskRunner - Kaboom! Worker[x.x.x.x:8088] removed!
2023-11-07T02:14:23,015 INFO [K8sDruidNodeDiscoveryProvider-ListenerExecutor] org.apache.druid.server.coordination.ChangeRequestHttpSyncer - Stopping ChangeRequestHttpSyncer[http://x.x.x.x:8088/_1698949782254].
2023-11-07T02:14:23,015 INFO [K8sDruidNodeDiscoveryProvider-ListenerExecutor] org.apache.druid.server.coordination.ChangeRequestHttpSyncer - Stopped ChangeRequestHttpSyncer[http://x.x.x.x:8088/_1698949782254].
2023-11-07T02:14:32,593 DEBUG [ServiceClientFactory-2]
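For anyone trying to correlate these events with task completions, a small log filter may help. The following is only a sketch, assuming the Overlord log contains the "went offline" message in the form shown above; the function name `offline_events` and the regular expression are illustrative and not part of Druid.

```python
import re

# Assumed line shape (taken from the log excerpt in this issue):
#   <timestamp> <LEVEL> [<thread>] <class> - Node [<url>] of role [<role>] went offline.
OFFLINE_RE = re.compile(
    r"^(?P<ts>\S+)\s.*Node \[(?P<node>[^\]]+)\] of role \[(?P<role>[^\]]+)\] went offline"
)


def offline_events(lines, role="middleManager"):
    """Return (timestamp, node_url) pairs for nodes of `role` reported offline."""
    events = []
    for line in lines:
        match = OFFLINE_RE.match(line.strip())
        if match and match.group("role") == role:
            events.append((match.group("ts"), match.group("node")))
    return events


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python offline_events.py < overlord.log
    for ts, node in offline_events(sys.stdin):
        print(ts, node)
```

Comparing the printed timestamps with the times at which index_kafka tasks finish or are killed should show whether each removal lines up with a task ending, as described in the steps above.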
Any debugging that you have already done
1. We tried upgrading from Druid 26 to Druid 27; this didn't fix the issue.
2. We tried Druid 28; this didn't fix the issue either.