coherence-community / coherence-incubator

Coherence Incubator
38 stars 37 forks source link

When there are multiple processing servers in a cluster, shutdown down one of them may cause task execution to hang #168

Closed lsho closed 5 years ago

lsho commented 5 years ago

Issue Description from the customer (bug 29602104)

Lets say I have two machines (Machine1 and Machine2) joining and participating in a coherence cluster which are termed as Agents in our terminology.

We generally use stand alone Client (No Storage enabled) to submit tasks to cluster above for distributed processing by registering them in overridden coherence-config files. (Provided configuration below for reference:)

Now when the both the machines (agents) are up and running and when I try to submit the task from Client, I am able to successfully submit and see it executing on the server side which is good.

As a disaster scenario, when I made the Machine1 down and submit a task from client on another machine Machine2, my expectation is the other registered Machine2 which is up and running should be successfully picking the task. But it gets stuck and doesn't throw any stack trace.