What steps will reproduce the problem?
1. Start a rebalance job for a cluster
2. Get the list of async jobs
3. Stop the listed async jobs for a node
All jobs stop, but voldemort-admin-tool.sh sometimes hangs waiting for a
particular job to stop and may never return.
Running 0.95.1 for both server and admin tool.
[user@host voldemort]$ bin/voldemort-admin-tool.sh --async stop --async-id 0,13,14,15,16,17,18,19,20,21,22,25,26,27,28,29 --node 2 --url tcp://server1:6666
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,603 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting
[2012-03-29 19:33:33,607 voldemort.store.socket.clientrequest.ClientRequestExecutor] WARN No client associated with Socket[unconnected]
[2012-03-29 19:33:33,607 voldemort.store.socket.clientrequest.ClientRequestExecutor] INFO Closing remote connection from Socket[unconnected]
Stopping async id 0
Stopped async id 0
Stopping async id 13
The tool never progressed beyond job 13, but if I run an --async get from
another terminal, I see no jobs running on any of the nodes.
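
To narrow down which job id the tool stalls on, the stop request can be issued one id at a time under a time limit. The following is only a rough sketch, not a fix for the hang. It assumes GNU coreutils `timeout` is installed and that `--async-id` also accepts a single id (the command above only shows a comma-separated list). The function name `stop_async_jobs` and the variables `ADMIN_TOOL` and `STOP_TIMEOUT` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stop async jobs one id at a time so that a hung
# stop request is reported instead of blocking the terminal forever.
# Assumptions: GNU `timeout` is available, and the admin tool accepts
# a single id for --async-id.

ADMIN_TOOL="${ADMIN_TOOL:-bin/voldemort-admin-tool.sh}"  # path to the admin tool
STOP_TIMEOUT="${STOP_TIMEOUT:-30}"                       # seconds allowed per id

stop_async_jobs() {
  local url="$1" node="$2" ids="$3"
  local id rc
  local -a id_list

  # Split the comma-separated id list into an array.
  IFS=',' read -r -a id_list <<< "$ids"

  for id in "${id_list[@]}"; do
    timeout "$STOP_TIMEOUT" "$ADMIN_TOOL" \
      --async stop --async-id "$id" --node "$node" --url "$url"
    rc=$?
    if [ "$rc" -eq 124 ]; then
      # GNU timeout exits with 124 when the command ran past the limit.
      echo "TIMEOUT async id $id"
    elif [ "$rc" -ne 0 ]; then
      echo "FAILED async id $id (exit $rc)"
    else
      echo "OK async id $id"
    fi
  done
}

# Example call, using the values from the command in this report:
#   stop_async_jobs tcp://server1:6666 2 0,13,14,15
```

Any id reported as TIMEOUT can then be cross-checked with an --async get from another terminal to see whether the job actually stopped on the server.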
Original issue reported on code.google.com by dremlok on 29 Mar 2012 at 7:56