Sunyifan / java-gearman-service

Automatically exported from code.google.com/p/java-gearman-service

Gearman worker stopped connecting to server #41

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

I don't know how to reproduce the problem; it occurred after the worker had 
been running for many days.

What is the expected output? What do you see instead?

I expect that when the worker has a problem connecting to the server, it will 
handle the error gracefully and reopen the connection.  Instead, it never 
reconnected, and the service had to be restarted.
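
For illustration only, the recovery behaviour described above could look like the minimal sketch below: retry the connect step with a capped exponential backoff instead of giving up. This is not java-gearman-service code. `Reconnector`, `backoffMillis`, and `connectWithRetry` are hypothetical names, and the `Callable` stands in for whatever actually opens the connection to the job server.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of java-gearman-service.
final class Reconnector {

    // Delay before the next attempt: base, 2*base, 4*base, ... capped at maxMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Calls connect until it succeeds or maxAttempts is used up.
    // Rethrows the last failure if every attempt fails.
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts,
                                  long baseMillis, long maxMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;
                if (attempt + 1 < maxAttempts) {
                    // Wait before retrying; an interrupt aborts the retry loop.
                    Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last;
    }
}
```

A long-running worker would probably use a very large attempt limit and a cap of a few seconds or minutes, so that a temporary server outage delays reconnection rather than ending it.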

What version of the product are you using? On what operating system?
A build from source, revision 170.

Please provide any additional information below.

The worker started logging these warnings:

[2012-11-14 06:43:40,543] 187793395 WARN  [gearman-3148] gearman.closeConnection[-1] - Failed to close connection
java.nio.channels.ClosedChannelException: null
    at sun.nio.ch.AsynchronousSocketChannelImpl.begin(AsynchronousSocketChannelImpl.java:114) ~[na:1.7.0_06]
    at sun.nio.ch.AsynchronousSocketChannelImpl.shutdownOutput(AsynchronousSocketChannelImpl.java:528) ~[na:1.7.0_06]
    at org.gearman.impl.reactor.SocketImpl.closeConnection(Unknown Source) [java-gearman-service-0.6.6.jar:na]
    at org.gearman.impl.reactor.SocketImpl.completed(Unknown Source) [java-gearman-service-0.6.6.jar:na]
    at org.gearman.impl.reactor.SocketImpl.completed(Unknown Source) [java-gearman-service-0.6.6.jar:na]
    at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126) [na:1.7.0_06]
    at sun.nio.ch.Invoker$2.run(Invoker.java:206) [na:1.7.0_06]
    at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112) [na:1.7.0_06]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [na:1.7.0_06]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [na:1.7.0_06]
    at java.lang.Thread.run(Thread.java:722) [na:1.7.0_06]

It printed this about 5 times over a span of roughly 6 hours before the error 
was discovered and the worker restarted.  After the restart it reconnected to 
the gearman server fine, but it should really be able to recover from this on 
its own and reconnect to the server.  I'm sorry I can't give more info on the 
issue; I'm not entirely sure what it is either.
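
The trace shows `shutdownOutput()` throwing `ClosedChannelException` because the channel was already closed when the close path ran. As a hedged sketch using only JDK calls, a tolerant close could treat that case as "already done" rather than as a failure. `QuietClose` and `closeQuietly` are hypothetical names, and this is not how `SocketImpl.closeConnection` is actually written.

```java
import java.io.IOException;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.ClosedChannelException;

// Hypothetical helper, not part of java-gearman-service.
final class QuietClose {

    // Closes the channel and returns true if it is closed afterwards.
    static boolean closeQuietly(AsynchronousSocketChannel ch) {
        if (ch == null) {
            return true;
        }
        try {
            if (ch.isOpen()) {
                ch.shutdownOutput();
            }
        } catch (ClosedChannelException e) {
            // Closed by the peer or another thread between isOpen() and
            // shutdownOutput(); there is nothing left to shut down.
        } catch (IOException | RuntimeException e) {
            // Covers other I/O errors and NotYetConnectedException
            // (unchecked); fall through and still close the channel.
        }
        try {
            ch.close(); // no effect if the channel is already closed
        } catch (IOException e) {
            return false;
        }
        return !ch.isOpen();
    }
}
```

This only quiets the warning. Whether the worker then reconnects depends on the reconnect logic, which this sketch does not touch.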

Original issue reported on code.google.com by edwin.fu...@livestream.com on 14 Nov 2012 at 4:17

GoogleCodeExporter commented 9 years ago
Well, it happens here too, mainly at 4 AM when the server is under heavy I/O 
load from taking backups and copying files.

Original comment by zirix...@gmail.com on 30 Apr 2014 at 7:00