johnewart / gearman-java

Java Gearman client and Netty-based server with multi-backend support (currently memory, Redis and PostgreSQL). More features are coming, including clustering support.
Apache License 2.0

GearmanWorker::work() causes NullPointerException and infinite loop. #4

Closed. tboloo closed this issue 9 years ago

tboloo commented 9 years ago

Setup:

java 1.8.0_45
libgearman version => 1.1.12
PHP 5.6.7-1~dotdeb
gearman server 0.8.10-20150218.030748-1

Sample worker (taken from here). When trying to run the worker (more precisely, when $gmworker->work() is called), the server fails with a NullPointerException, which seems to cause an infinite loop: the worker reconnects and hits the same exception again, as the log below shows:

20:39:36.157 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.PacketHandler - Creating new handler!
20:39:36.160 - [nioEventLoopGroup-3-1] DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.noResourceLeakDetection: false
20:39:36.166 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.Decoder - ---> OPTION_REQ
20:39:36.166 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.PacketHandler -  ---> OPTION_REQ
20:39:36.167 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.Encoder - <--- OPTION_RES
20:39:36.170 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.Decoder - ---> SUBMIT_JOB
20:39:36.170 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.PacketHandler -  ---> SUBMIT_JOB
20:39:36.173 - [nioEventLoopGroup-3-1] DEBUG n.j.g.engine.queue.PersistedJobQueue - Enqueueing H:localhost:1
20:39:36.174 - [nioEventLoopGroup-3-1] DEBUG n.j.gearman.server.net.Encoder - <--- JOB_CREATED
20:39:46.349 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.PacketHandler - Creating new handler!
20:39:46.349 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.Decoder - ---> OPTION_REQ
20:39:46.349 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.PacketHandler -  ---> OPTION_REQ
20:39:46.350 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.Encoder - <--- OPTION_RES
20:39:46.350 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.Decoder - ---> CAN_DO
20:39:46.350 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.PacketHandler -  ---> CAN_DO
20:39:46.354 - [nioEventLoopGroup-3-2] WARN  n.j.gearman.server.net.PacketHandler - Unexpected exception from downstream.
io.netty.handler.codec.DecoderException: java.lang.NullPointerException
at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:415) ~[gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:131) ~[gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:334) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:320) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:100) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:497) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:465) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:359) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101) [gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
Caused by: java.lang.NullPointerException: null
at net.johnewart.gearman.common.packets.PacketFactory.packetFromBytes(PacketFactory.java:38) ~[gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at net.johnewart.gearman.server.net.Decoder.decode(Decoder.java:59) ~[gearman-server-0.8.10-20150218.030748-1.jar:0.4]
at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:360) ~[gearman-server-0.8.10-20150218.030748-1.jar:0.4]
... 10 common frames omitted
20:39:46.355 - [nioEventLoopGroup-3-2] DEBUG n.j.gearman.server.net.PacketHandler - Client closed channel: [id: 0x7b6fd73a, /127.0.0.1:46765 :> 0.0.0.0/0.0.0.0:4730]
20:39:46.365 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.PacketHandler - Creating new handler!
20:39:46.365 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.Decoder - ---> OPTION_REQ
20:39:46.365 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.PacketHandler -  ---> OPTION_REQ
20:39:46.366 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.Encoder - <--- OPTION_RES
20:39:46.366 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.Decoder - ---> CAN_DO
20:39:46.367 - [nioEventLoopGroup-3-3] DEBUG n.j.gearman.server.net.PacketHandler -  ---> CAN_DO
20:39:46.368 - [nioEventLoopGroup-3-3] WARN  n.j.gearman.server.net.PacketHandler - Unexpected exception from downstream.
io.netty.handler.codec.DecoderException: java.lang.NullPointerException

BTW, the client and worker work as expected on gearmand.

johnewart commented 9 years ago

Sorry this is so delayed; this is a result of the C++ library deviating a bit from the existing standard without documenting it -- the PHP client wraps the C++ library and so I will need to support a few other message types:

SUBMIT_REDUCE_JOB, 
SUBMIT_REDUCE_JOB_BACKGROUND,
GRAB_JOB_ALL,           
JOB_ASSIGN_ALL,        
GET_STATUS_UNIQUE,      
STATUS_RES_UNIQUE;      
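
For illustration, here is a minimal Java sketch of a packet-type lookup that cannot hand back null for an unrecognized code, which is the kind of failure the stack trace above points at. The class, enum, and method names are hypothetical and are not taken from the gearman-java source. The numeric codes are the ones I understand gearmand's protocol to assign to these packets; treat them as an assumption and check them against the protocol definition.

```java
import java.util.Optional;

public class PacketTypeLookup {
    // Hypothetical enum; the codes are assumed from gearmand's protocol, verify before use.
    public enum PacketType {
        CAN_DO(1), GRAB_JOB(9), NO_JOB(10), JOB_ASSIGN(11), JOB_ASSIGN_UNIQ(31),
        SUBMIT_REDUCE_JOB(37), SUBMIT_REDUCE_JOB_BACKGROUND(38),
        GRAB_JOB_ALL(39), JOB_ASSIGN_ALL(40),
        GET_STATUS_UNIQUE(41), STATUS_RES_UNIQUE(42);

        public final int code;

        PacketType(int code) {
            this.code = code;
        }

        // Returns an empty Optional for unknown codes instead of null,
        // so the caller has to handle the "unsupported packet" case explicitly.
        public static Optional<PacketType> fromCode(int code) {
            for (PacketType type : values()) {
                if (type.code == code) {
                    return Optional.of(type);
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(PacketType.fromCode(39)); // a known type
        System.out.println(PacketType.fromCode(99)); // an unknown type
    }
}
```

With a lookup like this, a decoder could log and reject an unsupported packet rather than throwing from inside the decode step.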

tboloo commented 9 years ago

Cool, when do you expect to publish the updated JAR?

johnewart commented 9 years ago

It looks like just responding to GRAB_JOB_ALL with a standard JOB_ASSIGN_* message makes the PHP worker happy. You can build master yourself or I can publish a JAR for you somewhere if that helps you test it.
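
Roughly, the idea can be sketched in Java as follows. This is a simplified illustration, not the actual change on master: the class, enum, and method names are hypothetical, and answering GRAB_JOB_ALL with JOB_ASSIGN_UNIQ in particular is an assumption about which JOB_ASSIGN_* variant to pick.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class GrabJobSketch {
    // Hypothetical, simplified views of the worker request and server response types.
    public enum Request { GRAB_JOB, GRAB_JOB_UNIQ, GRAB_JOB_ALL }
    public enum Response { NO_JOB, JOB_ASSIGN, JOB_ASSIGN_UNIQ }

    // Takes the next pending job handle (if any) and picks the response type.
    public static Response respond(Request request, Queue<String> pendingJobs) {
        if (pendingJobs.poll() == null) {
            return Response.NO_JOB; // nothing queued for this worker
        }
        switch (request) {
            case GRAB_JOB:
                return Response.JOB_ASSIGN;
            case GRAB_JOB_UNIQ:
            case GRAB_JOB_ALL:
                // Assumption: treat GRAB_JOB_ALL like GRAB_JOB_UNIQ.
                return Response.JOB_ASSIGN_UNIQ;
            default:
                throw new IllegalArgumentException("Unsupported request: " + request);
        }
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("H:localhost:1"); // handle format as seen in the server log
        System.out.println(respond(Request.GRAB_JOB_ALL, pending));
        System.out.println(respond(Request.GRAB_JOB_ALL, pending));
    }
}
```

The first call hands out the queued job and the second finds the queue empty, so a worker polling with GRAB_JOB_ALL always gets a response type it already understands.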

My worker looks like this:

<?php

echo "Starting\n";

# Create our worker object.
$gmworker= new GearmanWorker();

# Add default server (localhost).
$gmworker->addServer();

# Register function "reverse" with the server. Change the worker function to
# "reverse_fn_fast" for a faster worker with no output.
$gmworker->addFunction("reverse", "reverse_fn");

print "Waiting for job...\n";
while($gmworker->work())
{
  if ($gmworker->returnCode() != GEARMAN_SUCCESS)
  {
    echo "return_code: " . $gmworker->returnCode() . "\n";
    break;
  }
}

function reverse_fn($job)
{
  echo "Received job: " . $job->handle() . "\n";

  $workload = $job->workload();
  $workload_size = $job->workloadSize();

  echo "Workload: $workload ($workload_size)\n";

  # This status loop is not needed, just showing how it works
  for ($x= 0; $x < $workload_size; $x++)
  {
    echo "Sending status: " . ($x + 1) . "/$workload_size complete\n";
    $job->sendStatus($x+1, $workload_size);
    $job->sendData(substr($workload, $x, 1));
    sleep(1);
  }

  $result= strrev($workload);
  echo "Result: $result\n";

  # Return what we want to send back to the client.
  return $result;
}

# A much simpler and less verbose version of the above function would be:
function reverse_fn_fast($job)
{
  return strrev($job->workload());
}

?>

tboloo commented 9 years ago

It would be great if you could help me and publish the JAR.

johnewart commented 9 years ago

https://oss.sonatype.org/content/repositories/snapshots/net/johnewart/gearman/gearman-server/0.8.11-SNAPSHOT/gearman-server-0.8.11-20150731.182506-1.jar

tboloo commented 9 years ago

Works like a charm, thanks. BTW, I have created a Docker environment for easier, environment-independent testing; perhaps you will find it useful.