After several profiling runs on the batch_submit method, I found that the bottleneck is the serialization in distributed.protocol. When we access an attribute on an actor, the RPCPool issues a send_recv call on the communication layer, which in turn calls protocol.dumps().
Msg size: 717
Protocol serialization into a list of 100002 frames took 5.31s
A plain cloudpickle.dumps() of the same message took 0.12s
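A minimal sketch of the comparison above. The message shape and the frame-per-task split are assumptions for illustration (the real protocol.dumps() frame layout may differ), and stdlib pickle stands in for cloudpickle since both expose the same dumps() call:

```python
import pickle  # stand-in for cloudpickle; same dumps() call pattern
import time

# Hypothetical payload shaped like a batch_submit RPC message.
msg = {"op": "batch_submit",
       "tasks": [{"fn": "f", "args": (i,)} for i in range(100_000)]}

# One-shot serialization of the whole message.
t0 = time.perf_counter()
blob = pickle.dumps(msg, protocol=pickle.HIGHEST_PROTOCOL)
one_shot = time.perf_counter() - t0
print(f"one-shot dumps: {len(blob)} bytes in {one_shot:.3f}s")

# Frame-per-element serialization, mimicking a protocol that splits the
# batch into ~one frame per task (the pattern behind the ~100k frames).
t0 = time.perf_counter()
frames = [pickle.dumps(task) for task in msg["tasks"]]
per_frame = time.perf_counter() - t0
print(f"{len(frames)} frames in {per_frame:.3f}s")
```

The gap comes from paying per-object overhead (pickler setup, frame bookkeeping) 100k times instead of once, which matches the 5.31s vs 0.12s numbers above.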
This is too slow when submitting millions of tasks, and I can also optimize the storage layer to work directly with the serialized message.
I'll test using ZMQ for communication between client <-> QueuePool and Queue <-> workers.
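A minimal sketch of the ZMQ direction, assuming pyzmq and a simple REQ/REP pair; the socket names, the inproc endpoint, and the pickled payload are all illustrative, not the actual QueuePool wiring. The point is that the client serializes the whole batch once and the other side can store or forward the bytes as-is:

```python
import pickle
import threading
import zmq

ctx = zmq.Context.instance()

# Queue side: REP socket. bind() must happen before any inproc connect().
server = ctx.socket(zmq.REP)
server.bind("inproc://queue")

def serve():
    # Receive one pre-serialized batch; the raw bytes could be written to
    # the storage layer untouched. Here we just deserialize and ack.
    batch = pickle.loads(server.recv())
    server.send(pickle.dumps(len(batch)))

t = threading.Thread(target=serve)
t.start()

# Client side: REQ socket sends the batch as a single serialized blob.
client = ctx.socket(zmq.REQ)
client.connect("inproc://queue")
client.send(pickle.dumps([{"task": i} for i in range(1000)]))
ack = pickle.loads(client.recv())
t.join()
print(ack)
```

In a real deployment the inproc transport would become tcp:// endpoints between client <-> QueuePool and Queue <-> workers, but the one-dumps-per-batch pattern is the part that avoids the per-frame overhead measured above.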