Currently, dolphin pushes model gradients for each key one by one, which incurs significant overhead.
The problem gets worse with a larger number of keys and a smaller mini-batch size.
Sometimes the sender queue becomes full and computation blocks on the push() call.
We can optimize this by using a multi-update API that batches requests destined for the same block.
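A minimal sketch of the idea, assuming a hypothetical worker interface with resolveBlockId(key) and a multi-update pushAll(Map) call (these names are illustrative, not the actual dolphin API): instead of issuing one push per key, group the mini-batch's gradients by block and send each group in a single request.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch only: groups per-key gradient pushes by block id and
 * sends each group with one multi-update call instead of one push() per key.
 * ParameterWorker, pushAll(), and resolveBlockId() are hypothetical names.
 */
public final class BatchedPusher {
  private final ParameterWorker<Integer, float[]> worker;

  public BatchedPusher(final ParameterWorker<Integer, float[]> worker) {
    this.worker = worker;
  }

  /** Pushes all gradients of a mini-batch, batching keys that map to the same block. */
  public void pushGradients(final Map<Integer, float[]> gradients) {
    // Group keys by the block they belong to.
    final Map<Integer, Map<Integer, float[]>> perBlock = new HashMap<>();
    for (final Map.Entry<Integer, float[]> entry : gradients.entrySet()) {
      final int blockId = worker.resolveBlockId(entry.getKey());
      perBlock.computeIfAbsent(blockId, b -> new HashMap<>())
              .put(entry.getKey(), entry.getValue());
    }
    // One multi-update request per block rather than one push() per key,
    // which cuts the message count and the pressure on the sender queue.
    for (final Map<Integer, float[]> batch : perBlock.values()) {
      worker.pushAll(batch);
    }
  }

  /** Hypothetical worker interface; the real API may differ. */
  public interface ParameterWorker<K, V> {
    int resolveBlockId(K key);
    void pushAll(Map<K, V> keyToUpdate);
  }
}
```

With this grouping, the number of messages per mini-batch drops from the number of keys to at most the number of blocks touched, so the sender queue is far less likely to fill up and block computation.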