JuliaInterop / ZMQ.jl

Julia interface to ZMQ
Other
137 stars 58 forks source link

Odd REQ->recv Socket Performance #53

Open binarybana opened 10 years ago

binarybana commented 10 years ago

I've run a few simple ping/pong benchmarks and noticed some weird performance behavior of the REQ socket. Comparing performance to Python's pyzmq, the server (REP socket) performance is about the same, but for the Julia client, the request of the REQ socket reliably takes >100 milliseconds, whereas the first takes about 10ms, and all others are <1ms:

$ julia jobmon/pingcl.jl
elapsed time: 0.014093492 seconds
elapsed time: 0.12233187 seconds
elapsed time: 0.000415424 seconds
elapsed time: 0.000442195 seconds
elapsed time: 0.000316441 seconds
elapsed time: 0.000349937 seconds
elapsed time: 0.000312942 seconds
elapsed time: 0.000283473 seconds
elapsed time: 0.000310993 seconds
...

I've accounted here for garbage collection and whether the server is written in Julia or Python. The Python client has all requests handled in <1ms:

$ python jobmon/pingcl.py
0.000629901885986
0.000393152236938
0.000520944595337
0.000505924224854
0.000555038452148
0.000494956970215
0.000560998916626
0.000461101531982
0.000371932983398
...

I further narrowed this extra time to the recv of the REQ socket by more narrowly bracketing my tic()/toc(), but things look remarkably similar in the code (both libraries are thin wrappings over zmq_msg_recv). And I didn't know where else to look.

stevengj commented 10 years ago

Isn't this just the usual Julia thing where the first time you call a function it takes some time to be compiled?

binarybana commented 10 years ago

That's what my first thought was, but while the first request is slow (I'd imagine from compilation) it's the second call here that is even slower: 10ms, 100ms, <1ms, <1ms ....

stevengj commented 10 years ago

@JeffBezanson, is there something in the memory subsystem that could cause a large latency like this in mid-computation, even though gc_disable() was called?

binarybana commented 10 years ago

Looking at memory allocation along with the times:

$ julia pingcl.jl
elapsed time: 0.011553368 seconds (34280 bytes allocated)
elapsed time: 0.118611181 seconds (3361300 bytes allocated)
elapsed time: 0.000641436 seconds (432 bytes allocated)
elapsed time: 0.000623514 seconds (432 bytes allocated)
elapsed time: 0.000627998 seconds (432 bytes allocated)

$ julia pingcl.jl # <- unrolled loop
elapsed time: 0.107864638 seconds (3244916 bytes allocated)
elapsed time: 0.123715298 seconds (3361348 bytes allocated)
elapsed time: 0.00052556 seconds (480 bytes allocated)
elapsed time: 0.000545088 seconds (480 bytes allocated)

So memory allocation seems to be the problem, and the fact that it is showing up in the second time through recv, and not the first makes me wonder if this is a ZMQ.jl problem rather than normal Julia compilation. And I also don't understand why unrolling the loop makes things worse for the first time through and while leaving the second slow.

But other than being odd, I don't know if this is sufficiently worse than normal Julia initialization time to warrant further investigation, so I'd be okay if the maintainers decided to close this issue.