Closed paddor closed 8 years ago
@chuckremes would know best here. I've been contemplating a port since higher versions of ZMQ don't work with ffi-rzmq. Would be glad for help, if a port is the best idea.
The main downside I can think of is this would break JRuby support
@tarcieri From my point of view, it's already broken. ffi-rzmq says it supports ZMQ 4.x, and I have 4.1.3 installed, but it doesn't work at all.
It's known that 4.x doesn't work. But 3.x works completely under JRuby.
Sorry, I didn't know that. Where does it say that?
This is known:
@digitalextremist Oh, I see. Thanks for pointing that out.
I saw that there are also https://github.com/zeromq/czmq, https://github.com/Asmod4n/ruby-ffi-czmq, and https://github.com/mtortonesi/ruby-czmq, all of which are based on FFI, which, AFAIK, would guarantee JRuby support. I just have no idea about the differences between them. But the first seems like it's the "official" Ruby binding for CZMQ. Why not use that one?
The official bindings are just thin wrappers around the C code and don't expose any high-level Ruby functionality; they are the basis for someone who wants to write high-level bindings.
The wrapper I wrote is pretty outdated now and is built around assumptions which might no longer be true. Its high-level code could be adapted to use the official bindings instead of the abomination I came up with in https://github.com/Asmod4n/ruby-ffi-czmq/blob/master/lib/czmq/libczmq.rb.
Last time I checked, https://github.com/mtortonesi/ruby-czmq was broken, and it doesn't look like the errors were fixed.
(I also pretty much gave up on Ruby as a standalone interpreter and focus much more on mruby now, which has solved the packaging hassle that Ruby is.)
I noticed that second part @asmod4n. Do you mind if we talk separately about that?
Just curious, what benefits are brought by 4.x that are lacking in 3.x? I'm extremely committed to continued feature support on core dependencies, but having done triage a long time by now, I'm prone to prioritize based on gains. What do we gain here, I really want to know. I'm heavily invested in 0MQ.
Mainly security, but also a new wire protocol. The next release will bring thread-safe server and client sockets with automatic timeout handling.
I see. Thanks for the explanation.
I don't know how outdated your wrapper is, or how much work it'd take to adapt it. I am using ZMQ already but not in any sophisticated way (just bi-directional communication between one "broker" and many "clients" (both sides use a ROUTER socket), done with CURVE authentication). Is there much more to know to be able to come up with a nice, Ruby-esque interface and integrate it in your library? :)
What I noticed about your and also @mtortonesi's wrappers is that they don't have any tests. I guess that'd be a good starting point on the road to a stable wrapper?
What is missing is some kind of error handling in the generated wrappers built via zproject (https://github.com/zeromq/zproject/blob/master/zproject_bindings_ruby.gsl) from XML files (https://github.com/zeromq/czmq/tree/master/api).
zproject is how the ZMQ folks tamed automake/cmake et al. to build robust APIs around C libraries, which as a byproduct also creates wrappers for Ruby/Python/Qt etc.
With zproject for example you can define a class/actor and it automatically generates C skeletons you fill out and get wrappers around them.
Seems like the main culprit has been resolved in https://github.com/chuckremes/ffi-rzmq-core, but it looks like the library does stuff the ZeroMQ API doesn't allow: http://api.zeromq.org/4-0:zmq-msg-init.
The API docs explicitly say not to use the zmsg struct, but ffi-rzmq-core does it.
@paddor @digitalextremist looks like it's fixed in ffi-rzmq-core, see my last post.
@Asmod4n Please provide more information on what you think ffi-rzmq-core or ffi-rzmq are doing wrong. I don't see any problems with my handling of zmq_msg_t structs.
zmq_msg_t was always exported in an opaque way, i.e. its fields have never been part of the official API. It was at first a pointer, then a struct, and now a union; its size changed too.
So in essence, ffi-rzmq-core wasn't compatible with libzmq since January.
That also happened because libzmq doesn't define a function to return the size of a zmq_msg_t. I opened an issue for that on the issue tracker: https://github.com/zeromq/libzmq/issues/1599
I'm working on a (hopefully nice) CZMQ binding over at paddor/cztop. Any input or help is welcome.
@paddor that gem looks well conceived and exciting. I will watch with interest and help wherever I can. I am highly dependent on Celluloid::ZMQ and maintain most of the code impacted by your topic here.
@digitalextremist Thanks! That's great to hear. I'll add you as collaborator.
@digitalextremist Providing a way to wait for read/write events from sockets, in order to port Celluloid::ZMQ::Reactor to CZMQ, turns out to be difficult, because waiting for write events isn't straightforward without falling back to using zmq_poll_items. Plus, when using zloop, one would have to add sockets to the loop and, as soon as the event has been received, immediately remove them again, and also keep starting and stopping the zloop, because Celluloid::ZMQ::Reactor apparently uses #run_once (as opposed to #run).
Do you know if it's possible to adapt Celluloid::ZMQ::Reactor to support the more low-level kind of loop, where one would call #run just once?
I had a similar issue with my czmq binding for mruby and wrote my own reactor https://github.com/Asmod4n/mruby-czmq/blob/master/mrblib/reactor.rb https://github.com/Asmod4n/mruby-czmq/blob/master/mrblib/poller.rb
@paddor I believe this is an area where we'd want to be careful to preserve "evented" behavior versus having an infinite loop or similar. See the Celluloid::IO reactor itself, which behaves the same:
/cc: @tarcieri
@Asmod4n Thanks for the help. Very interesting solution.
@digitalextremist I think I have a solution. I'll extend zpoller with the method zpoller_add_writer and then implement Celluloid::ZMQ::Reactor#run_once using a loop that calls zpoller_wait until it doesn't return any more sockets.
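The draining loop described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FakePoller` is a hypothetical stand-in whose `#wait` hands back one ready socket per call (or `nil`), which is roughly the behavior attributed to zpoller_wait in this thread. It is not the CZMQ or CZTop API, and the zero timeout for follow-up polls is an assumption too.

```ruby
# Hypothetical stand-in for a poller; NOT the CZMQ or CZTop API.
# #wait returns the next ready socket, or nil when nothing is ready.
class FakePoller
  def initialize(ready)
    @ready = ready.dup
  end

  def wait(_timeout_ms)
    @ready.shift
  end
end

# One reactor iteration: wait up to timeout_ms for the first event,
# then keep polling with a zero timeout until no more sockets come back.
def run_once(poller, timeout_ms)
  handled = []
  socket = poller.wait(timeout_ms)
  while socket
    handled << socket          # a real reactor would resume the waiting task here
    socket = poller.wait(0)    # assumption: zero timeout means "do not block"
  end
  handled
end

poller = FakePoller.new(%i[reader_a writer_b])
first_pass  = run_once(poller, 100)
second_pass = run_once(poller, 100)
```

The point of the shape is that each `run_once` call returns control to the caller once the ready set is drained, so the reactor stays evented instead of spinning in an endless loop.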
@digitalextremist I'm wondering, how do you manage RSpec's lack of support for Rubinius? I'm running into trouble with it in my CZMQ binding when I run it on Rubinius (and JRuby), but only since I implemented zpoller (yesterday). The issue seems to be related to RSpec, though. How does Celluloid manage this? Thanks for any input.
@paddor we've never had problems using RSpec on Rubinius other than the inability to pinpoint the specific Rubinius version we want to test on Travis CI... what specific problem are you having?
@digitalextremist Thanks. I was having trouble with CZTop. The zpoller specs failed mainly on Rubinius and JRuby, but then even on MRI. Then I learned that RSpec doesn't officially support Rubinius. I filed zeromq/czmq#1299 and it's fixed now. It wasn't RSpec's fault after all. :-) All CZTop specs run very smoothly on MRI, Rubinius, and JRuby now.
By the way: CZTop::Poller (zpoller) was the last class. CZTop is pretty much complete now. Maybe you wanna give it a look. I know we still need a #run_once loop/poller thing for Celluloid::ZMQ (can't use CZTop::Poller, as it's only for reading). I've been thinking about creating a gem _cztop-manualloop. Or maybe build it directly into Celluloid::ZMQ. What's your opinion on it?
@digitalextremist I've started porting Celluloid::ZMQ to CZTop.
Do you know if its internal API is used outside of the project? Like, I assume and understand that methods like Celluloid::ZMQ::Socket#read and #write are expected to only raise IOError, so Celluloid knows the actor should just crash. But do you know if any other projects depend on methods like Socket#get/#set to access a socket's options by passing in integers such as ::ZMQ::RCVTIMEO?
I've finished porting the library code and the specs (WIP, I guess). When I run the specs, I get a wall of backtraces complaining about loose threads.
Celluloid::ZMQ::Socket
Runaway thread: ================ #<Celluloid::Thread:0x007f828b1e88f8@/Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:47 sleep>
Backtrace:
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `sleep'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `block in check'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/wait.rb:14:in `for'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:58:in `check'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:155:in `block in run'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/group.rb:66:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:152:in `run'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:131:in `block in start'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/celluloid-essentials-f0545ce47ed9/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor/system.rb:78:in `block in get_thread'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:50:in `block in instantiate'
Runaway thread: ================ #<Celluloid::Thread:0x007f828b1da140@/Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:47 sleep>
Backtrace:
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `sleep'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `block in check'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/wait.rb:14:in `for'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:58:in `check'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:155:in `block in run'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/group.rb:66:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:152:in `run'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:131:in `block in start'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/celluloid-essentials-f0545ce47ed9/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor/system.rb:78:in `block in get_thread'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:50:in `block in instantiate'
Runaway thread: ================ #<Celluloid::Thread:0x007f828b1c9110@/Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:47 sleep>
Backtrace:
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `sleep'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `block in check'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/wait.rb:14:in `for'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:58:in `check'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:155:in `block in run'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/group.rb:66:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:152:in `run'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:131:in `block in start'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/celluloid-essentials-f0545ce47ed9/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor/system.rb:78:in `block in get_thread'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:50:in `block in instantiate'
Runaway thread: ================ #<Celluloid::Thread:0x007f828b1a3c58@/Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:47 sleep>
Backtrace:
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `sleep'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:63:in `block in check'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/wait.rb:14:in `for'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/mailbox.rb:58:in `check'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:155:in `block in run'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/timers-41145ed260e4/lib/timers/group.rb:66:in `wait'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:152:in `run'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor.rb:131:in `block in start'
** /Users/paddor/.gem/ruby/2.3.0/bundler/gems/celluloid-essentials-f0545ce47ed9/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/actor/system.rb:78:in `block in get_thread'
** /Users/paddor/src/ruby/celluloid.git/lib/celluloid/group/spawner.rb:50:in `block in instantiate'
I don't know what that means. Can anybody help me?
@paddor This is a known problem. For your purposes right now, circumvent that test if you need immediate feedback; otherwise, this is another thing on my immediate to-do list.
This is caused outside Celluloid::ZMQ, in Celluloid itself; more precisely, in its addons to the test suite, brought in basically everywhere in the celluvoid.
I'm trying to recall which issue this problem is being addressed in, but can't at the moment. But like I said, it's known.
It's basically all of them :laughing: I had to change a spec file in celluloid to make it stop after the first, otherwise my scrollback buffer of 10000 lines wasn't enough to see the first one.
Okay, thanks for the info. Btw, how are we standing on breaking the API within Celluloid::ZMQ? Now or never? Not at all?
And does it really always have to be IOError if something goes wrong? CZTop raises appropriate exceptions, like ArgumentError for EINVAL, Interrupt for EINTR, SocketError for EHOSTUNREACH, or subclasses of SystemCallError for other errnos from ZMQ.
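To make the mapping above concrete, here is a minimal sketch of how an errno could be turned into the exception classes named in that comment. The helper name `exception_for` and the default message are hypothetical; this is not CZTop's actual implementation, only an illustration using Ruby's standard `Errno` constants and `SystemCallError.new`, which returns the matching `Errno::*` subclass for a known error number.

```ruby
require "socket" # defines SocketError

# Hypothetical helper, not taken from CZTop: maps an errno integer to the
# exception classes listed in the comment above.
def exception_for(errno, message = "ZMQ call failed")
  case errno
  when Errno::EINVAL::Errno       then ArgumentError.new(message)
  when Errno::EINTR::Errno        then Interrupt.new(message)
  when Errno::EHOSTUNREACH::Errno then SocketError.new(message)
  else
    # For a known errno this builds the matching Errno::* subclass.
    SystemCallError.new(message, errno)
  end
end

invalid     = exception_for(Errno::EINVAL::Errno)
interrupted = exception_for(Errno::EINTR::Errno)
unreachable = exception_for(Errno::EHOSTUNREACH::Errno)
other       = exception_for(Errno::EAGAIN::Errno)
```

A caller that must keep the IOError-only contract could still rescue these and re-raise an IOError at the Celluloid::ZMQ boundary; whether that is desirable is exactly the open question in this comment.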
Problem has been solved here https://github.com/celluloid/celluloid/blob/master/spec/support/configure_rspec.rb#L51-L64, but only for celluloid-io. Try to see whether replacing boot with init in ZMQ group solves it.
Thank you @TiagoCardoso1983
Any updates on this?
I'd be interested in an update as well. I've tried a few times over the past year to use (C)ZMQ with Celluloid, and had a mixed bag of results. I generally stopped a while back because ZMQ 4.0 was installed on my OS by other dependencies, but none of the existing gems supported it. CZTop is the only one so far that seems to work out of the box (based on very limited testing, but that is light years ahead of others that complain about stack smashing or fail assertions all the time).
EDIT -- For the record, if there is something you haven't had time to look at, I'd be happy to take a look.
@rfestag If you have issues with CZTop, just let me know. Happy to get feedback. :)
CZTop::Poller is now implemented based on the zmq_poller_*() functions and thus also works with thread-safe sockets such as SERVER/CLIENT/RADIO/DISH. Furthermore, CZTop::Poller::Aggregated is the one that can be used in Celluloid::ZMQ, as it provides #readables and #writables (arrays of sockets) after just one call to #wait.
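As an analogy for the "one #wait, then read #readables and #writables" shape described above, here is a small sketch built on plain Ruby IO objects and `IO.select`. It does not use ZMQ sockets, and the class `AggregatedPoller` with its `add_reader`/`add_writer` methods is a hypothetical illustration, not the CZTop::Poller::Aggregated API.

```ruby
# Analogy only: plain Ruby IO objects and IO.select, not ZMQ sockets or CZTop.
class AggregatedPoller
  attr_reader :readables, :writables

  def initialize
    @readers = []
    @writers = []
    @readables = []
    @writables = []
  end

  def add_reader(io)
    @readers << io
  end

  def add_writer(io)
    @writers << io
  end

  # Waits up to `timeout` seconds, then records every IO that is ready.
  # Returns true if at least one IO was readable or writable.
  def wait(timeout = nil)
    ready = IO.select(@readers, @writers, nil, timeout)
    @readables, @writables = ready ? ready[0, 2] : [[], []]
    !(@readables.empty? && @writables.empty?)
  end
end

r, w = IO.pipe
w.write("ping")              # makes the read end readable

poller = AggregatedPoller.new
poller.add_reader(r)
poller.add_writer(w)
any_ready = poller.wait(1)   # a single call fills both arrays
```

A reactor's `#run_once` could then iterate over both arrays after a single wait, which is the property the comment above highlights.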
Has that Runaway thread issue been fixed?
I noticed there is a cztop 0.3.0, but it doesn't appear to work with the version of czmq I have access to via AUR on my Manjaro (Arch derivative) system - 3.0.2-1. What version of czmq is necessary to use the latest cztop?
I also checked out your fork of celluloid-zmq a while back, and it doesn't look like it has been updated. I assume the reactor should use CZTop::Poller::Aggregated instead of CZTop::Poller? Or does that not matter?
As far as I can tell, it looks like the Runaway thread issue is still happening when I run the specs.
What version of czmq is necessary to use the latest cztop?
@rfestag Because CZTop is pretty new, and supporting older versions was becoming impossible, I decided to drop support for ZMQ < 4.2 (soon to be released) and CZMQ 3.0.2 (current stable release, next release coming soon too, I guess). If you're on OSX and use Homebrew, you can install both using brew install zmq --with-libsodium --HEAD && brew install czmq --HEAD.
I assume the reactor should use CZTop::Poller::Aggregated instead of CZTop::Poller?
Thanks for the heads up on celluloid-zmq. You are completely right. I just released CZTop 0.4.0, which adds some more compatibility on CZTop::Poller::Aggregated, and changed my branch to use that one (see here). I haven't tested this change though, since you said the Runaway thread issue is still present. 😞
Yes, let's use this issue to collaborate with @paddor and @digitalextremist on replacing ffi-rzmq with cztop in celluloid-zmq.
@chuckremes I don't know if you noticed, but I tried to port this repo to cztop before, back in April. I stopped because of the issue mentioned above. Maybe you can build on top of my work: https://github.com/celluloid/celluloid-zmq/compare/master...paddor:cztop
@paddor does this issue actually persist? The original SIGSEGV? I'm sure between those of us on the thread, we can squish that in relatively short order. If it was the loose-threads piece, that ought to be resolved, as of the multiplex branch on Celluloid I'm still on.
@digitalextremist No idea about the SIGSEGV, since it's been ages. Actually couldn't remember it.
As for the loose threads, I'll try to rerun the test suite then. :)
Great -- but remember, when in doubt, use the multiplex branch for the time being. Also, I invited you to our Slack lair just now.
@paddor with jruby-1.7.25 and mri-2.3.1 (with rbx-3.* currently having its own unrelated problems for now) all the tests pass with libzmq3, after #58 was fixed.
@digitalextremist Thank you so much!
I keep getting segfaults from "assertion failed" error messages from programs involving ZMQ. Even just running rake spec in this project brings those errors. I tried on OSX 10.10 and 10.11, same result. I tried with and without libsodium, same result. Then I noticed that maybe it's because https://github.com/chuckremes/ffi-rzmq has been put into maintenance mode. Apparently https://github.com/methodmissing/rbczmq is the way to go. So I'm thinking about porting celluloid-zmq to that library. Is that a good idea? Or is the low-level approach of ffi-rzmq actually needed by celluloid-zmq?