Closed by rtyler 9 years ago
After further investigation, I've documented part of this issue in a mailing list post.
Duplicated here for posterity:
I'm working on bringing the Redstorm library
(https://github.com/jruby-gradle/redstorm) up to speed with some of the recent
updates made possible in JRuby core, chief among them loading
resources/artifacts nested within a self-contained .jar file. What I'd like,
ideally, is some sort of pre-initialization hook on a supervisor that I can tie
into to properly set up my classpath.
JRuby supports adding jars to the classpath with a `require` statement. As a
result, a number of Ruby gems embed a jar file in the gem and then use
`require 'my-library'` just like they might require any other Ruby
dependency.
This presents problems when executing in the Storm runtime environment. I
cannot simply go the shaded/fat-jar route since there is Ruby code which
expects "my-library.jar" to be a file in the Ruby `$LOAD_PATH` instead of
purely relying on the presence of files in the classpath.
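To make that expectation concrete, here is a minimal sketch of the kind of lookup such Ruby code effectively depends on. `find_jar_on_load_path` is a hypothetical helper written for illustration, not part of JRuby, Storm, or Redstorm, and "my-library.jar" is the placeholder name used above.

```ruby
# Hypothetical helper (an illustration, not a JRuby or Redstorm API):
# returns the full path of the first $LOAD_PATH entry that contains
# jar_name as a regular file, or nil if no entry does.
def find_jar_on_load_path(jar_name, load_path = $LOAD_PATH)
  load_path
    .map { |dir| File.join(dir.to_s, jar_name) }
    .find { |candidate| File.file?(candidate) }
end

# Classes merged into a shaded/fat jar are visible on the Java classpath,
# but a check like this one still comes back nil because the jar no longer
# exists as a file.
path = find_jar_on_load_path('my-library.jar')
puts(path ? "found #{path}" : 'my-library.jar is not a file on the load path')
```

In a fat-jar deployment the classes from the jar would be loadable, yet a file-based check along these lines would still fail, which is roughly the mismatch described above.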
This has been discussed previously in threads like this:
<http://grokbase.com/t/gg/storm-user/12c7m3zzve/what-are-the-chances-of-supporting-a-jar-of-jars-in-the-future>
This isn't a problem at topology deployment, where I can easily customize the
classpath and loading semantics, but when a topology is sent out to supervisor
nodes and started in workers, the "entrypoint" is something I cannot modify or
hack to update the running load path for the JRuby runtime.
Is there a viable workaround or some means of overriding/extending the worker
initialization routines?
The workaround I've validated is to include two separate configurations. One, `jrubyStorm`,
packs everything into the artifact verbatim (nested jars-in-jars), whereas the other, `jrubyStormClasspath`,
unzips all the dependencies and places the raw class files inside the resulting archive.
This will provide a short-term resolution for users.
Related: jruby-gradle/redstorm#12
In an older Storm cluster, where storm-kafka isn't already baked in, I'm having trouble getting a topology which includes the storm-kafka jar to start properly on supervisor nodes.
The classpath is correct enough for the topology to submit and load successfully (the on-submit code does resolve and make use of storm-kafka).