Closed karthick-rn closed 4 years ago
@karthick-rn thanks for the updates! I want to give the updates a try on my local dev environment but I recently changed from fedora to ubuntu so I need to get docker setup first.
I am getting a NoClassDefFoundError related to zookeeper when trying to run the command.
java.util.ServiceConfigurationError: org.apache.accumulo.start.spi.KeywordExecutable: Provider org.apache.accumulo.server.init.Initialize could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.accumulo.start.Main.checkDuplicates(Main.java:243)
at org.apache.accumulo.start.Main.getExecutables(Main.java:234)
at org.apache.accumulo.start.Main.main(Main.java:88)
Caused by: java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.newInstance(Class.java:412)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
... 10 more
@milleruntime Regarding the exception, can you add the /lib
directory as shown below to the classpath & try running the same command.
CLASSPATH="${CLASSPATH}:${lib}/*:${HADOOP_CONF_DIR}:${ZOOKEEPER_HOME}/*:${HADOOP_HOME}/share/hadoop/client/*:${ZOOKEEPER_HOME}/lib/*"
While running the image I noticed the difference in the ZK tarball filename and corrected it & updated the PR. After this change, I tested the image and able to run ZK, Hadoop & Accumulo successfully.
@karthick-rn I am not sure how you are setting the classpath but here is what I am seeing:
~/workspace/accumulo-docker$ docker run accumulo classpath
/opt/accumulo/conf:/opt/accumulo/lib/*:/opt/hadoop/etc/hadoop:/opt/zookeeper/*:/opt/hadoop/share/hadoop/client/*
Then any other command throws the error:
~/workspace/accumulo-docker$ docker run accumulo --help
2020-04-08 15:53:55,004 [start.Main] ERROR: Uncaught exception
...
OK I forgot the fixes for the newer version of ZK haven't been released yet. I think the dockerfile should work out of the box with the last release of Accumulo, version 2.0.0. So the ZK version should be 3.4.14 or if it works with 3.5.7 then that is fine.
It would have the same problems with 3.5.7. So, it should probably stay 3.4.14, or have some post-script to modify the class path in the Accumulo env script.
It would have the same problems with 3.5.7. So, it should probably stay 3.4.14, or have some post-script to modify the class path in the Accumulo env script.
I discussed this with @milleruntime on Slack. I'm keeping the ZK version as 3.6 & adding the accumulo-env.sh
as a post-script like you mentioned. I have updated the PR with these changes, please test and let me know. Thanks
It would have the same problems with 3.5.7. So, it should probably stay 3.4.14, or have some post-script to modify the class path in the Accumulo env script.
I discussed this with @milleruntime on Slack. I'm keeping the ZK version as 3.6 & adding the
accumulo-env.sh
as a post-script like you mentioned. I have updated the PR with these changes, please test and let me know. Thanks
I like having the accumulo-env.sh as a post-script but I still think we should make the dockerfile work with 2.0.0 out of the box. So this means the ZK veresion should be 3.4.14. Sorry for misleading you.
@karthick-rn Since this change is really to make docker work for version 2.1.0, if you want to keep this PR open for once we release 2.1.0 that is fine. We can make a separate change for now to simply update the version from alpha to 2.0.0.
@karthick-rn Since this change is really to make docker work for version 2.1.0, if you want to keep this PR open for once we release 2.1.0 that is fine. We can make a separate change for now to simply update the version from alpha to 2.0.0.
@milleruntime Sorry if I wasn't clear. This change is to make docker to work with Accumulo 2.0 & not 2.1. The reason for having the accumulo-env.sh
as a post-script is to address the classpath issue in Accumulo 2.0 to work with ZK versions 3.5.x & above. This classpath issue has been fixed in Accumulo 2.1 & we no longer need the post-script when we are running docker on Accumulo 2.1. Let me know if you are seeing any issues in Dockerfile running Accumulo 2.0? Also, if you still think you might need ZK 3.4.14 I'm happy to make the changes. Thanks
@karthick-rn I wonder if adding ENV CLASSPATH /opt/zookeeper/lib/*
towards the end of the dockerfile instead of adding the accumulo-env.sh file would work. This should be picked up by Accumulo env, the only drawback is that we dont get control over the order of where /opt/zookeeper/lib/*
is in the classpath.
@karthick-rn I wonder if adding
ENV CLASSPATH /opt/zookeeper/lib/*
towards the end of the dockerfile instead of adding the accumulo-env.sh file would work. This should be picked up by Accumulo env, the only drawback is that we dont get control over the order of where/opt/zookeeper/lib/*
is in the classpath.
I like @keith-turner 's suggestion here. It's better than copying the env script in.
@milleruntime I'm not following your use of "out-of-the-box" here. The way I see it, the Docker packaging is its own "box" (packaging). Accumulo 2.0.0 works fine, when properly configured, with ZK 3.5 or 3.6. So, it should be fine if the Docker packaging configures it that way. The only real difference is that you'll need to have the newer ZK running to use the Docker packaging of Accumulo, whereas if you were to package or deploy Accumulo manually, you could do so using a version as low as 3.4. But, I think it's fine for Docker to require newer.
@karthick-rn I wonder if adding
ENV CLASSPATH /opt/zookeeper/lib/*
towards the end of the dockerfile instead of adding the accumulo-env.sh file would work. This should be picked up by Accumulo env, the only drawback is that we dont get control over the order of where/opt/zookeeper/lib/*
is in the classpath.
Thanks for the suggestion @keith-turner. I'm trying this now, will update soon
@milleruntime I'm not following your use of "out-of-the-box" here.
A developer should be able to check out accumulo-docker, follow the README and build and run a 2.0.0 container. If they require additional steps like setting the classpath to work then we shouldn't bake the version in the dockerfile. The versions in the dockerfile should work without extra steps.
@milleruntime I'm not following your use of "out-of-the-box" here.
A developer should be able to check out accumulo-docker, follow the README and build and run a 2.0.0 container. If they require additional steps like setting the classpath to work then we shouldn't bake the version in the dockerfile. The versions in the dockerfile should work without extra steps.
@milleruntime Oh, okay. So, I think you mean accumulo-docker
, "out-of-the-box", should work. I previously thought you meant something like accumulo-docker
should use "out-of-the-box" accumulo
. If you mean the former, then I agree. But, all of the proposed changes in this PR satisfy that, so I don't see the problem.
Since @milleruntime mentioned the README, it's worth noting that this change would make some versions of ZooKeeper not work at all (3.4 and earlier; mainly because ZooKeeper renamed their tarballs). However, without this change, the newer versions won't work. I'm inclined to prioritize the newer versions.
In either case, the examples in the README should avoid using versions that won't work, and should mention that some combinations of ZooKeeper and Hadoop versions may not work.
My only concern with this change is that we are creating more work for when we release 2.1. But if I understand the updates correctly, that won't be a problem since we would have to update the path to the new ZK binaries anyway.
My only concern with this change is that we are creating more work for when we release 2.1. But if I understand the updates correctly, that won't be a problem since we would have to update the path to the new ZK binaries anyway.
Yeah, this might actually be saving us work (because of the tarball renames). If @keith-turner 's suggestion to use the ENV
directive works, then the only change we'd need to do for 2.1 is to drop that superfluous ENV
line.
Looks like the order of the classpath is important here. When adding ENV CLASSPATH /opt/zookeeper/lib/*
to Dockerfile, Accumulo positions the ENV
directive first in the classpath like below.
/opt/zookeeper/lib/*:/opt/accumulo/conf:/opt/accumulo/lib/*:/opt/hadoop/etc/hadoop:/opt/zookeeper/*:/opt/hadoop/share/hadoop/client/*
This has resulted in the below exception when trying to run queries on ashell.
Exception in thread "shell" java.lang.NoSuchMethodError: org.apache.commons.cli.Options.hasShortOption(Ljava/lang/String;)Z
To maintain the order of the classpath, we tried several options & the one with sed
works and doesn't require the post-script. I have updated the PR with the new changes, please review and let me know your thoughts/suggestions. Thanks
We should vote on a release for this, and then publish it to Dockerhub as described in the README (I think INFRA can help with that). If anyone is interested in pursuing this, please follow up on the dev mailing list.
We should vote on a release for this, and then publish it to Dockerhub as described in the README (I think INFRA can help with that). If anyone is interested in pursuing this, please follow up on the dev mailing list.
@ctubbsii I'm interested, will follow-up on the dev mailing list. Thanks
In this PR, I have updated the Accumulo version along with Hadoop & Zookeeper to match with the latest releases. With these changes, I'm able to successfully build the image. Also, updated the README.md. Once you're happy with these changes, may be I can push the image to
apache/accumulo
at DockerHub. Thanks