joshelser / node-accumulo

Using Node.js to ingest into Accumulo via RabbitMQ and Java

Building node-accumulo for Accumulo 1.5 #1

Open cmundi opened 11 years ago

cmundi commented 11 years ago

Thanks for putting your code on github. I'm learning from it! In fact I want to access accumulo from node with a message queue and (boom!) Google gave me your repo. Awesome.

Unfortunately, the Java app throws an exception, down in Accumulo's implementation of ZooKeeperInstance calling on Thrift:

Exception in thread "main" org.apache.accumulo.core.client.AccumuloException: org.apache.thrift.TApplicationException: Internal error processing authenticateUser

The exception occurs during setup in AmqpWebAnalytics.java on this line of code:

this.connector = instance.getConnector(this.username, this.password.getBytes());

Execution never gets around to testing for table existence.

The issue appears to be that getConnector(String user, ByteBuffer pass) is now deprecated in ZooKeeperInstance: http://accumulo.apache.org/1.5/apidocs/org/apache/accumulo/core/client/ZooKeeperInstance.html


Here's my setup:

Accumulo 1.5 ... instance = a01, user=root, password=password. (Really!)
ZooKeeper 3.4.5
Hadoop 1.2
node 0.10.15

The Accumulo shell works great, so that's not the problem. I set the Accumulo instance name, user and password to match my setup, and then I built the Java app.

Any ideas? I'm sure I'll figure this out but maybe this info will help someone else in the meantime.

Thanks!

joshelser commented 11 years ago

Ha, small world.

Yeah, I wrote this before 1.5 was released with the new API calls. You could try updating it to use the new PasswordToken construct (instead of the getBytes on the password String). I would have expected this to still work against 1.5, but there may be something unexpected happening with Accumulo itself.

I'll try to bring this up to date against 1.5.

cmundi commented 11 years ago

Hi Josh. I've done exactly that -- adopted the new PasswordToken and updated the imports. Of course I also had to update the pom.xml to reference the updated versions. Then the fun starts... the compile fails with this:

[ERROR] Failed to execute goal on project node-rabbitmq-webanalytics: Could not resolve dependencies for project accumulo:node-rabbitmq-webanalytics:jar:0.0.1-SNAPSHOT: The following artifacts could not be resolved: javax.jms:jms:jar:1.1, com.sun.jdmk:jmxtools:jar:1.2.1, com.sun.jmx:jmxri:jar:1.2.1: Could not transfer artifact javax.jms:jms:jar:1.1 from/to java.net (https://maven-repository.dev.java.net/nonav/repository): No connector available to access repository java.net (https://maven-repository.dev.java.net/nonav/repository) of type legacy using the available factories WagonRepositoryConnectorFactory -> [Help 1]

The javax.jms complaint popped up as soon as I required Accumulo 1.5 in the pom.xml. I haven't battled with jms in a long time, so I might have to sleep on this one...

Thanks!

P.S. Yes, the world is small. It's a large graph with small max{edges(e1,e2)}. ;)

cmundi commented 11 years ago

FYI. Updating rabbitmq to 3.1.3 neither helps nor hurts as far as I see so far. I was hoping it might pull in a substitute for jms which does not require JEE. Nope.

<version.amqpclient>3.1.3</version.amqpclient>

UPDATE 1: I'm making progress by excluding the JMS stuff from the ZooKeeper dependency in pom.xml.
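For anyone following along, the exclusion described in UPDATE 1 might look roughly like the fragment below. This is a sketch, not the exact change made here: the three excluded coordinates are taken from the Maven error message above, and it assumes the ZooKeeper dependency is declared with the `version.zookeeper` property.

```xml
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>${version.zookeeper}</version>
  <exclusions>
    <!-- The three artifacts Maven reported as unresolvable -->
    <exclusion>
      <groupId>javax.jms</groupId>
      <artifactId>jms</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jdmk</groupId>
      <artifactId>jmxtools</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jmx</groupId>
      <artifactId>jmxri</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` afterwards should show whether any other dependency still pulls these artifacts in.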

UPDATE 2: Also exclude cloudtrace dependency. Now I can compile, but I get another exception:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/accumulo/fate/zookeeper/ZooCache
    at accumulo.AmqpWebAnalytics.setupAccumulo(AmqpWebAnalytics.java:87)

At this point, I am stuck. The docs for Accumulo 1.5 indicate that o.a.a.fate.zookeeper.ZooCache does exist. So I wonder if this might be an internal issue in Accumulo.

cmundi commented 11 years ago

Ok, so I finally just listened to the diagnostic. For whatever reason, maven did not copy accumulo-fate-1.5.0.jar to webanalytics/target/lib. I dropped in the jar I built, and I got past the ZooCache error. Now the classloader says I'm missing org/apache/commons/configuration/Configuration. Let's see how deep this rabbit(mq) hole goes...
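Rather than discovering missing jars one NoClassDefFoundError at a time, a small check like the one below can report them in one pass. It is only a sketch: `ClasspathCheck` and `missing` are hypothetical names, and the list of required classes is just the two named in the errors above, so it would need extending for other dependencies.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheck {
    // Classes named in the NoClassDefFoundError messages in this thread.
    static final String[] REQUIRED = {
        "org.apache.accumulo.fate.zookeeper.ZooCache",
        "org.apache.commons.configuration.Configuration"
    };

    /** Returns the subset of class names that cannot be loaded from the current classpath. */
    public static List<String> missing(String... classNames) {
        List<String> notFound = new ArrayList<>();
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        for (String name : missing(REQUIRED)) {
            System.out.println("missing: " + name);
        }
    }
}
```

Assuming the jars live in target/lib as described above, it could be run with something like `java -cp "target/lib/*:." ClasspathCheck` after compiling the class into the current directory.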

joshelser commented 11 years ago

Depending on the version of Hadoop used, you may be missing dependencies that Accumulo marked as provided (or that I marked as provided).

commons-collections, commons-configuration, commons-lang and commons-io come to mind.

cmundi commented 11 years ago

Aha! We also need apache-commons and apache-lang. I think I've got it now. I will fork and prepare a branch for consideration.

Thanks!

UPDATE: Oops. Still not happy with Thrift. This one appears to be in flux. I might actually need to figure this maven thing out for real some day.

cmundi commented 11 years ago

Ok, I got it working but it's not pretty. I finally gave up on trying to outsmart maven. I just built libthrift 0.9 from source and dropped the jar in my lib folder. The node-accumulo demo is now working as expected. Yeah!!!

Altogether, the accumulo 1.5 API was very simple to accommodate. My problem was maven. I guess I need to break down and buy the anteater book when the new edition comes out next year.

So I ended up with these versions, with corresponding additions further down in the pom:

    <version.accumulo>1.5.0</version.accumulo>
    <version.zookeeper>3.4.5</version.zookeeper>
    <version.gson>2.1</version.gson>
    <version.amqpclient>3.1.3</version.amqpclient>
    <version.hadoop>1.2.0</version.hadoop>
    <version.commonslogging>1.0.4</version.commonslogging>
    <version.log4j>1.2.17</version.log4j>
    <version.thrift>0.9</version.thrift>
    <version.slf4j>1.6.4</version.slf4j>
    <version.commonsio>1.4</version.commonsio>
    <version.commonsconfiguration>1.9</version.commonsconfiguration>
    <version.commonslang>2.6</version.commonslang>

There are three misleading aspects of what appears above. I was not able to get maven to do all my work for me, so I manually copied libthrift-0.9.jar, accumulo-trace-1.5.0.jar, and accumulo-fate-1.5.0.jar into my lib directory. I didn't check which version of the Thrift API is actually needed by Accumulo 1.5, but 0.6.1 does not work and I just built latest and got lucky.
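An untested alternative to copying the jars by hand would be to declare the three artifacts explicitly in the pom, along the lines of the fragment below. The coordinates are assumptions on my part rather than something verified in this thread: in particular, the Thrift release is assumed to be published as `0.9.0` rather than `0.9`, and the Accumulo artifacts are assumed to resolve at `${version.accumulo}`.

```xml
<!-- Sketch only: explicit declarations for the jars that were copied manually -->
<dependency>
  <groupId>org.apache.accumulo</groupId>
  <artifactId>accumulo-fate</artifactId>
  <version>${version.accumulo}</version>
</dependency>
<dependency>
  <groupId>org.apache.accumulo</groupId>
  <artifactId>accumulo-trace</artifactId>
  <version>${version.accumulo}</version>
</dependency>
<dependency>
  <groupId>org.apache.thrift</groupId>
  <artifactId>libthrift</artifactId>
  <!-- Assumption: published version string is 0.9.0, not 0.9 -->
  <version>0.9.0</version>
</dependency>
```

If these resolve, whatever step copies dependencies into target/lib should pick them up without manual intervention.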

The code changes for the new API are trivial:

+import org.apache.accumulo.core.client.security.tokens.PasswordToken;
.
.
.
+  protected PasswordToken token = null;
.
.
.
-    this.connector = instance.getConnector(this.username, this.password.getBytes());
+    this.token = new PasswordToken(this.password.getBytes());
+    this.connector = instance.getConnector(this.username, this.token);

I would create a pull request if I had better luck with maven. :(