charlesflynn / agiledata

Builds a data science work environment for Russell Jurney's book Agile Data Science.
MIT License

Missing mongo-hadoop jars? #7

Status: Closed (mathias closed 10 years ago)

mathias commented 10 years ago

While I'm looking at things, and working my way through the Agile Data Science book, it looks like the built VM is missing the mongo-hadoop jars. I logged my salt run to a file to see if I could find a place in the output where something was failing, but I didn't find it.

Thoughts?

Edit: Here are the files that book-code/ch03/pig/mongo.pig is looking for:

mongo-hadoop/mongo-x.x.x.jar
mongo-hadoop-core-x.x.x.jar
mongo-hadoop-pig-x.x.x.jar

charlesflynn commented 10 years ago

I'll need to look when I'm back at my desk, but if I remember correctly they should be in software/mongo-hadoop, installed by an sbt command. You could try a find from vagrant's home dir to see if they're in there somewhere, something similar to find /home/vagrant -iname '*mongo*.jar'
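For reference, the search suggested above can be wrapped in a small helper. This is only a sketch: the function name find_mongo_jars is made up for illustration, and the default root /home/vagrant is taken from the command in this comment.

```shell
# Search a directory tree for mongo-related jars, case-insensitively.
# find_mongo_jars is a hypothetical helper name; the default root
# /home/vagrant is the one mentioned in the thread. Pass another root as $1.
find_mongo_jars() {
    root="${1:-/home/vagrant}"
    # -iname matches regardless of case; errors from unreadable dirs are hidden
    find "$root" -iname '*mongo*.jar' 2>/dev/null
}
```

Running find_mongo_jars with no argument inside the VM should list any matching jars. Empty output would suggest the build step never produced them.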

charlesflynn commented 10 years ago

Oh also check the software/lib dir to see if they're linked in there. There's a note about jars and links at https://github.com/charlesflynn/agiledata#registering-jarfiles-in-pig
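A quick way to check the links in that directory might look like the sketch below. The helper name check_jar_links is hypothetical, and the default path /home/vagrant/software/lib is an assumption pieced together from the directories mentioned in this thread.

```shell
# List jar symlinks in a lib directory and flag any whose target is missing.
# check_jar_links is a hypothetical helper; the default path is an assumption.
check_jar_links() {
    libdir="${1:-/home/vagrant/software/lib}"
    for link in "$libdir"/*.jar; do
        # Skip regular files (and the unexpanded glob when nothing matches)
        [ -L "$link" ] || continue
        if [ -e "$link" ]; then
            echo "ok $link -> $(readlink "$link")"
        else
            echo "broken $link -> $(readlink "$link")"
        fi
    done
}
```

A "broken" line would mean the link was created but the jar it points to was never built.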

mathias commented 10 years ago

I should've noted that I did a find for '*mongo*.jar' and only found the mongo java driver, which doesn't seem to be (directly) required here.

charlesflynn commented 10 years ago

Could be that sbt had a problem building them. You might try running the sbt commands manually in the mongo-hadoop dir. Take a look in data.sls to see what the build commands are.
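One way to pull the build commands out of the state file is a plain grep. This is a sketch only: show_sbt_commands is a hypothetical helper name, and the path to data.sls has to be supplied by the caller since the thread does not give its full location.

```shell
# Print every line that mentions sbt in a Salt state file, with line numbers.
# show_sbt_commands is a hypothetical helper; $1 is the path to the .sls file.
show_sbt_commands() {
    grep -n 'sbt' "$1"
}
```

The matching lines can then be run by hand from the mongo-hadoop directory to see where the build fails.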

mathias commented 10 years ago

Looks like sbt never ran. Thanks! Getting closer to getting it all working :thumbsup:

mathias commented 10 years ago

This helped immensely, btw, but I redid my VM from scratch by first removing the java_install macro and dependencies. Then, I spun up a new (empty) VM image and installed OpenJDK 7, then ran provisioning.

I have read the reasons for the Oracle Java included in the README, but since things seem to work fine on OpenJDK 7, would you consider a branch / alternative install that used OpenJDK 7 in the provision scripts?

charlesflynn commented 10 years ago

Awesome, glad it helped! On the java version, I'll consider anything if it comes with a pull request ;-)

Shouldn't be tough to do since there are yum/apt packages for OpenJDK. Take a look at salt/core/init.sls to see an example conditional for Redhat- or Debian-based distros. But I wouldn't make the changes in that file since it's for non-optional base packages. I'd probably put it in salt/agiledata/init.sls, around where the other Java conditional is processed.
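To illustrate the per-distro branching, here is a rough shell sketch. It is not the repo's Salt state, just the same idea expressed in shell. The function name openjdk7_package is hypothetical, the family strings follow Salt's os_family grain values, and the package names are the usual OpenJDK 7 JDK packages on Debian/Ubuntu and Red Hat/CentOS, which may differ on other releases.

```shell
# Map an OS family to an OpenJDK 7 package name.
# openjdk7_package is a hypothetical helper, not part of the repo.
# Family strings mirror Salt's os_family grain ("Debian", "RedHat").
openjdk7_package() {
    case "$1" in
        Debian) echo "openjdk-7-jdk" ;;             # apt package name
        RedHat) echo "java-1.7.0-openjdk-devel" ;;  # yum package name
        *) echo "unsupported OS family: $1" >&2; return 1 ;;
    esac
}
```

In a Salt state the same choice would normally be made with a Jinja conditional on the os_family grain, so that a single pkg.installed state picks the right name.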