archivesunleashed / aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
https://aut.docs.archivesunleashed.org/
Apache License 2.0

`spark-shell --packages "io.archivesunleashed:aut:0.10.0"` fails with not_found dependencies #113

Closed dportabella closed 6 years ago

dportabella commented 6 years ago

Testing spark-shell with --packages instead of --jars is a way to verify that the aut library can be imported into an sbt project. I have an sbt project that uses Scala and Spark, along with some other dependencies, and I need to add the aut dependency to its build.sbt. I could include your external fat jar in my project, but that is poor practice. Moreover, these old dependencies have many issues and conflict with other dependencies.

This command fails with not_found dependencies: spark-shell --master local[4] --packages "io.archivesunleashed:aut:0.10.0"

Also, the issue-111 branch, which removes some old repositories, still fails. You can test it as follows:

docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.6.0 bash

# git
sudo yum install -y git

# wget
yum install -y wget

# jdk8: https://tecadmin.net/install-java-8-on-centos-rhel-and-fedora/
cd /opt/
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u151-b12/e758a0de34e24606bca991d704f6dcbf/jdk-8u151-linux-x64.tar.gz"
tar xzf jdk-8u151-linux-x64.tar.gz
cd /opt/jdk1.8.0_151/
alternatives --install /usr/bin/java java /opt/jdk1.8.0_151/bin/java 2
alternatives --set java /opt/jdk1.8.0_151/bin/java
alternatives --install /usr/bin/jar jar /opt/jdk1.8.0_151/bin/jar 2
alternatives --set jar /opt/jdk1.8.0_151/bin/jar
alternatives --install /usr/bin/javac javac /opt/jdk1.8.0_151/bin/javac 2
alternatives --set javac /opt/jdk1.8.0_151/bin/javac
export JAVA_HOME=/opt/jdk1.8.0_151
export JRE_HOME=/opt/jdk1.8.0_151/jre
export PATH=$PATH:/opt/jdk1.8.0_151/bin:/opt/jdk1.8.0_151/jre/bin

# maven
cd ~
curl http://www-eu.apache.org/dist/maven/maven-3/3.5.2/binaries/apache-maven-3.5.2-bin.tar.gz | tar -xvzf -

git clone https://github.com/archivesunleashed/aut.git
cd aut
git checkout issue-111
../apache-maven-3.5.2/bin/mvn install -DskipTests

spark-shell --master local[4] --packages "io.archivesunleashed:aut:0.10.1-SNAPSHOT"
    found io.archivesunleashed#aut;0.10.1-SNAPSHOT in local-m2-cache
:: problems summary ::
:::: WARNINGS
        [NOT FOUND  ] commons-net#commons-net;1.4.1!commons-net.jar (2ms)
        [NOT FOUND  ] org.codehaus.jackson#jackson-core-asl;1.5.2!jackson-core-asl.jar (1ms)
        [NOT FOUND  ] org.codehaus.jackson#jackson-mapper-asl;1.5.2!jackson-mapper-asl.jar (0ms)
        [NOT FOUND  ] net.java.dev.jets3t#jets3t;0.6.1!jets3t.jar (0ms)
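
To turn a report like the one above into a checklist, here is a minimal sketch that pulls the `[NOT FOUND ]` entries out of the Ivy output and maps each one to the path its jar would have in a standard Maven local repository. The function names `not_found_coords` and `m2_jar_path`, the log file name `ivy.log`, and the default repository root `~/.m2/repository` are assumptions for illustration; none of this is part of aut, Spark, or Ivy. One possible explanation for a module being "found ... in local-m2-cache" while its jar is NOT FOUND is that only the POM is present locally, but that is a guess and is not confirmed in this thread.

```shell
# Hypothetical helpers for illustration; not part of aut, Spark, or Ivy.

# Read Ivy output on stdin and print each "[NOT FOUND ]" artifact
# as group:artifact:version, one per line.
not_found_coords() {
  sed -n 's/^.*\[NOT FOUND *\] *\([^#]*\)#\([^;]*\);\([^!]*\)!.*$/\1:\2:\3/p'
}

# Print the jar path a coordinate would have under the standard Maven
# local repository layout (group dots become directory separators).
# Usage: m2_jar_path group:artifact:version [repo_root]
# Note: plain POSIX sh, so the variables below are not function-local.
m2_jar_path() {
  coord=$1
  root=${2:-$HOME/.m2/repository}
  group=${coord%%:*}
  rest=${coord#*:}
  artifact=${rest%%:*}
  version=${rest#*:}
  group_dir=$(printf '%s' "$group" | tr '.' '/')
  printf '%s/%s/%s/%s/%s-%s.jar\n' \
    "$root" "$group_dir" "$artifact" "$version" "$artifact" "$version"
}

# Example use against a saved copy of the resolution output (ivy.log is assumed):
#   not_found_coords < ivy.log | while read -r c; do ls -l "$(m2_jar_path "$c")"; done
```

If `ls` reports that a jar is missing while the matching `.pom` exists in the same directory, that would at least be consistent with the POM-only guess above.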
ruebot commented 6 years ago

@dportabella can you try out the issue-113-a branch?

I updated pretty much everything dependency-wise that could be updated in our pom.xml, and added additional dependencies that --packages needed to build. (I wrapped those in comments.) It works in both Spark 2.1.1 and 1.6.0, which I noticed you were using. Output is below.

N.B. this will need some extensive testing since we updated everything, and there are some deprecation warnings we'll need to address. I'd be happier to add some additional dependencies with the existing versions as is, and come back around to updating dependency versions later. But happy to be swayed.

@lintool let me know if you see anything ridiculous here.

[nruest@roo:bin]$ ./spark-shell --verbose --packages "io.archivesunleashed:aut:0.10.1-SNAPSHOT"
Using properties file: null
Parsed arguments:
  master                  local[*]
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.repl.Main
  primaryResource         spark-shell
  name                    Spark shell
  childArgs               []
  jars                    null
  packages                io.archivesunleashed:aut:0.10.1-SNAPSHOT
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file null:

Ivy Default Cache set to: /home/nruest/.ivy2/cache
The jars for the packages stored in: /home/nruest/.ivy2/jars
:: loading settings :: url = jar:file:/home/nruest/Downloads/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
io.archivesunleashed#aut added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
    found io.archivesunleashed#aut;0.10.1-SNAPSHOT in local-m2-cache
    found org.scala-lang#scala-parser-combinators;2.11.0-M4 in local-m2-cache
    found com.chuusai#shapeless_2.11;2.2.5 in local-m2-cache
    found com.google.guava#guava;14.0.1 in local-m2-cache
    found org.xerial.snappy#snappy-java;1.0.5 in local-m2-cache
    found org.jsoup#jsoup;1.7.3 in local-m2-cache
    found org.netpreserve.commons#webarchive-commons;1.1.4 in local-m2-cache
    found org.json#json;20131018 in local-m2-cache
    found org.htmlparser#htmlparser;1.6 in local-m2-cache
    found com.googlecode.juniversalchardet#juniversalchardet;1.0.3 in local-m2-cache
    found commons-httpclient#commons-httpclient;3.1 in local-m2-cache
    found commons-logging#commons-logging;1.0.4 in local-m2-cache
    found commons-codec#commons-codec;1.2 in local-m2-cache
    found org.apache.hadoop#hadoop-core;0.20.2-cdh3u4 in local-m2-cache
    found com.cloudera.cdh#hadoop-ant;0.20.2-cdh3u4 in local-m2-cache
    found commons-cli#commons-cli;1.2 in local-m2-cache
    found xmlenc#xmlenc;0.52 in local-m2-cache
    found org.apache.hadoop.thirdparty.guava#guava;r09-jarjar in local-m2-cache
    found commons-codec#commons-codec;1.4 in local-m2-cache
    found commons-net#commons-net;1.4.1 in local-m2-cache
    found oro#oro;2.0.8 in local-m2-cache
    found org.codehaus.jackson#jackson-core-asl;1.5.2 in local-m2-cache
    found org.codehaus.jackson#jackson-mapper-asl;1.5.2 in local-m2-cache
    found commons-el#commons-el;1.0 in local-m2-cache
    found net.java.dev.jets3t#jets3t;0.6.1 in local-m2-cache
    found org.eclipse.jdt#core;3.1.1 in local-m2-cache
    found commons-lang#commons-lang;2.5 in local-m2-cache
    found commons-io#commons-io;2.4 in local-m2-cache
    found org.gnu.inet#libidn;1.15 in local-m2-cache
    found it.unimi.dsi#dsiutils;2.0.12 in local-m2-cache
    found it.unimi.dsi#fastutil;6.5.2 in local-m2-cache
    found com.martiansoftware#jsap;2.1 in local-m2-cache
    found org.slf4j#slf4j-api;1.7.2 in local-m2-cache
    found log4j#log4j;1.2.17 in local-m2-cache
    found commons-configuration#commons-configuration;1.8 in local-m2-cache
    found commons-logging#commons-logging;1.1.1 in local-m2-cache
    found commons-collections#commons-collections;3.2.1 in local-m2-cache
    found org.apache.commons#commons-math3;3.1.1 in local-m2-cache
    found org.apache.httpcomponents#httpcore;4.3 in local-m2-cache
    found joda-time#joda-time;1.6 in local-m2-cache
    found edu.stanford.nlp#stanford-corenlp;3.4.1 in local-m2-cache
    found com.io7m.xom#xom;1.2.10 in local-m2-cache
    found xml-apis#xml-apis;1.3.03 in local-m2-cache
    found xalan#xalan;2.7.0 in local-m2-cache
    found joda-time#joda-time;2.1 in local-m2-cache
    found de.jollyday#jollyday;0.4.7 in local-m2-cache
    found javax.xml.bind#jaxb-api;2.2.7 in local-m2-cache
    found com.googlecode.efficient-java-matrix-library#ejml;0.23 in local-m2-cache
    found javax.json#javax.json-api;1.0 in local-m2-cache
    found org.apache.tika#tika-core;1.9 in local-m2-cache
    found org.apache.tika#tika-parsers;1.9 in local-m2-cache
    found org.gagravarr#vorbis-java-tika;0.6 in local-m2-cache
    found net.sourceforge.jmatio#jmatio;1.0 in local-m2-cache
    found org.apache.james#apache-mime4j-core;0.7.2 in local-m2-cache
    found org.apache.james#apache-mime4j-dom;0.7.2 in local-m2-cache
    found org.apache.commons#commons-compress;1.9 in local-m2-cache
    found org.tukaani#xz;1.5 in local-m2-cache
    found commons-codec#commons-codec;1.9 in local-m2-cache
    found org.apache.pdfbox#pdfbox;1.8.9 in local-m2-cache
    found org.apache.pdfbox#fontbox;1.8.9 in local-m2-cache
    found org.apache.pdfbox#jempbox;1.8.9 in local-m2-cache
    found org.bouncycastle#bcmail-jdk15on;1.52 in local-m2-cache
    found org.bouncycastle#bcprov-jdk15on;1.52 in local-m2-cache
    found org.bouncycastle#bcpkix-jdk15on;1.52 in local-m2-cache
    found org.apache.poi#poi;3.12 in local-m2-cache
    found org.apache.poi#poi-scratchpad;3.12 in local-m2-cache
    found org.apache.poi#poi-ooxml;3.12 in local-m2-cache
    found org.apache.poi#poi-ooxml-schemas;3.12 in local-m2-cache
    found org.apache.xmlbeans#xmlbeans;2.6.0 in local-m2-cache
    found org.ccil.cowan.tagsoup#tagsoup;1.2.1 in local-m2-cache
    found org.ow2.asm#asm-debug-all;4.1 in local-m2-cache
    found com.googlecode.mp4parser#isoparser;1.0.2 in local-m2-cache
    found org.aspectj#aspectjrt;1.8.0 in local-m2-cache
    found com.drewnoakes#metadata-extractor;2.8.0 in local-m2-cache
    found com.adobe.xmp#xmpcore;5.1.2 in local-m2-cache
    found de.l3s.boilerpipe#boilerpipe;1.1.0 in local-m2-cache
    found rome#rome;1.0 in local-m2-cache
    found jdom#jdom;1.0 in local-m2-cache
    found org.gagravarr#vorbis-java-core;0.6 in local-m2-cache
    found org.codelibs#jhighlight;1.0.2 in local-m2-cache
    found com.pff#java-libpst;0.8.1 in local-m2-cache
    found com.github.junrar#junrar;0.7 in local-m2-cache
    found commons-logging#commons-logging-api;1.1 in local-m2-cache
    found org.apache.commons#commons-vfs2;2.0 in local-m2-cache
    found org.apache.maven.scm#maven-scm-api;1.4 in local-m2-cache
    found org.codehaus.plexus#plexus-utils;1.5.6 in local-m2-cache
    found org.apache.maven.scm#maven-scm-provider-svnexe;1.4 in local-m2-cache
    found org.apache.maven.scm#maven-scm-provider-svn-commons;1.4 in local-m2-cache
    found regexp#regexp;1.3 in local-m2-cache
    found org.apache.opennlp#opennlp-tools;1.5.3 in local-m2-cache
    found org.apache.opennlp#opennlp-maxent;3.0.3 in local-m2-cache
    found net.sf.jwordnet#jwnl;1.3.3 in local-m2-cache
    found org.apache.commons#commons-exec;1.3 in local-m2-cache
    found com.googlecode.json-simple#json-simple;1.1.1 in local-m2-cache
    found junit#junit;4.11 in central
    found org.hamcrest#hamcrest-core;1.3 in local-m2-cache
    found edu.ucar#netcdf4;4.5.5 in local-m2-cache
    found net.jcip#jcip-annotations;1.0 in local-m2-cache
    found net.java.dev.jna#jna;4.1.0 in local-m2-cache
    found org.slf4j#slf4j-api;1.7.12 in central
    found edu.ucar#grib;4.5.5 in local-m2-cache
    found com.google.protobuf#protobuf-java;2.5.0 in local-m2-cache
    found org.jdom#jdom2;2.0.4 in local-m2-cache
    found edu.ucar#jj2000;5.2 in local-m2-cache
    found org.itadaki#bzip2;0.9.1 in local-m2-cache
    found edu.ucar#cdm;4.5.5 in local-m2-cache
    found edu.ucar#udunits;4.5.5 in local-m2-cache
    found joda-time#joda-time;2.2 in local-m2-cache
    found edu.ucar#httpservices;4.5.5 in local-m2-cache
    found org.apache.httpcomponents#httpclient;4.2.6 in local-m2-cache
    found org.apache.httpcomponents#httpmime;4.2.6 in local-m2-cache
    found org.quartz-scheduler#quartz;2.2.0 in local-m2-cache
    found c3p0#c3p0;0.9.1.1 in local-m2-cache
    found net.sf.ehcache#ehcache-core;2.6.2 in local-m2-cache
    found com.beust#jcommander;1.35 in local-m2-cache
    found org.apache.commons#commons-csv;1.0 in local-m2-cache
    found org.apache.sis.core#sis-utility;0.5 in local-m2-cache
    found org.opengis#geoapi;3.0.0 in local-m2-cache
    found javax.measure#jsr-275;0.9.3 in local-m2-cache
    found org.apache.sis.storage#sis-netcdf;0.5 in local-m2-cache
    found org.apache.sis.storage#sis-storage;0.5 in local-m2-cache
    found org.apache.sis.core#sis-metadata;0.5 in local-m2-cache
    found org.apache.sis.core#sis-referencing;0.5 in local-m2-cache
    found com.syncthemall#boilerpipe;1.2.2 in local-m2-cache
    found net.sourceforge.nekohtml#nekohtml;1.9.20 in local-m2-cache
    found xerces#xercesImpl;2.11.0 in local-m2-cache
    found xml-apis#xml-apis;1.4.01 in local-m2-cache
    found tl.lin#lintools-datatypes;1.0.0 in local-m2-cache
    found com.google.code.gson#gson;2.3.1 in local-m2-cache
:: resolution report :: resolve 2305ms :: artifacts dl 34ms
    :: modules in use:
    c3p0#c3p0;0.9.1.1 from local-m2-cache in [default]
    com.adobe.xmp#xmpcore;5.1.2 from local-m2-cache in [default]
    com.beust#jcommander;1.35 from local-m2-cache in [default]
    com.chuusai#shapeless_2.11;2.2.5 from local-m2-cache in [default]
    com.cloudera.cdh#hadoop-ant;0.20.2-cdh3u4 from local-m2-cache in [default]
    com.drewnoakes#metadata-extractor;2.8.0 from local-m2-cache in [default]
    com.github.junrar#junrar;0.7 from local-m2-cache in [default]
    com.google.code.gson#gson;2.3.1 from local-m2-cache in [default]
    com.google.guava#guava;14.0.1 from local-m2-cache in [default]
    com.google.protobuf#protobuf-java;2.5.0 from local-m2-cache in [default]
    com.googlecode.efficient-java-matrix-library#ejml;0.23 from local-m2-cache in [default]
    com.googlecode.json-simple#json-simple;1.1.1 from local-m2-cache in [default]
    com.googlecode.juniversalchardet#juniversalchardet;1.0.3 from local-m2-cache in [default]
    com.googlecode.mp4parser#isoparser;1.0.2 from local-m2-cache in [default]
    com.io7m.xom#xom;1.2.10 from local-m2-cache in [default]
    com.martiansoftware#jsap;2.1 from local-m2-cache in [default]
    com.pff#java-libpst;0.8.1 from local-m2-cache in [default]
    com.syncthemall#boilerpipe;1.2.2 from local-m2-cache in [default]
    commons-cli#commons-cli;1.2 from local-m2-cache in [default]
    commons-codec#commons-codec;1.9 from local-m2-cache in [default]
    commons-collections#commons-collections;3.2.1 from local-m2-cache in [default]
    commons-configuration#commons-configuration;1.8 from local-m2-cache in [default]
    commons-el#commons-el;1.0 from local-m2-cache in [default]
    commons-httpclient#commons-httpclient;3.1 from local-m2-cache in [default]
    commons-io#commons-io;2.4 from local-m2-cache in [default]
    commons-lang#commons-lang;2.5 from local-m2-cache in [default]
    commons-logging#commons-logging;1.1.1 from local-m2-cache in [default]
    commons-logging#commons-logging-api;1.1 from local-m2-cache in [default]
    commons-net#commons-net;1.4.1 from local-m2-cache in [default]
    de.jollyday#jollyday;0.4.7 from local-m2-cache in [default]
    de.l3s.boilerpipe#boilerpipe;1.1.0 from local-m2-cache in [default]
    edu.stanford.nlp#stanford-corenlp;3.4.1 from local-m2-cache in [default]
    edu.ucar#cdm;4.5.5 from local-m2-cache in [default]
    edu.ucar#grib;4.5.5 from local-m2-cache in [default]
    edu.ucar#httpservices;4.5.5 from local-m2-cache in [default]
    edu.ucar#jj2000;5.2 from local-m2-cache in [default]
    edu.ucar#netcdf4;4.5.5 from local-m2-cache in [default]
    edu.ucar#udunits;4.5.5 from local-m2-cache in [default]
    io.archivesunleashed#aut;0.10.1-SNAPSHOT from local-m2-cache in [default]
    it.unimi.dsi#dsiutils;2.0.12 from local-m2-cache in [default]
    it.unimi.dsi#fastutil;6.5.2 from local-m2-cache in [default]
    javax.json#javax.json-api;1.0 from local-m2-cache in [default]
    javax.measure#jsr-275;0.9.3 from local-m2-cache in [default]
    javax.xml.bind#jaxb-api;2.2.7 from local-m2-cache in [default]
    jdom#jdom;1.0 from local-m2-cache in [default]
    joda-time#joda-time;2.2 from local-m2-cache in [default]
    junit#junit;4.11 from central in [default]
    log4j#log4j;1.2.17 from local-m2-cache in [default]
    net.java.dev.jets3t#jets3t;0.6.1 from local-m2-cache in [default]
    net.java.dev.jna#jna;4.1.0 from local-m2-cache in [default]
    net.jcip#jcip-annotations;1.0 from local-m2-cache in [default]
    net.sf.ehcache#ehcache-core;2.6.2 from local-m2-cache in [default]
    net.sf.jwordnet#jwnl;1.3.3 from local-m2-cache in [default]
    net.sourceforge.jmatio#jmatio;1.0 from local-m2-cache in [default]
    net.sourceforge.nekohtml#nekohtml;1.9.20 from local-m2-cache in [default]
    org.apache.commons#commons-compress;1.9 from local-m2-cache in [default]
    org.apache.commons#commons-csv;1.0 from local-m2-cache in [default]
    org.apache.commons#commons-exec;1.3 from local-m2-cache in [default]
    org.apache.commons#commons-math3;3.1.1 from local-m2-cache in [default]
    org.apache.commons#commons-vfs2;2.0 from local-m2-cache in [default]
    org.apache.hadoop#hadoop-core;0.20.2-cdh3u4 from local-m2-cache in [default]
    org.apache.hadoop.thirdparty.guava#guava;r09-jarjar from local-m2-cache in [default]
    org.apache.httpcomponents#httpclient;4.2.6 from local-m2-cache in [default]
    org.apache.httpcomponents#httpcore;4.3 from local-m2-cache in [default]
    org.apache.httpcomponents#httpmime;4.2.6 from local-m2-cache in [default]
    org.apache.james#apache-mime4j-core;0.7.2 from local-m2-cache in [default]
    org.apache.james#apache-mime4j-dom;0.7.2 from local-m2-cache in [default]
    org.apache.maven.scm#maven-scm-api;1.4 from local-m2-cache in [default]
    org.apache.maven.scm#maven-scm-provider-svn-commons;1.4 from local-m2-cache in [default]
    org.apache.maven.scm#maven-scm-provider-svnexe;1.4 from local-m2-cache in [default]
    org.apache.opennlp#opennlp-maxent;3.0.3 from local-m2-cache in [default]
    org.apache.opennlp#opennlp-tools;1.5.3 from local-m2-cache in [default]
    org.apache.pdfbox#fontbox;1.8.9 from local-m2-cache in [default]
    org.apache.pdfbox#jempbox;1.8.9 from local-m2-cache in [default]
    org.apache.pdfbox#pdfbox;1.8.9 from local-m2-cache in [default]
    org.apache.poi#poi;3.12 from local-m2-cache in [default]
    org.apache.poi#poi-ooxml;3.12 from local-m2-cache in [default]
    org.apache.poi#poi-ooxml-schemas;3.12 from local-m2-cache in [default]
    org.apache.poi#poi-scratchpad;3.12 from local-m2-cache in [default]
    org.apache.sis.core#sis-metadata;0.5 from local-m2-cache in [default]
    org.apache.sis.core#sis-referencing;0.5 from local-m2-cache in [default]
    org.apache.sis.core#sis-utility;0.5 from local-m2-cache in [default]
    org.apache.sis.storage#sis-netcdf;0.5 from local-m2-cache in [default]
    org.apache.sis.storage#sis-storage;0.5 from local-m2-cache in [default]
    org.apache.tika#tika-core;1.9 from local-m2-cache in [default]
    org.apache.tika#tika-parsers;1.9 from local-m2-cache in [default]
    org.apache.xmlbeans#xmlbeans;2.6.0 from local-m2-cache in [default]
    org.aspectj#aspectjrt;1.8.0 from local-m2-cache in [default]
    org.bouncycastle#bcmail-jdk15on;1.52 from local-m2-cache in [default]
    org.bouncycastle#bcpkix-jdk15on;1.52 from local-m2-cache in [default]
    org.bouncycastle#bcprov-jdk15on;1.52 from local-m2-cache in [default]
    org.ccil.cowan.tagsoup#tagsoup;1.2.1 from local-m2-cache in [default]
    org.codehaus.jackson#jackson-core-asl;1.5.2 from local-m2-cache in [default]
    org.codehaus.jackson#jackson-mapper-asl;1.5.2 from local-m2-cache in [default]
    org.codehaus.plexus#plexus-utils;1.5.6 from local-m2-cache in [default]
    org.codelibs#jhighlight;1.0.2 from local-m2-cache in [default]
    org.eclipse.jdt#core;3.1.1 from local-m2-cache in [default]
    org.gagravarr#vorbis-java-core;0.6 from local-m2-cache in [default]
    org.gagravarr#vorbis-java-tika;0.6 from local-m2-cache in [default]
    org.gnu.inet#libidn;1.15 from local-m2-cache in [default]
    org.hamcrest#hamcrest-core;1.3 from local-m2-cache in [default]
    org.htmlparser#htmlparser;1.6 from local-m2-cache in [default]
    org.itadaki#bzip2;0.9.1 from local-m2-cache in [default]
    org.jdom#jdom2;2.0.4 from local-m2-cache in [default]
    org.json#json;20131018 from local-m2-cache in [default]
    org.jsoup#jsoup;1.7.3 from local-m2-cache in [default]
    org.netpreserve.commons#webarchive-commons;1.1.4 from local-m2-cache in [default]
    org.opengis#geoapi;3.0.0 from local-m2-cache in [default]
    org.ow2.asm#asm-debug-all;4.1 from local-m2-cache in [default]
    org.quartz-scheduler#quartz;2.2.0 from local-m2-cache in [default]
    org.scala-lang#scala-parser-combinators;2.11.0-M4 from local-m2-cache in [default]
    org.slf4j#slf4j-api;1.7.12 from central in [default]
    org.tukaani#xz;1.5 from local-m2-cache in [default]
    org.xerial.snappy#snappy-java;1.0.5 from local-m2-cache in [default]
    oro#oro;2.0.8 from local-m2-cache in [default]
    regexp#regexp;1.3 from local-m2-cache in [default]
    rome#rome;1.0 from local-m2-cache in [default]
    tl.lin#lintools-datatypes;1.0.0 from local-m2-cache in [default]
    xalan#xalan;2.7.0 from local-m2-cache in [default]
    xerces#xercesImpl;2.11.0 from local-m2-cache in [default]
    xml-apis#xml-apis;1.4.01 from local-m2-cache in [default]
    xmlenc#xmlenc;0.52 from local-m2-cache in [default]
    :: evicted modules:
    com.google.guava#guava;17.0 by [com.google.guava#guava;14.0.1] in [default]
    commons-logging#commons-logging;1.0.4 by [commons-logging#commons-logging;1.1.1] in [default]
    commons-codec#commons-codec;1.2 by [commons-codec#commons-codec;1.4] in [default]
    commons-codec#commons-codec;1.4 by [commons-codec#commons-codec;1.9] in [default]
    com.google.guava#guava;14.0-rc2 by [com.google.guava#guava;14.0.1] in [default]
    org.slf4j#slf4j-api;1.7.2 by [org.slf4j#slf4j-api;1.7.12] in [default]
    commons-lang#commons-lang;2.6 by [commons-lang#commons-lang;2.5] in [default]
    joda-time#joda-time;1.6 by [joda-time#joda-time;2.1] in [default]
    xml-apis#xml-apis;1.3.03 by [xml-apis#xml-apis;1.4.01] in [default]
    xerces#xercesImpl;2.8.0 by [xerces#xercesImpl;2.11.0] in [default]
    xml-apis#xml-apis;2.0.2 by [xml-apis#xml-apis;1.3.03] in [default]
    joda-time#joda-time;2.1 by [joda-time#joda-time;2.2] in [default]
    org.apache.tika#tika-core;1.5 by [org.apache.tika#tika-core;1.9] in [default]
    org.jsoup#jsoup;1.7.2 by [org.jsoup#jsoup;1.7.3] in [default]
    org.apache.httpcomponents#httpcore;4.2.5 by [org.apache.httpcomponents#httpcore;4.3] in [default]
    commons-codec#commons-codec;1.6 by [commons-codec#commons-codec;1.9] in [default]
    com.google.guava#guava;11.0.2 by [com.google.guava#guava;14.0.1] in [default]
    xerces#xercesImpl;2.10.0 by [xerces#xercesImpl;2.11.0] in [default]
    com.google.guava#guava;18.0 by [com.google.guava#guava;14.0.1] in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |  141  |   0   |   0   |   19  ||  122  |   0   |
    ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
    confs: [default]
    0 artifacts copied, 122 already retrieved (0kB/44ms)
Main class:
org.apache.spark.repl.Main
Arguments:

System properties:
SPARK_SUBMIT -> true
spark.app.name -> Spark shell
spark.jars -> file:/home/nruest/.ivy2/jars/io.archivesunleashed_aut-0.10.1-SNAPSHOT.jar,file:/home/nruest/.ivy2/jars/org.scala-lang_scala-parser-combinators-2.11.0-M4.jar,file:/home/nruest/.ivy2/jars/com.chuusai_shapeless_2.11-2.2.5.jar,file:/home/nruest/.ivy2/jars/com.google.guava_guava-14.0.1.jar,file:/home/nruest/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar,file:/home/nruest/.ivy2/jars/org.jsoup_jsoup-1.7.3.jar,file:/home/nruest/.ivy2/jars/org.netpreserve.commons_webarchive-commons-1.1.4.jar,file:/home/nruest/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar,file:/home/nruest/.ivy2/jars/org.apache.tika_tika-core-1.9.jar,file:/home/nruest/.ivy2/jars/org.apache.tika_tika-parsers-1.9.jar,file:/home/nruest/.ivy2/jars/com.syncthemall_boilerpipe-1.2.2.jar,file:/home/nruest/.ivy2/jars/xerces_xercesImpl-2.11.0.jar,file:/home/nruest/.ivy2/jars/tl.lin_lintools-datatypes-1.0.0.jar,file:/home/nruest/.ivy2/jars/org.json_json-20131018.jar,file:/home/nruest/.ivy2/jars/org.htmlparser_htmlparser-1.6.jar,file:/home/nruest/.ivy2/jars/com.googlecode.juniversalchardet_juniversalchardet-1.0.3.jar,file:/home/nruest/.ivy2/jars/commons-httpclient_commons-httpclient-3.1.jar,file:/home/nruest/.ivy2/jars/org.apache.hadoop_hadoop-core-0.20.2-cdh3u4.jar,file:/home/nruest/.ivy2/jars/commons-lang_commons-lang-2.5.jar,file:/home/nruest/.ivy2/jars/commons-io_commons-io-2.4.jar,file:/home/nruest/.ivy2/jars/org.gnu.inet_libidn-1.15.jar,file:/home/nruest/.ivy2/jars/it.unimi.dsi_dsiutils-2.0.12.jar,file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpcore-4.3.jar,file:/home/nruest/.ivy2/jars/com.cloudera.cdh_hadoop-ant-0.20.2-cdh3u4.jar,file:/home/nruest/.ivy2/jars/commons-cli_commons-cli-1.2.jar,file:/home/nruest/.ivy2/jars/xmlenc_xmlenc-0.52.jar,file:/home/nruest/.ivy2/jars/org.apache.hadoop.thirdparty.guava_guava-r09-jarjar.jar,file:/home/nruest/.ivy2/jars/commons-net_commons-net-1.4.1.jar,file:/home/nruest/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.5.2.jar,file:/home/nrue
st/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.5.2.jar,file:/home/nruest/.ivy2/jars/commons-el_commons-el-1.0.jar,file:/home/nruest/.ivy2/jars/net.java.dev.jets3t_jets3t-0.6.1.jar,file:/home/nruest/.ivy2/jars/oro_oro-2.0.8.jar,file:/home/nruest/.ivy2/jars/org.eclipse.jdt_core-3.1.1.jar,file:/home/nruest/.ivy2/jars/it.unimi.dsi_fastutil-6.5.2.jar,file:/home/nruest/.ivy2/jars/com.martiansoftware_jsap-2.1.jar,file:/home/nruest/.ivy2/jars/log4j_log4j-1.2.17.jar,file:/home/nruest/.ivy2/jars/commons-configuration_commons-configuration-1.8.jar,file:/home/nruest/.ivy2/jars/commons-collections_commons-collections-3.2.1.jar,file:/home/nruest/.ivy2/jars/org.apache.commons_commons-math3-3.1.1.jar,file:/home/nruest/.ivy2/jars/commons-logging_commons-logging-1.1.1.jar,file:/home/nruest/.ivy2/jars/com.io7m.xom_xom-1.2.10.jar,file:/home/nruest/.ivy2/jars/de.jollyday_jollyday-0.4.7.jar,file:/home/nruest/.ivy2/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar,file:/home/nruest/.ivy2/jars/javax.json_javax.json-api-1.0.jar,file:/home/nruest/.ivy2/jars/xalan_xalan-2.7.0.jar,file:/home/nruest/.ivy2/jars/javax.xml.bind_jaxb-api-2.2.7.jar,file:/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-tika-0.6.jar,file:/home/nruest/.ivy2/jars/net.sourceforge.jmatio_jmatio-1.0.jar,file:/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-core-0.7.2.jar,file:/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-dom-0.7.2.jar,file:/home/nruest/.ivy2/jars/org.apache.commons_commons-compress-1.9.jar,file:/home/nruest/.ivy2/jars/org.tukaani_xz-1.5.jar,file:/home/nruest/.ivy2/jars/commons-codec_commons-codec-1.9.jar,file:/home/nruest/.ivy2/jars/org.apache.pdfbox_pdfbox-1.8.9.jar,file:/home/nruest/.ivy2/jars/org.bouncycastle_bcmail-jdk15on-1.52.jar,file:/home/nruest/.ivy2/jars/org.bouncycastle_bcprov-jdk15on-1.52.jar,file:/home/nruest/.ivy2/jars/org.apache.poi_poi-3.12.jar,file:/home/nruest/.ivy2/jars/org.apache.poi_poi-scratchpad-3.12.jar,file:/home/nruest/.ivy2/jars/org.apache.p
oi_poi-ooxml-3.12.jar,file:/home/nruest/.ivy2/jars/org.ccil.cowan.tagsoup_tagsoup-1.2.1.jar,file:/home/nruest/.ivy2/jars/org.ow2.asm_asm-debug-all-4.1.jar,file:/home/nruest/.ivy2/jars/com.googlecode.mp4parser_isoparser-1.0.2.jar,file:/home/nruest/.ivy2/jars/com.drewnoakes_metadata-extractor-2.8.0.jar,file:/home/nruest/.ivy2/jars/de.l3s.boilerpipe_boilerpipe-1.1.0.jar,file:/home/nruest/.ivy2/jars/rome_rome-1.0.jar,file:/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-core-0.6.jar,file:/home/nruest/.ivy2/jars/org.codelibs_jhighlight-1.0.2.jar,file:/home/nruest/.ivy2/jars/com.pff_java-libpst-0.8.1.jar,file:/home/nruest/.ivy2/jars/com.github.junrar_junrar-0.7.jar,file:/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-tools-1.5.3.jar,file:/home/nruest/.ivy2/jars/org.apache.commons_commons-exec-1.3.jar,file:/home/nruest/.ivy2/jars/com.googlecode.json-simple_json-simple-1.1.1.jar,file:/home/nruest/.ivy2/jars/edu.ucar_netcdf4-4.5.5.jar,file:/home/nruest/.ivy2/jars/edu.ucar_grib-4.5.5.jar,file:/home/nruest/.ivy2/jars/edu.ucar_cdm-4.5.5.jar,file:/home/nruest/.ivy2/jars/edu.ucar_httpservices-4.5.5.jar,file:/home/nruest/.ivy2/jars/org.apache.commons_commons-csv-1.0.jar,file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-utility-0.5.jar,file:/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-netcdf-0.5.jar,file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-metadata-0.5.jar,file:/home/nruest/.ivy2/jars/org.opengis_geoapi-3.0.0.jar,file:/home/nruest/.ivy2/jars/org.apache.pdfbox_fontbox-1.8.9.jar,file:/home/nruest/.ivy2/jars/org.apache.pdfbox_jempbox-1.8.9.jar,file:/home/nruest/.ivy2/jars/org.bouncycastle_bcpkix-jdk15on-1.52.jar,file:/home/nruest/.ivy2/jars/org.apache.poi_poi-ooxml-schemas-3.12.jar,file:/home/nruest/.ivy2/jars/org.apache.xmlbeans_xmlbeans-2.6.0.jar,file:/home/nruest/.ivy2/jars/org.aspectj_aspectjrt-1.8.0.jar,file:/home/nruest/.ivy2/jars/com.adobe.xmp_xmpcore-5.1.2.jar,file:/home/nruest/.ivy2/jars/jdom_jdom-1.0.jar,file:/home/nruest/.ivy2/jars/commons-logging_c
ommons-logging-api-1.1.jar,file:/home/nruest/.ivy2/jars/org.apache.commons_commons-vfs2-2.0.jar,file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-api-1.4.jar,file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svnexe-1.4.jar,file:/home/nruest/.ivy2/jars/org.codehaus.plexus_plexus-utils-1.5.6.jar,file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svn-commons-1.4.jar,file:/home/nruest/.ivy2/jars/regexp_regexp-1.3.jar,file:/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-maxent-3.0.3.jar,file:/home/nruest/.ivy2/jars/net.sf.jwordnet_jwnl-1.3.3.jar,file:/home/nruest/.ivy2/jars/junit_junit-4.11.jar,file:/home/nruest/.ivy2/jars/org.hamcrest_hamcrest-core-1.3.jar,file:/home/nruest/.ivy2/jars/net.jcip_jcip-annotations-1.0.jar,file:/home/nruest/.ivy2/jars/net.java.dev.jna_jna-4.1.0.jar,file:/home/nruest/.ivy2/jars/org.slf4j_slf4j-api-1.7.12.jar,file:/home/nruest/.ivy2/jars/com.google.protobuf_protobuf-java-2.5.0.jar,file:/home/nruest/.ivy2/jars/org.jdom_jdom2-2.0.4.jar,file:/home/nruest/.ivy2/jars/edu.ucar_jj2000-5.2.jar,file:/home/nruest/.ivy2/jars/org.itadaki_bzip2-0.9.1.jar,file:/home/nruest/.ivy2/jars/edu.ucar_udunits-4.5.5.jar,file:/home/nruest/.ivy2/jars/joda-time_joda-time-2.2.jar,file:/home/nruest/.ivy2/jars/org.quartz-scheduler_quartz-2.2.0.jar,file:/home/nruest/.ivy2/jars/net.sf.ehcache_ehcache-core-2.6.2.jar,file:/home/nruest/.ivy2/jars/com.beust_jcommander-1.35.jar,file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpclient-4.2.6.jar,file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpmime-4.2.6.jar,file:/home/nruest/.ivy2/jars/c3p0_c3p0-0.9.1.1.jar,file:/home/nruest/.ivy2/jars/javax.measure_jsr-275-0.9.3.jar,file:/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-storage-0.5.jar,file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-referencing-0.5.jar,file:/home/nruest/.ivy2/jars/net.sourceforge.nekohtml_nekohtml-1.9.20.jar,file:/home/nruest/.ivy2/jars/xml-apis_xml-apis-1.4.01.jar,file:/home/nruest/.ivy2/
jars/com.google.code.gson_gson-2.3.1.jar
spark.submit.deployMode -> client
spark.master -> local[*]
Classpath elements:
/home/nruest/.ivy2/jars/io.archivesunleashed_aut-0.10.1-SNAPSHOT.jar
/home/nruest/.ivy2/jars/org.scala-lang_scala-parser-combinators-2.11.0-M4.jar
/home/nruest/.ivy2/jars/com.chuusai_shapeless_2.11-2.2.5.jar
/home/nruest/.ivy2/jars/com.google.guava_guava-14.0.1.jar
/home/nruest/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar
/home/nruest/.ivy2/jars/org.jsoup_jsoup-1.7.3.jar
/home/nruest/.ivy2/jars/org.netpreserve.commons_webarchive-commons-1.1.4.jar
/home/nruest/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar
/home/nruest/.ivy2/jars/org.apache.tika_tika-core-1.9.jar
/home/nruest/.ivy2/jars/org.apache.tika_tika-parsers-1.9.jar
/home/nruest/.ivy2/jars/com.syncthemall_boilerpipe-1.2.2.jar
/home/nruest/.ivy2/jars/xerces_xercesImpl-2.11.0.jar
/home/nruest/.ivy2/jars/tl.lin_lintools-datatypes-1.0.0.jar
/home/nruest/.ivy2/jars/org.json_json-20131018.jar
/home/nruest/.ivy2/jars/org.htmlparser_htmlparser-1.6.jar
/home/nruest/.ivy2/jars/com.googlecode.juniversalchardet_juniversalchardet-1.0.3.jar
/home/nruest/.ivy2/jars/commons-httpclient_commons-httpclient-3.1.jar
/home/nruest/.ivy2/jars/org.apache.hadoop_hadoop-core-0.20.2-cdh3u4.jar
/home/nruest/.ivy2/jars/commons-lang_commons-lang-2.5.jar
/home/nruest/.ivy2/jars/commons-io_commons-io-2.4.jar
/home/nruest/.ivy2/jars/org.gnu.inet_libidn-1.15.jar
/home/nruest/.ivy2/jars/it.unimi.dsi_dsiutils-2.0.12.jar
/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpcore-4.3.jar
/home/nruest/.ivy2/jars/com.cloudera.cdh_hadoop-ant-0.20.2-cdh3u4.jar
/home/nruest/.ivy2/jars/commons-cli_commons-cli-1.2.jar
/home/nruest/.ivy2/jars/xmlenc_xmlenc-0.52.jar
/home/nruest/.ivy2/jars/org.apache.hadoop.thirdparty.guava_guava-r09-jarjar.jar
/home/nruest/.ivy2/jars/commons-net_commons-net-1.4.1.jar
/home/nruest/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.5.2.jar
/home/nruest/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.5.2.jar
/home/nruest/.ivy2/jars/commons-el_commons-el-1.0.jar
/home/nruest/.ivy2/jars/net.java.dev.jets3t_jets3t-0.6.1.jar
/home/nruest/.ivy2/jars/oro_oro-2.0.8.jar
/home/nruest/.ivy2/jars/org.eclipse.jdt_core-3.1.1.jar
/home/nruest/.ivy2/jars/it.unimi.dsi_fastutil-6.5.2.jar
/home/nruest/.ivy2/jars/com.martiansoftware_jsap-2.1.jar
/home/nruest/.ivy2/jars/log4j_log4j-1.2.17.jar
/home/nruest/.ivy2/jars/commons-configuration_commons-configuration-1.8.jar
/home/nruest/.ivy2/jars/commons-collections_commons-collections-3.2.1.jar
/home/nruest/.ivy2/jars/org.apache.commons_commons-math3-3.1.1.jar
/home/nruest/.ivy2/jars/commons-logging_commons-logging-1.1.1.jar
/home/nruest/.ivy2/jars/com.io7m.xom_xom-1.2.10.jar
/home/nruest/.ivy2/jars/de.jollyday_jollyday-0.4.7.jar
/home/nruest/.ivy2/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar
/home/nruest/.ivy2/jars/javax.json_javax.json-api-1.0.jar
/home/nruest/.ivy2/jars/xalan_xalan-2.7.0.jar
/home/nruest/.ivy2/jars/javax.xml.bind_jaxb-api-2.2.7.jar
/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-tika-0.6.jar
/home/nruest/.ivy2/jars/net.sourceforge.jmatio_jmatio-1.0.jar
/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-core-0.7.2.jar
/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-dom-0.7.2.jar
/home/nruest/.ivy2/jars/org.apache.commons_commons-compress-1.9.jar
/home/nruest/.ivy2/jars/org.tukaani_xz-1.5.jar
/home/nruest/.ivy2/jars/commons-codec_commons-codec-1.9.jar
/home/nruest/.ivy2/jars/org.apache.pdfbox_pdfbox-1.8.9.jar
/home/nruest/.ivy2/jars/org.bouncycastle_bcmail-jdk15on-1.52.jar
/home/nruest/.ivy2/jars/org.bouncycastle_bcprov-jdk15on-1.52.jar
/home/nruest/.ivy2/jars/org.apache.poi_poi-3.12.jar
/home/nruest/.ivy2/jars/org.apache.poi_poi-scratchpad-3.12.jar
/home/nruest/.ivy2/jars/org.apache.poi_poi-ooxml-3.12.jar
/home/nruest/.ivy2/jars/org.ccil.cowan.tagsoup_tagsoup-1.2.1.jar
/home/nruest/.ivy2/jars/org.ow2.asm_asm-debug-all-4.1.jar
/home/nruest/.ivy2/jars/com.googlecode.mp4parser_isoparser-1.0.2.jar
/home/nruest/.ivy2/jars/com.drewnoakes_metadata-extractor-2.8.0.jar
/home/nruest/.ivy2/jars/de.l3s.boilerpipe_boilerpipe-1.1.0.jar
/home/nruest/.ivy2/jars/rome_rome-1.0.jar
/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-core-0.6.jar
/home/nruest/.ivy2/jars/org.codelibs_jhighlight-1.0.2.jar
/home/nruest/.ivy2/jars/com.pff_java-libpst-0.8.1.jar
/home/nruest/.ivy2/jars/com.github.junrar_junrar-0.7.jar
/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-tools-1.5.3.jar
/home/nruest/.ivy2/jars/org.apache.commons_commons-exec-1.3.jar
/home/nruest/.ivy2/jars/com.googlecode.json-simple_json-simple-1.1.1.jar
/home/nruest/.ivy2/jars/edu.ucar_netcdf4-4.5.5.jar
/home/nruest/.ivy2/jars/edu.ucar_grib-4.5.5.jar
/home/nruest/.ivy2/jars/edu.ucar_cdm-4.5.5.jar
/home/nruest/.ivy2/jars/edu.ucar_httpservices-4.5.5.jar
/home/nruest/.ivy2/jars/org.apache.commons_commons-csv-1.0.jar
/home/nruest/.ivy2/jars/org.apache.sis.core_sis-utility-0.5.jar
/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-netcdf-0.5.jar
/home/nruest/.ivy2/jars/org.apache.sis.core_sis-metadata-0.5.jar
/home/nruest/.ivy2/jars/org.opengis_geoapi-3.0.0.jar
/home/nruest/.ivy2/jars/org.apache.pdfbox_fontbox-1.8.9.jar
/home/nruest/.ivy2/jars/org.apache.pdfbox_jempbox-1.8.9.jar
/home/nruest/.ivy2/jars/org.bouncycastle_bcpkix-jdk15on-1.52.jar
/home/nruest/.ivy2/jars/org.apache.poi_poi-ooxml-schemas-3.12.jar
/home/nruest/.ivy2/jars/org.apache.xmlbeans_xmlbeans-2.6.0.jar
/home/nruest/.ivy2/jars/org.aspectj_aspectjrt-1.8.0.jar
/home/nruest/.ivy2/jars/com.adobe.xmp_xmpcore-5.1.2.jar
/home/nruest/.ivy2/jars/jdom_jdom-1.0.jar
/home/nruest/.ivy2/jars/commons-logging_commons-logging-api-1.1.jar
/home/nruest/.ivy2/jars/org.apache.commons_commons-vfs2-2.0.jar
/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-api-1.4.jar
/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svnexe-1.4.jar
/home/nruest/.ivy2/jars/org.codehaus.plexus_plexus-utils-1.5.6.jar
/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svn-commons-1.4.jar
/home/nruest/.ivy2/jars/regexp_regexp-1.3.jar
/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-maxent-3.0.3.jar
/home/nruest/.ivy2/jars/net.sf.jwordnet_jwnl-1.3.3.jar
/home/nruest/.ivy2/jars/junit_junit-4.11.jar
/home/nruest/.ivy2/jars/org.hamcrest_hamcrest-core-1.3.jar
/home/nruest/.ivy2/jars/net.jcip_jcip-annotations-1.0.jar
/home/nruest/.ivy2/jars/net.java.dev.jna_jna-4.1.0.jar
/home/nruest/.ivy2/jars/org.slf4j_slf4j-api-1.7.12.jar
/home/nruest/.ivy2/jars/com.google.protobuf_protobuf-java-2.5.0.jar
/home/nruest/.ivy2/jars/org.jdom_jdom2-2.0.4.jar
/home/nruest/.ivy2/jars/edu.ucar_jj2000-5.2.jar
/home/nruest/.ivy2/jars/org.itadaki_bzip2-0.9.1.jar
/home/nruest/.ivy2/jars/edu.ucar_udunits-4.5.5.jar
/home/nruest/.ivy2/jars/joda-time_joda-time-2.2.jar
/home/nruest/.ivy2/jars/org.quartz-scheduler_quartz-2.2.0.jar
/home/nruest/.ivy2/jars/net.sf.ehcache_ehcache-core-2.6.2.jar
/home/nruest/.ivy2/jars/com.beust_jcommander-1.35.jar
/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpclient-4.2.6.jar
/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpmime-4.2.6.jar
/home/nruest/.ivy2/jars/c3p0_c3p0-0.9.1.1.jar
/home/nruest/.ivy2/jars/javax.measure_jsr-275-0.9.3.jar
/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-storage-0.5.jar
/home/nruest/.ivy2/jars/org.apache.sis.core_sis-referencing-0.5.jar
/home/nruest/.ivy2/jars/net.sourceforge.nekohtml_nekohtml-1.9.20.jar
/home/nruest/.ivy2/jars/xml-apis_xml-apis-1.4.01.jar
/home/nruest/.ivy2/jars/com.google.code.gson_gson-2.3.1.jar

Warning: Local jar /home/nruest/.ivy2/jars/com.cloudera.cdh_hadoop-ant-0.20.2-cdh3u4.jar does not exist, skipping.
2017-11-14 22:30:10,447 [main] WARN  NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-11-14 22:30:10,556 [main] INFO  SecurityManager - Changing view acls to: nruest
2017-11-14 22:30:10,556 [main] INFO  SecurityManager - Changing modify acls to: nruest
2017-11-14 22:30:10,557 [main] INFO  SecurityManager - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nruest); users with modify permissions: Set(nruest)
2017-11-14 22:30:10,666 [main] INFO  HttpServer - Starting HTTP Server
2017-11-14 22:30:10,714 [main] INFO  Server - jetty-8.y.z-SNAPSHOT
2017-11-14 22:30:10,728 [main] INFO  AbstractConnector - Started SocketConnector@0.0.0.0:38934
2017-11-14 22:30:10,729 [main] INFO  Utils - Successfully started service 'HTTP class server' on port 38934.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.
2017-11-14 22:30:13,080 [main] WARN  Utils - Your hostname, roo resolves to a loopback address: 127.0.1.1; using 10.0.1.166 instead (on interface wlp58s0)
2017-11-14 22:30:13,080 [main] WARN  Utils - Set SPARK_LOCAL_IP if you need to bind to another address
2017-11-14 22:30:13,089 [main] INFO  SparkContext - Running Spark version 1.6.0
2017-11-14 22:30:13,120 [main] INFO  SecurityManager - Changing view acls to: nruest
2017-11-14 22:30:13,121 [main] INFO  SecurityManager - Changing modify acls to: nruest
2017-11-14 22:30:13,121 [main] INFO  SecurityManager - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nruest); users with modify permissions: Set(nruest)
2017-11-14 22:30:13,248 [main] INFO  Utils - Successfully started service 'sparkDriver' on port 45379.
2017-11-14 22:30:13,451 [sparkDriverActorSystem-akka.actor.default-dispatcher-5] INFO  Slf4jLogger - Slf4jLogger started
2017-11-14 22:30:13,496 [sparkDriverActorSystem-akka.actor.default-dispatcher-5] INFO  Remoting - Starting remoting
2017-11-14 22:30:13,594 [main] INFO  Utils - Successfully started service 'sparkDriverActorSystem' on port 35571.
2017-11-14 22:30:13,595 [sparkDriverActorSystem-akka.actor.default-dispatcher-5] INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.1.166:35571]
2017-11-14 22:30:13,600 [main] INFO  SparkEnv - Registering MapOutputTracker
2017-11-14 22:30:13,613 [main] INFO  SparkEnv - Registering BlockManagerMaster
2017-11-14 22:30:13,621 [main] INFO  DiskBlockManager - Created local directory at /tmp/blockmgr-755dae94-147d-4c74-9b7c-974d54d3a576
2017-11-14 22:30:13,632 [main] INFO  MemoryStore - MemoryStore started with capacity 511.1 MB
2017-11-14 22:30:13,694 [main] INFO  SparkEnv - Registering OutputCommitCoordinator
2017-11-14 22:30:13,787 [main] INFO  Server - jetty-8.y.z-SNAPSHOT
2017-11-14 22:30:13,794 [main] INFO  AbstractConnector - Started SelectChannelConnector@0.0.0.0:4040
2017-11-14 22:30:13,794 [main] INFO  Utils - Successfully started service 'SparkUI' on port 4040.
2017-11-14 22:30:13,795 [main] INFO  SparkUI - Started SparkUI at http://10.0.1.166:4040
2017-11-14 22:30:13,816 [main] INFO  HttpFileServer - HTTP File server directory is /tmp/spark-1922bc25-cc69-4b01-98b9-ef26c1cdf6fb/httpd-e1767854-a335-4a1d-b472-e802298b4ff7
2017-11-14 22:30:13,816 [main] INFO  HttpServer - Starting HTTP Server
2017-11-14 22:30:13,817 [main] INFO  Server - jetty-8.y.z-SNAPSHOT
2017-11-14 22:30:13,827 [main] INFO  AbstractConnector - Started SocketConnector@0.0.0.0:35327
2017-11-14 22:30:13,828 [main] INFO  Utils - Successfully started service 'HTTP file server' on port 35327.
2017-11-14 22:30:13,843 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/io.archivesunleashed_aut-0.10.1-SNAPSHOT.jar at http://10.0.1.166:35327/jars/io.archivesunleashed_aut-0.10.1-SNAPSHOT.jar with timestamp 1510716613843
2017-11-14 22:30:13,844 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.scala-lang_scala-parser-combinators-2.11.0-M4.jar at http://10.0.1.166:35327/jars/org.scala-lang_scala-parser-combinators-2.11.0-M4.jar with timestamp 1510716613844
2017-11-14 22:30:13,847 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.chuusai_shapeless_2.11-2.2.5.jar at http://10.0.1.166:35327/jars/com.chuusai_shapeless_2.11-2.2.5.jar with timestamp 1510716613847
2017-11-14 22:30:13,850 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.google.guava_guava-14.0.1.jar at http://10.0.1.166:35327/jars/com.google.guava_guava-14.0.1.jar with timestamp 1510716613850
2017-11-14 22:30:13,855 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar at http://10.0.1.166:35327/jars/org.xerial.snappy_snappy-java-1.0.5.jar with timestamp 1510716613855
2017-11-14 22:30:13,857 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.jsoup_jsoup-1.7.3.jar at http://10.0.1.166:35327/jars/org.jsoup_jsoup-1.7.3.jar with timestamp 1510716613857
2017-11-14 22:30:13,861 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.netpreserve.commons_webarchive-commons-1.1.4.jar at http://10.0.1.166:35327/jars/org.netpreserve.commons_webarchive-commons-1.1.4.jar with timestamp 1510716613861
2017-11-14 22:30:13,870 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar at http://10.0.1.166:35327/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar with timestamp 1510716613870
2017-11-14 22:30:13,871 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.tika_tika-core-1.9.jar at http://10.0.1.166:35327/jars/org.apache.tika_tika-core-1.9.jar with timestamp 1510716613871
2017-11-14 22:30:13,872 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.tika_tika-parsers-1.9.jar at http://10.0.1.166:35327/jars/org.apache.tika_tika-parsers-1.9.jar with timestamp 1510716613872
2017-11-14 22:30:13,873 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.syncthemall_boilerpipe-1.2.2.jar at http://10.0.1.166:35327/jars/com.syncthemall_boilerpipe-1.2.2.jar with timestamp 1510716613873
2017-11-14 22:30:13,875 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/xerces_xercesImpl-2.11.0.jar at http://10.0.1.166:35327/jars/xerces_xercesImpl-2.11.0.jar with timestamp 1510716613875
2017-11-14 22:30:13,875 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/tl.lin_lintools-datatypes-1.0.0.jar at http://10.0.1.166:35327/jars/tl.lin_lintools-datatypes-1.0.0.jar with timestamp 1510716613875
2017-11-14 22:30:13,876 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.json_json-20131018.jar at http://10.0.1.166:35327/jars/org.json_json-20131018.jar with timestamp 1510716613876
2017-11-14 22:30:13,876 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.htmlparser_htmlparser-1.6.jar at http://10.0.1.166:35327/jars/org.htmlparser_htmlparser-1.6.jar with timestamp 1510716613876
2017-11-14 22:30:13,877 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.googlecode.juniversalchardet_juniversalchardet-1.0.3.jar at http://10.0.1.166:35327/jars/com.googlecode.juniversalchardet_juniversalchardet-1.0.3.jar with timestamp 1510716613877
2017-11-14 22:30:13,877 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-httpclient_commons-httpclient-3.1.jar at http://10.0.1.166:35327/jars/commons-httpclient_commons-httpclient-3.1.jar with timestamp 1510716613877
2017-11-14 22:30:13,882 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.hadoop_hadoop-core-0.20.2-cdh3u4.jar at http://10.0.1.166:35327/jars/org.apache.hadoop_hadoop-core-0.20.2-cdh3u4.jar with timestamp 1510716613882
2017-11-14 22:30:13,882 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-lang_commons-lang-2.5.jar at http://10.0.1.166:35327/jars/commons-lang_commons-lang-2.5.jar with timestamp 1510716613882
2017-11-14 22:30:13,883 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-io_commons-io-2.4.jar at http://10.0.1.166:35327/jars/commons-io_commons-io-2.4.jar with timestamp 1510716613883
2017-11-14 22:30:13,883 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.gnu.inet_libidn-1.15.jar at http://10.0.1.166:35327/jars/org.gnu.inet_libidn-1.15.jar with timestamp 1510716613883
2017-11-14 22:30:13,884 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/it.unimi.dsi_dsiutils-2.0.12.jar at http://10.0.1.166:35327/jars/it.unimi.dsi_dsiutils-2.0.12.jar with timestamp 1510716613884
2017-11-14 22:30:13,884 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpcore-4.3.jar at http://10.0.1.166:35327/jars/org.apache.httpcomponents_httpcore-4.3.jar with timestamp 1510716613884
2017-11-14 22:30:13,885 [main] ERROR SparkContext - Jar not found at file:/home/nruest/.ivy2/jars/com.cloudera.cdh_hadoop-ant-0.20.2-cdh3u4.jar
2017-11-14 22:30:13,887 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-cli_commons-cli-1.2.jar at http://10.0.1.166:35327/jars/commons-cli_commons-cli-1.2.jar with timestamp 1510716613887
2017-11-14 22:30:13,887 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/xmlenc_xmlenc-0.52.jar at http://10.0.1.166:35327/jars/xmlenc_xmlenc-0.52.jar with timestamp 1510716613887
2017-11-14 22:30:13,889 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.hadoop.thirdparty.guava_guava-r09-jarjar.jar at http://10.0.1.166:35327/jars/org.apache.hadoop.thirdparty.guava_guava-r09-jarjar.jar with timestamp 1510716613889
2017-11-14 22:30:13,889 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-net_commons-net-1.4.1.jar at http://10.0.1.166:35327/jars/commons-net_commons-net-1.4.1.jar with timestamp 1510716613889
2017-11-14 22:30:13,890 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.5.2.jar at http://10.0.1.166:35327/jars/org.codehaus.jackson_jackson-core-asl-1.5.2.jar with timestamp 1510716613890
2017-11-14 22:30:13,890 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.5.2.jar at http://10.0.1.166:35327/jars/org.codehaus.jackson_jackson-mapper-asl-1.5.2.jar with timestamp 1510716613890
2017-11-14 22:30:13,891 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-el_commons-el-1.0.jar at http://10.0.1.166:35327/jars/commons-el_commons-el-1.0.jar with timestamp 1510716613891
2017-11-14 22:30:13,893 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.java.dev.jets3t_jets3t-0.6.1.jar at http://10.0.1.166:35327/jars/net.java.dev.jets3t_jets3t-0.6.1.jar with timestamp 1510716613893
2017-11-14 22:30:13,894 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/oro_oro-2.0.8.jar at http://10.0.1.166:35327/jars/oro_oro-2.0.8.jar with timestamp 1510716613894
2017-11-14 22:30:13,905 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.eclipse.jdt_core-3.1.1.jar at http://10.0.1.166:35327/jars/org.eclipse.jdt_core-3.1.1.jar with timestamp 1510716613905
2017-11-14 22:30:13,923 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/it.unimi.dsi_fastutil-6.5.2.jar at http://10.0.1.166:35327/jars/it.unimi.dsi_fastutil-6.5.2.jar with timestamp 1510716613923
2017-11-14 22:30:13,923 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.martiansoftware_jsap-2.1.jar at http://10.0.1.166:35327/jars/com.martiansoftware_jsap-2.1.jar with timestamp 1510716613923
2017-11-14 22:30:13,924 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/log4j_log4j-1.2.17.jar at http://10.0.1.166:35327/jars/log4j_log4j-1.2.17.jar with timestamp 1510716613924
2017-11-14 22:30:13,925 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-configuration_commons-configuration-1.8.jar at http://10.0.1.166:35327/jars/commons-configuration_commons-configuration-1.8.jar with timestamp 1510716613924
2017-11-14 22:30:13,925 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-collections_commons-collections-3.2.1.jar at http://10.0.1.166:35327/jars/commons-collections_commons-collections-3.2.1.jar with timestamp 1510716613925
2017-11-14 22:30:13,930 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.commons_commons-math3-3.1.1.jar at http://10.0.1.166:35327/jars/org.apache.commons_commons-math3-3.1.1.jar with timestamp 1510716613930
2017-11-14 22:30:13,931 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-logging_commons-logging-1.1.1.jar at http://10.0.1.166:35327/jars/commons-logging_commons-logging-1.1.1.jar with timestamp 1510716613931
2017-11-14 22:30:13,933 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.io7m.xom_xom-1.2.10.jar at http://10.0.1.166:35327/jars/com.io7m.xom_xom-1.2.10.jar with timestamp 1510716613933
2017-11-14 22:30:13,934 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/de.jollyday_jollyday-0.4.7.jar at http://10.0.1.166:35327/jars/de.jollyday_jollyday-0.4.7.jar with timestamp 1510716613934
2017-11-14 22:30:13,936 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar at http://10.0.1.166:35327/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar with timestamp 1510716613936
2017-11-14 22:30:13,937 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/javax.json_javax.json-api-1.0.jar at http://10.0.1.166:35327/jars/javax.json_javax.json-api-1.0.jar with timestamp 1510716613937
2017-11-14 22:30:13,942 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/xalan_xalan-2.7.0.jar at http://10.0.1.166:35327/jars/xalan_xalan-2.7.0.jar with timestamp 1510716613942
2017-11-14 22:30:13,943 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/javax.xml.bind_jaxb-api-2.2.7.jar at http://10.0.1.166:35327/jars/javax.xml.bind_jaxb-api-2.2.7.jar with timestamp 1510716613943
2017-11-14 22:30:13,943 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-tika-0.6.jar at http://10.0.1.166:35327/jars/org.gagravarr_vorbis-java-tika-0.6.jar with timestamp 1510716613943
2017-11-14 22:30:13,943 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.sourceforge.jmatio_jmatio-1.0.jar at http://10.0.1.166:35327/jars/net.sourceforge.jmatio_jmatio-1.0.jar with timestamp 1510716613943
2017-11-14 22:30:13,943 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-core-0.7.2.jar at http://10.0.1.166:35327/jars/org.apache.james_apache-mime4j-core-0.7.2.jar with timestamp 1510716613943
2017-11-14 22:30:13,943 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.james_apache-mime4j-dom-0.7.2.jar at http://10.0.1.166:35327/jars/org.apache.james_apache-mime4j-dom-0.7.2.jar with timestamp 1510716613943
2017-11-14 22:30:13,944 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.commons_commons-compress-1.9.jar at http://10.0.1.166:35327/jars/org.apache.commons_commons-compress-1.9.jar with timestamp 1510716613944
2017-11-14 22:30:13,944 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.tukaani_xz-1.5.jar at http://10.0.1.166:35327/jars/org.tukaani_xz-1.5.jar with timestamp 1510716613944
2017-11-14 22:30:13,944 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-codec_commons-codec-1.9.jar at http://10.0.1.166:35327/jars/commons-codec_commons-codec-1.9.jar with timestamp 1510716613944
2017-11-14 22:30:13,947 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.pdfbox_pdfbox-1.8.9.jar at http://10.0.1.166:35327/jars/org.apache.pdfbox_pdfbox-1.8.9.jar with timestamp 1510716613947
2017-11-14 22:30:13,947 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.bouncycastle_bcmail-jdk15on-1.52.jar at http://10.0.1.166:35327/jars/org.bouncycastle_bcmail-jdk15on-1.52.jar with timestamp 1510716613947
2017-11-14 22:30:13,950 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.bouncycastle_bcprov-jdk15on-1.52.jar at http://10.0.1.166:35327/jars/org.bouncycastle_bcprov-jdk15on-1.52.jar with timestamp 1510716613950
2017-11-14 22:30:13,952 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.poi_poi-3.12.jar at http://10.0.1.166:35327/jars/org.apache.poi_poi-3.12.jar with timestamp 1510716613952
2017-11-14 22:30:13,953 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.poi_poi-scratchpad-3.12.jar at http://10.0.1.166:35327/jars/org.apache.poi_poi-scratchpad-3.12.jar with timestamp 1510716613953
2017-11-14 22:30:13,955 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.poi_poi-ooxml-3.12.jar at http://10.0.1.166:35327/jars/org.apache.poi_poi-ooxml-3.12.jar with timestamp 1510716613955
2017-11-14 22:30:13,955 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.ccil.cowan.tagsoup_tagsoup-1.2.1.jar at http://10.0.1.166:35327/jars/org.ccil.cowan.tagsoup_tagsoup-1.2.1.jar with timestamp 1510716613955
2017-11-14 22:30:13,956 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.ow2.asm_asm-debug-all-4.1.jar at http://10.0.1.166:35327/jars/org.ow2.asm_asm-debug-all-4.1.jar with timestamp 1510716613956
2017-11-14 22:30:13,956 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.googlecode.mp4parser_isoparser-1.0.2.jar at http://10.0.1.166:35327/jars/com.googlecode.mp4parser_isoparser-1.0.2.jar with timestamp 1510716613956
2017-11-14 22:30:13,957 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.drewnoakes_metadata-extractor-2.8.0.jar at http://10.0.1.166:35327/jars/com.drewnoakes_metadata-extractor-2.8.0.jar with timestamp 1510716613957
2017-11-14 22:30:13,957 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/de.l3s.boilerpipe_boilerpipe-1.1.0.jar at http://10.0.1.166:35327/jars/de.l3s.boilerpipe_boilerpipe-1.1.0.jar with timestamp 1510716613957
2017-11-14 22:30:13,957 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/rome_rome-1.0.jar at http://10.0.1.166:35327/jars/rome_rome-1.0.jar with timestamp 1510716613957
2017-11-14 22:30:13,958 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.gagravarr_vorbis-java-core-0.6.jar at http://10.0.1.166:35327/jars/org.gagravarr_vorbis-java-core-0.6.jar with timestamp 1510716613958
2017-11-14 22:30:13,958 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.codelibs_jhighlight-1.0.2.jar at http://10.0.1.166:35327/jars/org.codelibs_jhighlight-1.0.2.jar with timestamp 1510716613958
2017-11-14 22:30:13,958 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.pff_java-libpst-0.8.1.jar at http://10.0.1.166:35327/jars/com.pff_java-libpst-0.8.1.jar with timestamp 1510716613958
2017-11-14 22:30:13,958 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.github.junrar_junrar-0.7.jar at http://10.0.1.166:35327/jars/com.github.junrar_junrar-0.7.jar with timestamp 1510716613958
2017-11-14 22:30:13,959 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-tools-1.5.3.jar at http://10.0.1.166:35327/jars/org.apache.opennlp_opennlp-tools-1.5.3.jar with timestamp 1510716613959
2017-11-14 22:30:13,960 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.commons_commons-exec-1.3.jar at http://10.0.1.166:35327/jars/org.apache.commons_commons-exec-1.3.jar with timestamp 1510716613960
2017-11-14 22:30:13,960 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.googlecode.json-simple_json-simple-1.1.1.jar at http://10.0.1.166:35327/jars/com.googlecode.json-simple_json-simple-1.1.1.jar with timestamp 1510716613960
2017-11-14 22:30:13,960 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_netcdf4-4.5.5.jar at http://10.0.1.166:35327/jars/edu.ucar_netcdf4-4.5.5.jar with timestamp 1510716613960
2017-11-14 22:30:13,974 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_grib-4.5.5.jar at http://10.0.1.166:35327/jars/edu.ucar_grib-4.5.5.jar with timestamp 1510716613974
2017-11-14 22:30:13,979 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_cdm-4.5.5.jar at http://10.0.1.166:35327/jars/edu.ucar_cdm-4.5.5.jar with timestamp 1510716613979
2017-11-14 22:30:13,979 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_httpservices-4.5.5.jar at http://10.0.1.166:35327/jars/edu.ucar_httpservices-4.5.5.jar with timestamp 1510716613979
2017-11-14 22:30:13,980 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.commons_commons-csv-1.0.jar at http://10.0.1.166:35327/jars/org.apache.commons_commons-csv-1.0.jar with timestamp 1510716613980
2017-11-14 22:30:13,980 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-utility-0.5.jar at http://10.0.1.166:35327/jars/org.apache.sis.core_sis-utility-0.5.jar with timestamp 1510716613980
2017-11-14 22:30:13,980 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-netcdf-0.5.jar at http://10.0.1.166:35327/jars/org.apache.sis.storage_sis-netcdf-0.5.jar with timestamp 1510716613980
2017-11-14 22:30:13,981 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-metadata-0.5.jar at http://10.0.1.166:35327/jars/org.apache.sis.core_sis-metadata-0.5.jar with timestamp 1510716613981
2017-11-14 22:30:13,981 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.opengis_geoapi-3.0.0.jar at http://10.0.1.166:35327/jars/org.opengis_geoapi-3.0.0.jar with timestamp 1510716613981
2017-11-14 22:30:13,981 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.pdfbox_fontbox-1.8.9.jar at http://10.0.1.166:35327/jars/org.apache.pdfbox_fontbox-1.8.9.jar with timestamp 1510716613981
2017-11-14 22:30:13,981 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.pdfbox_jempbox-1.8.9.jar at http://10.0.1.166:35327/jars/org.apache.pdfbox_jempbox-1.8.9.jar with timestamp 1510716613981
2017-11-14 22:30:13,982 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.bouncycastle_bcpkix-jdk15on-1.52.jar at http://10.0.1.166:35327/jars/org.bouncycastle_bcpkix-jdk15on-1.52.jar with timestamp 1510716613982
2017-11-14 22:30:13,986 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.poi_poi-ooxml-schemas-3.12.jar at http://10.0.1.166:35327/jars/org.apache.poi_poi-ooxml-schemas-3.12.jar with timestamp 1510716613986
2017-11-14 22:30:13,988 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.xmlbeans_xmlbeans-2.6.0.jar at http://10.0.1.166:35327/jars/org.apache.xmlbeans_xmlbeans-2.6.0.jar with timestamp 1510716613988
2017-11-14 22:30:13,988 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.aspectj_aspectjrt-1.8.0.jar at http://10.0.1.166:35327/jars/org.aspectj_aspectjrt-1.8.0.jar with timestamp 1510716613988
2017-11-14 22:30:13,989 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.adobe.xmp_xmpcore-5.1.2.jar at http://10.0.1.166:35327/jars/com.adobe.xmp_xmpcore-5.1.2.jar with timestamp 1510716613989
2017-11-14 22:30:13,989 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/jdom_jdom-1.0.jar at http://10.0.1.166:35327/jars/jdom_jdom-1.0.jar with timestamp 1510716613989
2017-11-14 22:30:13,989 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/commons-logging_commons-logging-api-1.1.jar at http://10.0.1.166:35327/jars/commons-logging_commons-logging-api-1.1.jar with timestamp 1510716613989
2017-11-14 22:30:13,990 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.commons_commons-vfs2-2.0.jar at http://10.0.1.166:35327/jars/org.apache.commons_commons-vfs2-2.0.jar with timestamp 1510716613990
2017-11-14 22:30:13,990 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-api-1.4.jar at http://10.0.1.166:35327/jars/org.apache.maven.scm_maven-scm-api-1.4.jar with timestamp 1510716613990
2017-11-14 22:30:13,990 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svnexe-1.4.jar at http://10.0.1.166:35327/jars/org.apache.maven.scm_maven-scm-provider-svnexe-1.4.jar with timestamp 1510716613990
2017-11-14 22:30:13,990 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.codehaus.plexus_plexus-utils-1.5.6.jar at http://10.0.1.166:35327/jars/org.codehaus.plexus_plexus-utils-1.5.6.jar with timestamp 1510716613990
2017-11-14 22:30:13,991 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.maven.scm_maven-scm-provider-svn-commons-1.4.jar at http://10.0.1.166:35327/jars/org.apache.maven.scm_maven-scm-provider-svn-commons-1.4.jar with timestamp 1510716613991
2017-11-14 22:30:13,991 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/regexp_regexp-1.3.jar at http://10.0.1.166:35327/jars/regexp_regexp-1.3.jar with timestamp 1510716613991
2017-11-14 22:30:13,991 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.opennlp_opennlp-maxent-3.0.3.jar at http://10.0.1.166:35327/jars/org.apache.opennlp_opennlp-maxent-3.0.3.jar with timestamp 1510716613991
2017-11-14 22:30:13,991 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.sf.jwordnet_jwnl-1.3.3.jar at http://10.0.1.166:35327/jars/net.sf.jwordnet_jwnl-1.3.3.jar with timestamp 1510716613991
2017-11-14 22:30:13,992 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/junit_junit-4.11.jar at http://10.0.1.166:35327/jars/junit_junit-4.11.jar with timestamp 1510716613992
2017-11-14 22:30:13,992 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.hamcrest_hamcrest-core-1.3.jar at http://10.0.1.166:35327/jars/org.hamcrest_hamcrest-core-1.3.jar with timestamp 1510716613992
2017-11-14 22:30:13,992 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.jcip_jcip-annotations-1.0.jar at http://10.0.1.166:35327/jars/net.jcip_jcip-annotations-1.0.jar with timestamp 1510716613992
2017-11-14 22:30:13,993 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.java.dev.jna_jna-4.1.0.jar at http://10.0.1.166:35327/jars/net.java.dev.jna_jna-4.1.0.jar with timestamp 1510716613993
2017-11-14 22:30:13,993 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.slf4j_slf4j-api-1.7.12.jar at http://10.0.1.166:35327/jars/org.slf4j_slf4j-api-1.7.12.jar with timestamp 1510716613993
2017-11-14 22:30:13,993 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.google.protobuf_protobuf-java-2.5.0.jar at http://10.0.1.166:35327/jars/com.google.protobuf_protobuf-java-2.5.0.jar with timestamp 1510716613993
2017-11-14 22:30:13,993 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.jdom_jdom2-2.0.4.jar at http://10.0.1.166:35327/jars/org.jdom_jdom2-2.0.4.jar with timestamp 1510716613993
2017-11-14 22:30:13,994 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_jj2000-5.2.jar at http://10.0.1.166:35327/jars/edu.ucar_jj2000-5.2.jar with timestamp 1510716613994
2017-11-14 22:30:13,994 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.itadaki_bzip2-0.9.1.jar at http://10.0.1.166:35327/jars/org.itadaki_bzip2-0.9.1.jar with timestamp 1510716613994
2017-11-14 22:30:13,994 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/edu.ucar_udunits-4.5.5.jar at http://10.0.1.166:35327/jars/edu.ucar_udunits-4.5.5.jar with timestamp 1510716613994
2017-11-14 22:30:13,995 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/joda-time_joda-time-2.2.jar at http://10.0.1.166:35327/jars/joda-time_joda-time-2.2.jar with timestamp 1510716613995
2017-11-14 22:30:13,997 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.quartz-scheduler_quartz-2.2.0.jar at http://10.0.1.166:35327/jars/org.quartz-scheduler_quartz-2.2.0.jar with timestamp 1510716613996
2017-11-14 22:30:13,999 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.sf.ehcache_ehcache-core-2.6.2.jar at http://10.0.1.166:35327/jars/net.sf.ehcache_ehcache-core-2.6.2.jar with timestamp 1510716613999
2017-11-14 22:30:13,999 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.beust_jcommander-1.35.jar at http://10.0.1.166:35327/jars/com.beust_jcommander-1.35.jar with timestamp 1510716613999
2017-11-14 22:30:14,000 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpclient-4.2.6.jar at http://10.0.1.166:35327/jars/org.apache.httpcomponents_httpclient-4.2.6.jar with timestamp 1510716614000
2017-11-14 22:30:14,000 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.httpcomponents_httpmime-4.2.6.jar at http://10.0.1.166:35327/jars/org.apache.httpcomponents_httpmime-4.2.6.jar with timestamp 1510716614000
2017-11-14 22:30:14,002 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/c3p0_c3p0-0.9.1.1.jar at http://10.0.1.166:35327/jars/c3p0_c3p0-0.9.1.1.jar with timestamp 1510716614002
2017-11-14 22:30:14,003 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/javax.measure_jsr-275-0.9.3.jar at http://10.0.1.166:35327/jars/javax.measure_jsr-275-0.9.3.jar with timestamp 1510716614003
2017-11-14 22:30:14,004 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.sis.storage_sis-storage-0.5.jar at http://10.0.1.166:35327/jars/org.apache.sis.storage_sis-storage-0.5.jar with timestamp 1510716614004
2017-11-14 22:30:14,006 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/org.apache.sis.core_sis-referencing-0.5.jar at http://10.0.1.166:35327/jars/org.apache.sis.core_sis-referencing-0.5.jar with timestamp 1510716614006
2017-11-14 22:30:14,007 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/net.sourceforge.nekohtml_nekohtml-1.9.20.jar at http://10.0.1.166:35327/jars/net.sourceforge.nekohtml_nekohtml-1.9.20.jar with timestamp 1510716614007
2017-11-14 22:30:14,008 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/xml-apis_xml-apis-1.4.01.jar at http://10.0.1.166:35327/jars/xml-apis_xml-apis-1.4.01.jar with timestamp 1510716614008
2017-11-14 22:30:14,009 [main] INFO  SparkContext - Added JAR file:/home/nruest/.ivy2/jars/com.google.code.gson_gson-2.3.1.jar at http://10.0.1.166:35327/jars/com.google.code.gson_gson-2.3.1.jar with timestamp 1510716614009
2017-11-14 22:30:14,048 [main] INFO  Executor - Starting executor ID driver on host localhost
2017-11-14 22:30:14,051 [main] INFO  Executor - Using REPL class URI: http://10.0.1.166:38934
2017-11-14 22:30:14,059 [main] INFO  Utils - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38305.
2017-11-14 22:30:14,060 [main] INFO  NettyBlockTransferService - Server created on 38305
2017-11-14 22:30:14,060 [main] INFO  BlockManagerMaster - Trying to register BlockManager
2017-11-14 22:30:14,063 [dispatcher-event-loop-2] INFO  BlockManagerMasterEndpoint - Registering block manager localhost:38305 with 511.1 MB RAM, BlockManagerId(driver, localhost, 38305)
2017-11-14 22:30:14,067 [main] INFO  BlockManagerMaster - Registered BlockManager
2017-11-14 22:30:14,428 [main] INFO  SparkILoop - Created spark context..
Spark context available as sc.
2017-11-14 22:30:14,845 [main] INFO  HiveContext - Initializing execution hive, version 1.2.1
2017-11-14 22:30:14,894 [main] INFO  ClientWrapper - Inspected Hadoop version: 2.6.0
2017-11-14 22:30:14,894 [main] INFO  ClientWrapper - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
2017-11-14 22:30:15,139 [main] INFO  HiveMetaStore - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-11-14 22:30:15,167 [main] INFO  ObjectStore - ObjectStore, initialize called
2017-11-14 22:30:15,290 [main] INFO  Persistence - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-11-14 22:30:15,291 [main] INFO  Persistence - Property datanucleus.cache.level2 unknown - will be ignored
2017-11-14 22:30:17,818 [main] INFO  ObjectStore - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2017-11-14 22:30:19,384 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:19,385 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:21,126 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:21,126 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:21,541 [main] INFO  MetaStoreDirectSql - Using direct SQL, underlying DB is DERBY
2017-11-14 22:30:21,544 [main] INFO  ObjectStore - Initialized ObjectStore
2017-11-14 22:30:21,683 [main] WARN  ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
2017-11-14 22:30:21,810 [main] WARN  ObjectStore - Failed to get database default, returning NoSuchObjectException
2017-11-14 22:30:22,127 [main] INFO  HiveMetaStore - Added admin role in metastore
2017-11-14 22:30:22,135 [main] INFO  HiveMetaStore - Added public role in metastore
2017-11-14 22:30:22,254 [main] INFO  HiveMetaStore - No user is added in admin role, since config is empty
2017-11-14 22:30:22,353 [main] INFO  HiveMetaStore - 0: get_all_databases
2017-11-14 22:30:22,354 [main] INFO  audit - ugi=nruest ip=unknown-ip-addr  cmd=get_all_databases   
2017-11-14 22:30:22,367 [main] INFO  HiveMetaStore - 0: get_functions: db=default pat=*
2017-11-14 22:30:22,367 [main] INFO  audit - ugi=nruest ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
2017-11-14 22:30:22,369 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:22,691 [main] INFO  SessionState - Created local directory: /tmp/33863d38-bbec-4225-859d-10c28589ae7e_resources
2017-11-14 22:30:22,698 [main] INFO  SessionState - Created HDFS directory: /tmp/hive/nruest/33863d38-bbec-4225-859d-10c28589ae7e
2017-11-14 22:30:22,702 [main] INFO  SessionState - Created local directory: /tmp/nruest/33863d38-bbec-4225-859d-10c28589ae7e
2017-11-14 22:30:22,712 [main] INFO  SessionState - Created HDFS directory: /tmp/hive/nruest/33863d38-bbec-4225-859d-10c28589ae7e/_tmp_space.db
2017-11-14 22:30:22,778 [main] INFO  HiveContext - default warehouse location is /user/hive/warehouse
2017-11-14 22:30:22,783 [main] INFO  HiveContext - Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
2017-11-14 22:30:22,792 [main] INFO  ClientWrapper - Inspected Hadoop version: 2.6.0
2017-11-14 22:30:22,799 [main] INFO  ClientWrapper - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
2017-11-14 22:30:23,197 [main] INFO  HiveMetaStore - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-11-14 22:30:23,227 [main] INFO  ObjectStore - ObjectStore, initialize called
2017-11-14 22:30:23,308 [main] INFO  Persistence - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-11-14 22:30:23,308 [main] INFO  Persistence - Property datanucleus.cache.level2 unknown - will be ignored
2017-11-14 22:30:25,362 [main] INFO  ObjectStore - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2017-11-14 22:30:26,981 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:26,982 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:28,872 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:28,872 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:29,320 [main] INFO  MetaStoreDirectSql - Using direct SQL, underlying DB is DERBY
2017-11-14 22:30:29,323 [main] INFO  ObjectStore - Initialized ObjectStore
2017-11-14 22:30:29,460 [main] WARN  ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
2017-11-14 22:30:29,619 [main] WARN  ObjectStore - Failed to get database default, returning NoSuchObjectException
2017-11-14 22:30:29,763 [main] INFO  HiveMetaStore - Added admin role in metastore
2017-11-14 22:30:29,769 [main] INFO  HiveMetaStore - Added public role in metastore
2017-11-14 22:30:29,867 [main] INFO  HiveMetaStore - No user is added in admin role, since config is empty
2017-11-14 22:30:29,953 [main] INFO  HiveMetaStore - 0: get_all_databases
2017-11-14 22:30:29,954 [main] INFO  audit - ugi=nruest ip=unknown-ip-addr  cmd=get_all_databases   
2017-11-14 22:30:29,973 [main] INFO  HiveMetaStore - 0: get_functions: db=default pat=*
2017-11-14 22:30:29,973 [main] INFO  audit - ugi=nruest ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
2017-11-14 22:30:29,974 [main] INFO  Datastore - The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
2017-11-14 22:30:30,213 [main] INFO  SessionState - Created local directory: /tmp/ab66b9e9-dad9-4b7a-ad7a-6497bb887a07_resources
2017-11-14 22:30:30,226 [main] INFO  SessionState - Created HDFS directory: /tmp/hive/nruest/ab66b9e9-dad9-4b7a-ad7a-6497bb887a07
2017-11-14 22:30:30,231 [main] INFO  SessionState - Created local directory: /tmp/nruest/ab66b9e9-dad9-4b7a-ad7a-6497bb887a07
2017-11-14 22:30:30,238 [main] INFO  SessionState - Created HDFS directory: /tmp/hive/nruest/ab66b9e9-dad9-4b7a-ad7a-6497bb887a07/_tmp_space.db
2017-11-14 22:30:30,257 [main] INFO  SparkILoop - Created sql context (with Hive support)..
SQL context available as sqlContext.

scala> 
ruebot commented 6 years ago

If we do move forward with issue-113-a, we'll need to address this:

[WARNING] /home/nruest/git/aut/src/main/scala/io/archivesunleashed/spark/matchbox/DetectLanguage.scala:24: warning: class LanguageIdentifier in package language is deprecated: see corresponding Javadoc for more information.
[INFO]     else new LanguageIdentifier(input).getLanguage
[INFO]              ^
[WARNING] one warning found
[INFO] prepare-compile in 0 s
[INFO] compile in 7 s
[INFO] 

This looks to be tied to https://github.com/milessabin/macro-compat cutting another release.

[WARNING]  Expected all dependencies to require Scala version: 2.11.8
[WARNING]  org.scalatest:scalatest_2.11:3.0.1 requires scala version: 2.11.8
[WARNING]  org.scalactic:scalactic_2.11:3.0.1 requires scala version: 2.11.8
[WARNING]  io.archivesunleashed:aut:0.10.1-SNAPSHOT requires scala version: 2.11.8
[WARNING]  com.chuusai:shapeless_2.11:2.3.2 requires scala version: 2.11.8
[WARNING]  org.typelevel:macro-compat_2.11:1.1.1 requires scala version: 2.11.7
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.java,**/*.scala,]
[INFO] excludes = []
[INFO] /home/nruest/git/aut/src/main/java:-1: info: compiling
[INFO] /home/nruest/git/aut/src/main/scala:-1: info: compiling
[INFO] Compiling 39 source files to /home/nruest/git/aut/target/classes at 1510710743496
dportabella commented 6 years ago

I re-executed the script above (replacing git checkout issue-111 with git checkout issue-113-a), and it still fails with:

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [download failed: org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar, download failed: javax.servlet#javax.servlet-api;3.0.1!javax.servlet-api.jar, download failed: org.slf4j#slf4j-api;1.7.24!slf4j-api.jar, download failed: org.slf4j#jul-to-slf4j;1.7.24!jul-to-slf4j.jar, download failed: org.slf4j#jcl-over-slf4j;1.7.24!jcl-over-slf4j.jar, download failed: commons-logging#commons-logging;1.1.3!commons-logging.jar]

if you don't use the script above, remember to remove your dependency cache (~/.m2 and ~/.ivy2).
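
For anyone reproducing this outside Docker, the cache cleanup could look like the sketch below. It is only an illustration: the function name `clear_dependency_caches` is made up here, and it narrows the removal to `~/.m2/repository`, `~/.ivy2/cache` and `~/.ivy2/jars` (the directories that show up in the logs in this thread) rather than deleting `~/.m2` and `~/.ivy2` wholesale, on the assumption that you want to keep files such as `~/.m2/settings.xml`.

```shell
# Hypothetical helper: remove the local Maven and Ivy caches so that
# spark-shell --packages has to resolve every dependency from scratch.
# The optional first argument is the home directory to clean (defaults to $HOME).
clear_dependency_caches() {
  home_dir=${1:-$HOME}
  # Maven local repository, plus Ivy's resolution cache and copied jars
  rm -rf "$home_dir/.m2/repository" "$home_dir/.ivy2/cache" "$home_dir/.ivy2/jars"
}
```

Calling `clear_dependency_caches` with no argument cleans the current user's caches; the next `spark-shell --packages ...` run will then download everything again, which is what exposes the missing artifacts.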

lintool commented 6 years ago

[download failed: org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar, download failed: javax.servlet#javax.servlet-api;3.0.1!javax.servlet-api.jar, download failed: org.slf4j#slf4j-api;1.7.24!slf4j-api.jar, download failed: org.slf4j#jul-to-slf4j;1.7.24!jul-to-slf4j.jar, download failed: org.slf4j#jcl-over-slf4j;1.7.24!jcl-over-slf4j.jar, download failed: commons-logging#commons-logging;1.1.3!commons-logging.jar]

Isn't the artifact here?

https://search.maven.org/#artifactdetails%7Corg.apache.commons%7Ccommons-lang3%7C3.3.1%7Cjar
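
One way to check this by hand is to turn the Ivy coordinate from the error message into the URL where Maven Central would serve the jar. The following is a minimal sketch, assuming the standard Maven repository layout (`group/artifact/version/artifact-version.jar`) and the `repo1.maven.org` host; the function name `ivy_to_central_url` is made up for illustration.

```shell
# Hypothetical helper: convert an Ivy coordinate such as
#   org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar
# into the URL the jar would have under the standard Maven layout.
ivy_to_central_url() {
  coord=$1
  org=${coord%%#*}            # text before '#'  -> organisation / groupId
  rest=${coord#*#}            # text after '#'
  name=${rest%%;*}            # text before ';'  -> artifactId
  version=${rest#*;}          # text after ';'
  version=${version%%\!*}     # drop a trailing "!artifact.jar" if present
  group_path=$(printf '%s' "$org" | tr '.' '/')
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$name" "$version" "$name" "$version"
}

ivy_to_central_url 'org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar'
```

Passing the resulting URL to `curl -sfI` would then show whether the jar is actually reachable from the machine where the download failed, which helps separate "artifact missing upstream" from "resolver looked in the wrong place".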

ruebot commented 6 years ago

@dportabella huh, I think you're right. I just tried this on a different machine and I got the commons error.

Give me a bit, and hopefully I'll push a commit that gets us further along :smile:

ruebot commented 6 years ago

@dportabella pushed up a new commit, and it appears to work on my end. This is how I'm testing it. This is concerning, so curious how it goes on your end.

dportabella commented 6 years ago

hi, i've tested it according to the script on this question (using docker).

I do get the same error message ERROR spark.SparkContext: Jar not found at file:/root/.ivy2/jars/com.cloudera.cdh_hadoop-ant-0.20.2-cdh3u4.jar

however, spark-shell inits correctly and it seems to work. thx! :)

ruebot commented 6 years ago

@dportabella that's great! We'll do some testing on our end, and if everything goes well, I'll cut another release.

ruebot commented 6 years ago

@dportabella 0.11.0 should be propagating now. Please let us know if you run into any issues, and thanks again for letting us know about this!

dportabella commented 6 years ago

Hi, it seems that the problem is not solved yet. :( It was working with the procedure above (in the question) using the issue-113-a branch, but it's not working on the release.

see this procedure:

$ docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.6.0 bash
$ spark-shell --verbose --packages "io.archivesunleashed:aut:0.11.0"
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.hadoop#hadoop-core;0.20.2-cdh3u4: not found]
ruebot commented 6 years ago

@dportabella yeah, I caught that in https://github.com/archivesunleashed/aut/pull/119. But noticed some more when trying to update docker-aut to use --packages instead of --jars. I'm going to spend some time with it later today and see if we can resolve this once and for all.

ruebot commented 6 years ago

Fun thing:

:: problems summary ::
:::: WARNINGS
        [NOT FOUND  ] net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar (0ms)

    ==== local-m2-cache: tried

      file:/root/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1-javadoc.jar

        ::::::::::::::::::::::::::::::::::::::::::::::

        ::              FAILED DOWNLOADS            ::

        :: ^ see resolution messages for details  ^ ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar

        ::::::::::::::::::::::::::::::::::::::::::::::

arpack_combined_all.jar doesn't have the version number in the filename, and it doesn't look like you can use <configuration> and <fileNameMapping> on <dependency>. http://central.maven.org/maven2/net/sourceforge/f2j/arpack_combined_all/0.1/

ruebot commented 6 years ago

It's in the top level of the dependency tree too, so not sure if we can exclude it or not: [INFO] +- net.sourceforge.f2j:arpack_combined_all:jar:0.1:compile

ruebot commented 6 years ago

Got it: https://gist.github.com/ruebot/bd6a6b2993d21171eda839725cc66438

I'll put in a PR in here in a few, and merge another PR, then cut a new release and we should be good.

ruebot commented 6 years ago

@dportabella 0.12.0 should take care of it for you. It's working perfectly for me with https://github.com/archivesunleashed/docker-aut/blob/0.12.0/Dockerfile#L28

If I get some time in the future, I'd like to go through the dependency graph with a hacksaw.

dportabella commented 6 years ago

It fails with spark 1.6.0 as follows. I'll try to find a docker image with spark 2.

$ docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.6.0 bash
$ spark-shell --verbose --packages "io.archivesunleashed:aut:0.12.0"
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.hadoop#hadoop-core;0.20.2-cdh3u4: not found]

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
error: bad symbolic reference. A signature in package.class refers to type compileTimeOnly
in package scala.annotation which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling package.class.
<console>:14: error: Reference to value sc should not have survived past type checking,
it should have been processed and eliminated during expansion of an enclosing macro.
                @transient val sc = {
                               ^
<console>:15: error: Reference to method createSQLContext in class SparkILoop should not have survived past type checking,
it should have been processed and eliminated during expansion of an enclosing macro.
                  val _sqlContext = org.apache.spark.repl.Main.interp.createSQLContext()
                                                                      ^
<console>:14: error: Reference to value sqlContext should not have survived past type checking,
it should have been processed and eliminated during expansion of an enclosing macro.
                @transient val sqlContext = {
                               ^
<console>:16: error: not found: value sqlContext
         import sqlContext.implicits._
                ^
<console>:16: error: not found: value sqlContext
         import sqlContext.sql
                ^

scala>
ruebot commented 6 years ago

@dportabella yeah, it's going to want a newer version of Spark: https://github.com/archivesunleashed/aut/blob/master/pom.xml#L25
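
A likely reading of the errors above: the shell banner reports Scala 2.10.5, while the build warnings earlier in this thread show aut being compiled against Scala 2.11.8, and Scala minor versions (2.10 vs 2.11) are not binary compatible with each other. The sketch below only illustrates that comparison; the function names are made up, and the version strings are copied from the output in this thread rather than detected automatically.

```shell
# Hypothetical helpers: compare the Scala *binary* version (major.minor)
# of the running shell with the one a library was built for.
scala_binary_version() {
  # "2.11.8" -> "2.11"
  printf '%s\n' "$1" | cut -d. -f1,2
}

same_scala_binary() {
  [ "$(scala_binary_version "$1")" = "$(scala_binary_version "$2")" ]
}

# 2.10.5 is the version from the spark-shell banner, 2.11.8 from the build warnings.
if same_scala_binary 2.10.5 2.11.8; then
  echo "compatible"
else
  echo "mismatch"   # this branch is taken
fi
```

Under that reading, any Spark distribution built for Scala 2.11 should get past this particular error, independent of the dependency-resolution problems discussed earlier.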

dportabella commented 6 years ago

It works on docker! :)

$ docker run -it -p 4040:4040 -p 8080:8080 -p 8081:8081 -h spark --name=spark p7hb/docker-spark:2.0.2
$ spark-shell --verbose --packages "io.archivesunleashed:aut:0.12.0"

import io.archivesunleashed.spark.archive.io.ArchiveRecord
import io.archivesunleashed.spark.matchbox.RecordLoader

val file = "/data/myarchive.warc.gz"
val webPages = RecordLoader.loadArchives(file, sc)
webPages.foreach(archiveRecord => println(s"+++ ${archiveRecord.getUrl}"))
dportabella commented 6 years ago

I confirm that it works also with the latest spark version, 2.2.0.

dportabella commented 6 years ago

On your website (http://archivesunleashed.org/aut/ -> Getting Started), you replaced the aut-0.12.1-fatjar.jar with --packages "io.archivesunleashed:aut:0.12.1" in the spark-shell command, but you still have the "Downloading AUT" section. That makes it look as if the jar must be downloaded manually before using the spark-shell command. Maybe you want to remove this section (except for the example.arc.gz file), or at least move it somewhere else.

ianmilligan1 commented 6 years ago

Thanks @dportabella – I've updated the documentation!