Closed: mboyanna closed this issue 9 years ago
You may need to set the Hadoop configuration for compression, not the Spark configuration:
import org.apache.hadoop.conf.Configuration

val hadoopConfiguration = new Configuration()
hadoopConfiguration.set("io.compression.codecs", "io.sensesecure.hadoop.xz.XZCodec")
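For illustration, here is a minimal sketch of how that configuration might be passed to a Spark read. The object name, the input path, and the choice of newAPIHadoopFile are my own assumptions for this sketch, not code from this thread or from the hadoop-xz README:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object XzReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("xz-read-sketch"))

    // Register the XZ codec on the Hadoop configuration, not on SparkConf.
    val hadoopConfiguration = new Configuration(sc.hadoopConfiguration)
    hadoopConfiguration.set("io.compression.codecs", "io.sensesecure.hadoop.xz.XZCodec")

    // Hypothetical input path, used only for illustration.
    val lines = sc
      .newAPIHadoopFile(
        "/path/to/data.xz",
        classOf[TextInputFormat],
        classOf[LongWritable],
        classOf[Text],
        hadoopConfiguration)
      .map { case (_, text) => text.toString }

    lines.take(10).foreach(println)
    sc.stop()
  }
}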
I tested on Spark 1.3.1 + Hadoop 2.6.0. If you use Spark 1.2.0, which is typically bundled with Hadoop 2.4, you may also encounter other issues, since hadoop-xz depends on Hadoop 2.6. ---yongtang
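As a hedged sketch only (not something verified in this thread), aligning the build with the tested versions could look like the following sbt lines, moving to Spark 1.3.1 and a Hadoop 2.6 client so hadoop-xz's Hadoop 2.6 dependency is satisfied. Whether "provided" is appropriate depends on what the cluster ships with, so treat that part as an assumption:

// Sketch: align Spark and Hadoop with the combination reported to work above.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0" % "provided"
libraryDependencies += "io.sensesecure" % "hadoop-xz" % "1.4"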
Closed for now. Please re-open if you encounter any other issues.
The Spark example provided in the README for reading xz files returns output where:
Here's the Spark/Scala code:
def readXzfile() {
}
Here's my build file:
name := "detailed-commons"
organization := "com.mycompany.commons"
version := "1.0.2"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "1.2.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-hive_2.10" % "1.2.0" % "provided"
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.1"
libraryDependencies ++= Seq(
  ("io.sensesecure" % "hadoop-xz" % "1.4")
    .exclude("commons-beanutils", "commons-beanutils-core")
    .exclude("commons-collections", "commons-collections")
)
publishTo := Some(Resolver.file("detailed-commons-assembly-1.0.2.jar", new File(Path.userHome.absolutePath + "/.ivy2/cache")))