What steps will reproduce the problem?
1. On Mac OS X, follow the installation step from the guide on the Project Home:
http://code.google.com/p/hadoop-snappy/#Install_Hadoop_Snappy_in_Hadoop
2. Run any code that uses snappy (e.g. the simplest case: hadoop in local
mode, "hadoop fs -text" on a snappy-compressed SequenceFile).
3. Observe the "Unknown codec: org.apache.hadoop.io.compress.SnappyCodec" error.
4. Confirm that "hadoop classpath | grep snappy" prints nothing.
What is the expected output? What do you see instead?
Expected: the snappy jar is on the classpath and the snappy native code is on
java.library.path.
Actual: neither is the case on Mac OS X with the binary hadoop distribution.
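A minimal Java sketch of the check described above, for anyone who wants to confirm it from inside the JVM rather than from the shell. The class name SnappyEnvCheck and the helper snappyEntries are hypothetical, and the sketch assumes it is launched with the same classpath and java.library.path that the hadoop script passes to Java.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop or hadoop-snappy.
class SnappyEnvCheck {

    /** Returns the entries of a delimited path list whose text contains "snappy". */
    static List<String> snappyEntries(String pathList, String separator) {
        List<String> hits = new ArrayList<String>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        for (String entry : pathList.split(Pattern.quote(separator))) {
            if (entry.toLowerCase(Locale.ROOT).contains("snappy")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Both properties are standard JVM system properties.
        String classPath = System.getProperty("java.class.path");
        String libraryPath = System.getProperty("java.library.path");

        // An empty list here corresponds to an empty "hadoop classpath | grep snappy".
        System.out.println("snappy entries on classpath: "
                + snappyEntries(classPath, File.pathSeparator));
        // Directories searched for native libraries; the snappy native dir should appear here.
        System.out.println("java.library.path: " + libraryPath);
    }
}
```

With the setup described in this report, both outputs would be expected to show no snappy-related entry.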
What version of the product are you using? On what operating system?
snappy 1.0.5 on Mac OS X 10.7.4, hadoop-bin 0.20.205.0
Please provide any additional information below.
1. Hadoop 0.20.* has two distribution variants with different filesystem
layouts:
* _binary_, e.g. hadoop-0.20.205.0-bin.tar.gz
* _tarball_ with sources, e.g. hadoop-0.20.205.0.tar.gz
The binary distribution DOES NOT add <HADOOP_HOME>/lib to the classpath; it
uses <HADOOP_PREFIX>/share/hadoop/lib/*.jar instead. So in this case the snappy
jar file is not available at runtime.
The hadoop-snappy installation instructions should be updated to cover this
case.
2. The native code is NOT added to java.library.path. The Hadoop naming
convention for the path is "/lib/native/os.name-os.arch-sun.arch.data.model"
(org.apache.hadoop.util.PlatformName.platformName L30), which gives
"Mac_OS_X-x86_64-64". The current maven build script for hadoop-snappy
overrides this on OS X to "Mac_OS_X-${sun.arch.data.model}" (pom.xml L278),
i.e. "Mac_OS_X-64", so the hadoop run script is unable to find it.
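The mismatch between the two directory names can be sketched in Java as follows. This is an illustration of the naming convention as described in this report, not a copy of Hadoop's PlatformName source; the class and method names are hypothetical, and the space-to-underscore replacement is assumed from the directory names quoted above.

```java
// Hypothetical illustration, not part of Hadoop or hadoop-snappy.
class PlatformNameSketch {

    /** Directory name per the Hadoop convention: os.name-os.arch-sun.arch.data.model. */
    static String hadoopStyle(String osName, String osArch, String dataModel) {
        return (osName + "-" + osArch + "-" + dataModel).replace(' ', '_');
    }

    /** Directory name the hadoop-snappy build is reported to use on OS X: os.name-data.model. */
    static String snappyBuildStyle(String osName, String dataModel) {
        return (osName + "-" + dataModel).replace(' ', '_');
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String osArch = System.getProperty("os.arch");
        // Vendor-specific property; may be null on JVMs that do not define it.
        String dataModel = System.getProperty("sun.arch.data.model");

        System.out.println("hadoop looks in:      lib/native/"
                + hadoopStyle(osName, osArch, dataModel));
        System.out.println("snappy build (OS X):  lib/native/"
                + snappyBuildStyle(osName, dataModel));
    }
}
```

For the values in this report ("Mac OS X", "x86_64", "64"), hadoopStyle yields "Mac_OS_X-x86_64-64" and snappyBuildStyle yields "Mac_OS_X-64", which is why the native library directory is never picked up.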
Original issue reported on code.google.com by keiw0...@gmail.com on 4 Jul 2012 at 6:53