Closed. agillan closed this issue 10 years ago.
OK, so I found that I have to give the fully qualified class name for Crush, like so:
hadoop jar filecrush-2.2.2-SNAPSHOT.jar com.m6d.filecrush.crush.Crush /user/zslf023/pdb/all /user/zslf023/pdb/tenkcrushed 20140725112332
But now I get the following error:
Exception in thread "main" java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:375)
at java.lang.Long.parseLong(Long.java:468)
at com.m6d.filecrush.crush.Crush.createJobConfAndParseArgs(Crush.java:491)
at com.m6d.filecrush.crush.Crush.run(Crush.java:595)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.m6d.filecrush.crush.Crush.main(Crush.java:1313)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
It seems something is wrong with this line:
dfsBlockSize = Long.parseLong(job.get("dfs.block.size"));
Do you know if this variable has changed? We originally targeted Hadoop 0.20.2; maybe things are getting deprecated and moved around?
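That would explain the trace: if the property is absent, job.get returns null, and Long.parseLong throws on a null argument. A minimal, self-contained reproduction (the class and helper names here are made up for illustration; a plain String stands in for the JobConf lookup):

```java
// Reproduces the failure mode from the stack trace above:
// Long.parseLong(null) throws NumberFormatException (message "null" on the
// JDK shown in the trace), which is what happens when a config key is missing.
public class ParseLongNullDemo {
    static String describeParse(String value) {
        try {
            return String.valueOf(Long.parseLong(value));
        } catch (NumberFormatException e) {
            return "NumberFormatException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Simulates job.get("dfs.block.size") returning null on Hadoop 2.x,
        // where the property was renamed.
        System.out.println(describeParse(null));
        System.out.println(describeParse("134217728")); // a valid value parses fine
    }
}
```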
Looks like they changed it to dfs.blocksize in Hadoop 2.0.4: http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
If I change that line in the Crush code to job.get("dfs.blocksize"), it should work, right?
Yes, we should make a patch that attempts to use both variables.
I'll do that now.
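One possible shape for that patch, as a sketch only: this is not the actual Crush code, a plain Map stands in for Hadoop's JobConf, and the helper name and default are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a dual-key lookup: try the pre-2.x property name first, fall back
// to the 2.x name, and finally to a default, so one jar works on both lines.
public class BlockSizeLookup {
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024; // classic 64 MB default

    static long dfsBlockSize(Map<String, String> conf) {
        String value = conf.get("dfs.block.size"); // Hadoop 0.20.x name
        if (value == null) {
            value = conf.get("dfs.blocksize");     // Hadoop 2.x name
        }
        return value == null ? DEFAULT_BLOCK_SIZE : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.blocksize", "134217728");
        // Finds the value under the 2.x key even though the old key is absent.
        System.out.println(dfsBlockSize(conf));
    }
}
```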
Thanks!
Could I also ask a really quick favour? I may well use this code as part of my master's thesis work, and it would really help me out if you could register your repository at https://guides.github.com/activities/citable-code/ so that I can reference it properly in my bibliography. Would that be OK? Thanks again.
Cool. Yes. I will fill that out. Send me a link to the paper when it is completed.
I am running tests now. One thing to note: what I am doing here is patching in this bug fix, but what we should really do is upgrade filecrush to test against a newer Hadoop. Since we are not currently testing against Hadoop 2.4, we are not showing that this fix actually works. That is a larger effort for another ticket, but if you would like to take it on, that would be great.
This fix should be merged into https://github.com/edwardcapriolo/filecrush/pull/6
https://zenodo.org/badge/doi/10.5281/zenodo.11038.png
I have a badge, but my page is not Markdown so I cannot display it. Also, trunk has a fix for your block-size bug. Try it out and close the issue if that fixes it.
Thanks for that! I just tried to package it and I got a build failure:
Tests run: 98, Failures: 1, Errors: 0, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:54 min
[INFO] Finished at: 2014-07-25T16:31:16+01:00
[INFO] Final Memory: 12M/43M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project filecrush: There are test failures.
This is the test report output:
<testcase time="0.079" classname="com.m6d.filecrush.crush.CrushTest" name="bucketing">
<failure message="
Expected: <{/var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2-1=0, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.4/2.4.2-0=1, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-2=4, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.2-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-0=2, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.2-0=4}>
got: <{/var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2-1=0, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.4/2.4.2-0=1, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-2=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.2-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-0=2, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-1=4, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.2-0=4}>
" type="java.lang.AssertionError">java.lang.AssertionError:
Expected: <{/var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2-1=0, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.4/2.4.2-0=1, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-2=4, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.2-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-0=2, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.2-0=4}>
got: <{/var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2-1=0, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.4/2.4.2-0=1, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-2=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/2/2.2-1=3, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-0=2, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.1-1=4, /var/folders/j4/m8f6sqzd3fv1k134pbqkf7j80000gn/T/junit9031605743255637183/in/1/1.2-0=4}>
at org.junit.Assert.assertThat(Assert.java:778)
at org.junit.Assert.assertThat(Assert.java:736)
at com.m6d.filecrush.crush.CrushTest.bucketing(CrushTest.java:725)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:43)
at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
</failure>
</testcase>
As for taking on the testing for later Hadoop versions: I'll be doing an informal run with my files now, but I'm not sure I could take on any formal testing at the moment, sorry!
I had a weird error with that too. Honestly, I think that test is somehow JVM-sensitive, but I have not had time to dig in. I am using java version "1.7.0_45", which forced me to change that exact test; I made a note of it in my diff. For now I would build with mvn -Dmaven.test.skip=true, because I think that is just non-deterministic testing and not a bug.
OK, great, I did that and it packages and runs on my Hadoop 2.0.6 cluster!
I just built filecrush and had the same (single) test failure.
$ java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470sr7-20140410_01(SR7))
IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References 20140409_195732 (JIT enabled, AOT enabled)
J9VM - R26_Java726_SR7_20140409_1418_B195732
JIT - r11.b06_20140409_61252
GC - R26_Java726_SR7_20140409_1418_B195732_CMPRSS
J9CL - 20140409_195732)
JCL - 20140409_01 based on Oracle 7u55-b13
DEV [mclinta@vrdevamc001 surefire-reports]$ arch
x86_64
DEV [mclinta@vrdevamc001 surefire-reports]$ uname -a
Linux vrdevamc001.iggroup.local 2.6.32-431.20.3.el6.x86_64 #1 SMP Fri Jun 6 18:30:54 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
Hi,
I'm sorry if this is a dumb question, but I can't figure out how to run filecrush on my Hadoop cluster; I keep getting a class-not-found error. This is the command I'm running:
hadoop jar filecrush-2.2.2-SNAPSHOT.jar Crush /user/zslf023/pdb/all /user/zslf023/pdb/tenkcrushed 201424071559
which then returns:
Could you please let me know where I'm going wrong?
Thanks in advance! Ana