xldrx / cloudapp-mp2

Machine Programming Assignment for Cloud Application Course
Apache License 2.0
3 stars 68 forks source link

Properly delete output file from HDFS #2

Closed macfreek closed 9 years ago

macfreek commented 9 years ago

If it is not deleted, a second run of the program raises a Java exception.

e.g.

+ hadoop jar ./internal_use/tmp/TitleCount.jar TitleCount -D stopwords=/mp2/misc/stopwords.txt -D delimiters=/mp2/misc/delimiters.txt /mp2/titles /mp2/TitleCount-output
WARNING: Use "yarn jar" to launch YARN applications.
15/09/06 18:43:11 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
15/09/06 18:43:11 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://sandbox.hortonworks.com:8020/mp2/TitleCount-output already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at TitleCount.run(TitleCount.java:48)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at TitleCount.main(TitleCount.java:28)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
macfreek commented 9 years ago

I'm going to close this. While the patch does not break anything, it does not fix anything either. :)

There was another reason the directory was still there.

Sorry about the FUD.