datasalt / pangool

Tuple MapReduce for Hadoop: Hadoop API made easy
http://datasalt.github.io/pangool/
Apache License 2.0

It looks strange to create files in HDFS using a path from disk #25

Closed posa88 closed 11 years ago

posa88 commented 11 years ago

I ran the TopicalWordCount example and hit the following problem. Apparently I didn't have sufficient privileges to create files somewhere; the path is read from "hadoop.tmp.dir" in the code (InstancesDistributor.java) when that property is not set. However, the default of "hadoop.tmp.dir" is "/tmp/hadoop-${user.name}", which is a local-disk path. It works after I set this parameter explicitly, but it looks strange that a disk-style default is used to create files in HDFS.


Exception in thread "main" org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=bjdata, access=WRITE, inode="home":hadoop:supergroup:rwxr-xr-x
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2836)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:500)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:364)
    at com.datasalt.pangool.utils.InstancesDistributor.distribute(InstancesDistributor.java:77)
    at com.datasalt.pangool.tuplemr.TupleMRBuilder.createJob(TupleMRBuilder.java:240)
    at com.mediav.pangoolCase.TopicalWordCount.run(TopicalWordCount.java:102)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at com.mediav.pangoolCase.TopicalWordCount.main(TopicalWordCount.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
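As a temporary workaround I set the parameter explicitly before building the job. A minimal sketch of what I did (the /user/bjdata/tmp path is just an example of an HDFS directory my user can write to; substitute your own):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;

import com.mediav.pangoolCase.TopicalWordCount;

public class TopicalWordCountLauncher {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Override the default of /tmp/hadoop-${user.name} with an HDFS
        // directory the submitting user can write to, so that
        // InstancesDistributor.distribute() can create its files there.
        conf.set("hadoop.tmp.dir", "/user/bjdata/tmp");
        // TopicalWordCount implements Tool, so it picks up this conf.
        System.exit(ToolRunner.run(conf, new TopicalWordCount(), args));
    }
}
```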

pereferrera commented 11 years ago

The Exception appears only when using the default value of /tmp/hadoop-${user.name}, is that right?

pereferrera commented 11 years ago

I see that using /tmp in HDFS as the default for instance distribution may not be the best choice... It works well in local mode, but on HDFS you might not have write access to it in some cases, hence your exception.

Maybe we should set HDFS_TMP_FOLDER_CONF to some user-local folder, for example ./pangool-instances.
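Something like the following (just a sketch, not the final implementation; a relative Path is qualified against the default FileSystem's working directory, which on HDFS is /user/${user.name}):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class InstancesDirSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // A relative path resolves against the FileSystem's working
        // directory: /user/${user.name} on HDFS, the local working
        // directory in local mode. Either way, a folder the user owns,
        // unlike a shared /tmp they may not be able to write to.
        Path instancesDir = fs.makeQualified(new Path("pangool-instances"));
        System.out.println("Instances would be distributed under: " + instancesDir);
    }
}
```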

ivanprado commented 11 years ago

Sounds reasonable.

Iván de Prado
CEO & Co-founder
www.datasalt.com

pereferrera commented 11 years ago

This is now solved on trunk; can you git pull and see if it works? Thanks for reporting.

posa88 commented 11 years ago

Good, I will try it tomorrow morning.

posa88 commented 11 years ago

It works fine now, thanks!