geotrellis / geodocker-cluster

[NOT MAINTAINED] GeoDocker Cluster is a Docker environment with Apache Accumulo and Apache Spark.
https://github.com/geodocker/geodocker
Apache License 2.0

add hadoop user to base accumulo image #32

Closed echeipesh closed 8 years ago

echeipesh commented 8 years ago

What I am finding is that when working against an HDFS instance that has permissions enabled, like the one on EMR, it's important to be the correct user. Currently, if you were to try to run accumulo init on EMR, you would get something like:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/accumulo/version/7":hdfs:hadoop:drwxr-xr-x

Weirdly, the user in question is not the owner of the container process but the user inside the container. Adding the hadoop user to the image so that one can run:

docker run --rm -u hadoop ....

solves the issue.

Is there a more graceful way to handle this case? While hadoop is the only user added here, it's probably not the only possible option in the wild.
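
One possible direction, sketched here under assumptions rather than as anything GeoDocker provides, is to make the in-container user a parameter instead of hard-coding hadoop. In the sketch below, `HDFS_USER` and `build_run_cmd` are hypothetical names, and the image and command are whatever the caller passes in. The function only prints the command, so it can be reviewed before running. Note that a user name passed to `-u` still has to exist inside the image, which is why the user needs to be added there in the first place.

```shell
# Sketch only: HDFS_USER and build_run_cmd are hypothetical names,
# not part of GeoDocker or Docker itself.

# Print (do not execute) a docker command that runs a container
# as the in-container user that HDFS should see.
build_run_cmd() {
  # Default to "hadoop", the user proposed in this issue, when
  # HDFS_USER is unset or empty. "$*" is the caller-supplied image
  # name followed by the command to run inside the container.
  echo "docker run --rm -u ${HDFS_USER:-hadoop} $*"
}
```

With `HDFS_USER` unset, the printed command uses `-u hadoop`; setting `HDFS_USER=hdfs` for a single call switches it to `-u hdfs` without changing the image or the script.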

pomadchin commented 8 years ago

That's definitely a nice solution (workaround) to the problem. There have already been some cases like this, and that was one of the motivations for having a hadoop user inside the dev images.

pomadchin commented 8 years ago

Thanks, nice catch!