awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

PermissionError: [Errno 13] Permission denied: '/home/glue_user/.jupyter/migrated' #131

Open benymahajan opened 2 years ago

benymahajan commented 2 years ago

Hi all, I spin up my Glue Docker container using the command below: docker run --platform linux/amd64 -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v ~/.aws:/root/.aws:ro --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh


Every time I stop my container, I am not able to restart it because it fails with the permission error mentioned in the subject. I always lose my local work, because I have to create a new container from the image every time I stop the old one. Please help!

srakrn commented 2 years ago

My suggestion will be

benymahajan commented 2 years ago

@srakrn I am not sure how the above would work: /home/glue_user exists in the Docker image I am running, not on my local machine. Any suggestions?

Radhika-s-r commented 2 years ago

Here is a simple solution; try referring to this page: https://medium.com/@radhi9214/getting-started-with-a-glue-spark-environment-locally-with-jupyter-notebook-and-docker-82fe73e9fa13

nkhandelwal1 commented 2 years ago

Hi @benymahajan, did you find a solution to this issue? I am facing the same problem. Below is the command I am using to run the container: docker run -itd -p 8888:8888 -p 4040:4040 amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh

Thanks in advance!

srakrn commented 2 years ago

@benymahajan Yes, /home/glue_user exists in the image. What I am asking to be mounted is only two particular subdirectories of it: /home/glue_user/workspace/jupyter_workspace/ and /home/glue_user/.jupyter. This should resolve the problem (at least on Windows) while retaining the rest of the contents of /home/glue_user.
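As an illustration of the two mounts described above, the run command might look roughly like the sketch below. It is written for a POSIX shell (Linux, macOS, WSL2). The host directory names (~/glue_workspace, ~/.glue_jupyter) and the start_glue wrapper are hypothetical; the image tag, ports, and start script are copied from the commands earlier in this thread.

```shell
# Sketch only: the host paths and the function name are assumptions,
# not something posted in this thread.
start_glue() {
    workspace_dir="${1:-$HOME/glue_workspace}"   # hypothetical host dir for notebooks
    jupyter_dir="${2:-$HOME/.glue_jupyter}"      # hypothetical host dir for Jupyter config

    # Create the host directories up front; if they are missing, Docker
    # typically creates the bind-mount source itself, owned by root.
    mkdir -p "$workspace_dir" "$jupyter_dir"

    docker run -it \
        -p 8888:8888 -p 4040:4040 \
        -e DISABLE_SSL="true" \
        -v "$workspace_dir:/home/glue_user/workspace/jupyter_workspace/" \
        -v "$jupyter_dir:/home/glue_user/.jupyter" \
        --name glue_jupyter \
        amazon/aws-glue-libs:glue_libs_3.0.0_image_01 \
        /home/glue_user/jupyter/jupyter_start.sh
}
```

Calling start_glue with no arguments uses the default paths; pass two directories to use your own. On Linux hosts the mounted directories may still need their ownership adjusted before glue_user can write to them.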

LXZE commented 2 years ago

After digging into this problem for a couple of hours, I found a solution for @benymahajan. P.S. My setup is WSL2 with Ubuntu 18.04.5.

The problem is that a process running as glue_user inside the Docker container is unable to write a file called migrated in /home/glue_user/.jupyter. I also found that this user is unable to do anything in any of the mounted directories inside the container.

To fix this, after you create the directory for mounting, you also have to change the directory's owner, using the command sudo chown -R 10000:10000 <mounted directory>. Note that 10000 is glue_user's uid inside the container. After doing this, I can stop the Glue container and start it again without any crash 😀.

For example, I created the directory .glue_jupyter in my home directory, so the command is sudo chown -R 10000:10000 ~/.glue_jupyter. Then running the container with -v $HOME/.glue_jupyter:/home/glue_user/.jupyter, as @srakrn suggested, should be fine.
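The ownership fix can be wrapped in a small helper so that it is easy to repeat and to verify. This is only a sketch: the function names prepare_mount_dir and owner_of are hypothetical, and the uid 10000 is taken from the comment above rather than checked against the image.

```shell
# Sketch only: function names are hypothetical; 10000 is glue_user's uid
# inside the container as reported in this thread.
GLUE_UID=10000

# Print the numeric owner of a path as "uid:gid" (ls -n is POSIX).
owner_of() {
    ls -nd "$1" | awk '{ print $3 ":" $4 }'
}

# Create a host directory, hand it to the container's glue_user,
# then print the resulting owner so the change can be checked.
prepare_mount_dir() {
    dir="$1"
    mkdir -p "$dir"
    sudo chown -R "$GLUE_UID:$GLUE_UID" "$dir"
    owner_of "$dir"
}
```

For instance, prepare_mount_dir "$HOME/.glue_jupyter" should report 10000:10000 once the chown has taken effect. If it reports your own uid instead, the container will most likely hit the same permission error again.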

matheusbsilva commented 1 year ago

Adding to the solutions above, I created a custom Dockerfile that solved the issue for me:

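FROM amazon/aws-glue-libs:glue_libs_3.0.0_image_01

# uid to assign to glue_user; override with --build-arg USER_ID=<host uid>
ARG USER_ID=1000

USER root
# Change glue_user's uid from the image default (10000) to USER_ID
RUN usermod -u $USER_ID glue_user
# Re-own the home directory and any other files still owned by the old uid
RUN chown -R glue_user /home/glue_user && find / -user 10000 -not -path "/proc/*" -exec chown -h glue_user {} \;

USER glue_user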

I changed the default uid of glue_user to match the uid of my host user, and updated the ownership of all files that belonged to the default uid inside the container. With that and the mapping of the .jupyter directory, all permission issues were solved for me. Here is the docker-compose file that I'm using:

version: '3.7'

services:
  glue:
    build: .
    volumes:
      - ~/.aws:/home/glue_user/.aws:ro
      - ./src:/home/glue_user/workspace/jupyter_workspace
      - ./.jupyter:/home/glue_user/.jupyter
    environment:
      DISABLE_SSL: "true"
    ports:
      - 4040:4040
      - 18080:18080
      - 8998:8998
      - 8888:8888
    command: /home/glue_user/jupyter/jupyter_start.sh
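
If your host uid is not 1000, the Dockerfile's USER_ID argument can be set at build time. The sketch below shows one way this might be done from the directory containing the Dockerfile and the compose file. The function name build_and_start_glue is hypothetical, and older installations may need docker-compose in place of docker compose.

```shell
# Sketch only: the function name is an assumption; the service name "glue"
# and the ./src and ./.jupyter paths come from the compose file above.
build_and_start_glue() {
    # Pre-create the bind-mount sources as the host user so that Docker
    # does not create them owned by root.
    mkdir -p ./src ./.jupyter

    # Pass the host uid to the Dockerfile's USER_ID build argument.
    docker compose build --build-arg USER_ID="$(id -u)" glue

    # Start the service in the foreground; Ctrl+C stops it.
    docker compose up glue
}
```

Because glue_user inside the image then shares the host user's uid, files written under ./src and ./.jupyter should end up owned by the host user, and no separate chown on the host should be needed.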