TRI-ML / dgp

ML Dataset Governance Policy for Autonomous Vehicle Datasets
https://tri-ml.github.io/dgp/
MIT License
94 stars 62 forks source link

Pull fails after commiting inside docker container. #137

Closed nehalmamgain closed 1 year ago

nehalmamgain commented 1 year ago

Bug

Difficult to fetch from the repository after running git commit.

To Reproduce

Steps to reproduce the behavior:

When following Getting Started, presumably at make setup-linters , pre-commit run --all-files, git commit inside the docker, permissions under .git change for the following folders from user to root

-rw-r--r--  1 root  root     73 Dec  7 11:46 COMMIT_EDITMSG
-rw-r--r--  1 root  root      0 Dec  7 12:34 FETCH_HEAD
-rw-r--r--  1 root  root     23 Dec  7 12:28 HEAD
-rw-r--r--  1 root  root    322 Dec  7 12:30 config
-rw-r--r--  1 root  root  39355 Dec  7 12:28 index

This prevents git operations outside the docker like

~/dgp$ git pull
error: cannot open .git/FETCH_HEAD: Permission denied

and inside the docker like

/home/dgp# git pull
Bad owner or permissions on /root/.ssh/config
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

For shared development devices where users do not have root permissions, it is impossible to change the permissions affected by the container.

Expected behavior

Being able to pull either from inside or outside the container. With above constraint (no root permissions), both workflows are blocked unless setting up repo from scratch again. The best way would probably be for above folder permissions to not change to root in the first place šŸ¤”

Additional context

tk-woven commented 1 year ago

Thanks for reporting this!

I just inspected the Docker configuration. make docker-start-interactive drops you into a Docker container as root. Any command you run that creates files will have the behavior you describe on your host. Looking at what the command actually executes:

$ make docker-start-interactive 
docker run \
  -it \
  --rm \
  --shm-size=1G \
  -e AWS_DEFAULT_REGION \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -e DISPLAY=:1 \
  -v /home/tk-woven/dgp:/home/dgp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.ssh:/root/.ssh \
  -v ~/.aws:/root/.aws \
  -v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0 \
  --net=host \
  --ipc=host

I don't see any mapped files that look like the pre-commit cache. So, pre-commit should only be placing files inside the context of your container. That the error you see is in particular is with permissions under .git, I suspect this is related to running git operations only. This will affect your host as the codebase is bind-mounted from your host into the container.

As for the git issues inside the container, git does not seem to be provided in the image fetched by docker pull ghcr.io/tri-ml/dgp:master. I'm guessing you installed it locally.

Bad owner or permissions on /root/.ssh/config

Since you are operating as root, I believe what is happening here is that git is refusing to use your SSH identity because your active user is root but the uid/gid on your ssh files is brought in from your host environment and is not literally root.

tk-woven commented 1 year ago

BTW I've never actually used the Docker container for development myself, only the virtual environment. I'm guessing the folks who set up this workflow only intend for users to make code changes inside the container and not perform git operations. We'd still hit permissions issues if people created even source files inside the container, though.

nehalmamgain commented 1 year ago

I'm guessing you installed it locally.

Right, the GETTING STARTED also provides instructions on this

To set up linters wherever you plan on using git from (either on your host or inside the container), run

# If in Docker, run the following; otherwise skip this block.
dgp$ apt update
dgp$ apt install -y git
dgp$ git config --global --add safe.directory /home/dgp

# Set up the linters.
dgp$ make setup-linters

BTW I've never actually used the Docker container for development myself, only the virtual environment. I'm guessing the folks who set up this workflow only intend for users to make code changes inside the container and not perform git operations.

Hm understood. So I should run pre-commit separately inside container and always commit with --no-verify outside container...

nehalmamgain commented 1 year ago

Btw from the docs, it seems the workflow allows for users to use git inside the container (just not on devices where user doesn't have root permissions I guess...)

To set up linters wherever you plan on using git from (either on your host or inside the container)

tk-woven commented 1 year ago

Thanks; maintainers should probably change the docs to avoid guiding users into the situation you found yourself in. I have to delegate this at this time, though.

tk-woven commented 1 year ago

Maybe an opportunity for @ryo-takahashi-1206 and company to support?

ryo-takahashi-1206 commented 1 year ago

Thank you for sharing. I will do the followings within 1 week:

tk-woven commented 1 year ago

Thanks so much Takahashi-san; SGTM :)

nehalmamgain commented 1 year ago

Thank you Tyler and Takahashi-san šŸ™‡ā€ā™€ļø

nehalmamgain commented 1 year ago

Linking the fix for the issue: https://github.com/TRI-ML/dgp/pull/138

ryo-takahashi-1206 commented 1 year ago

@nehalmamgain Please close this issue if you become happy with https://github.com/TRI-ML/dgp/pull/138 šŸ™‡

nehalmamgain commented 1 year ago

Thank you for the fix!