microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

Sharing `.venv` directory between Docker host and guest can lead to conflicts #1744

Open turbomam opened 7 months ago

turbomam commented 7 months ago

@pkalita-lbl you have encouraged me to build my Poetry .venvs in my project directories. There's a BBOP best practice that illustrates how to do that globally with a poetry command.

@eecavanna thanks for building the Docker environment for nmdc-schema. The best timing for creating the docker .venv isn't clear to me. I have disabled it (and the mkdocs server launch) from the Dockerfile and currently consider it a manual step.

I think our most common use case is a macOS host (Intel/AMD64) and a Linux Docker container. If the .venv is in the project directory and the whole project directory is shared with the container, then Poetry in the Linux container will try to use the environment that was built on macOS and complain that it seems broken.

I would like to either create a Poetry environment that the host and container can both use, or prevent the two from "seeing" one another's environments.

pkalita-lbl commented 7 months ago

> I would like to create a Poetry environment that the host and container can both use

I think this could only work if every dependency is source-only (i.e., has no prebuilt binary distributions), and I'm almost certain that's not going to be the case. So I wouldn't even go down that road. Why not just do a `poetry install` in the Docker container and let it have its own virtual env? Yeah, sure, you'd have to rebuild the Docker image if you change dependencies, but that's pretty standard practice, right?

turbomam commented 7 months ago

thanks!

due to `poetry config virtualenvs.in-project true`, `poetry install` on the host creates a `.venv/` directory in the project

the container doesn't have that Poetry setting, so it creates its virtual environment in the user's .cache/pypoetry/virtualenvs

However, if the container sees the host's .venv/, then it tries to use that environment and complains that it's broken. It sees .venv/ because of this volume mapping in docker-compose.yaml: "./:/nmdc-schema", where nmdc-schema is the container's working directory.
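One possible way to keep the container from picking up the host's `.venv/` is to set Poetry's `virtualenvs.in-project` option to false inside the container; Poetry's documentation describes that value as ignoring an existing `.venv` directory. This is a sketch, not something proposed in this thread and not tested against this repository: the service name `app` and the `build` entry are assumptions, and only the `./:/nmdc-schema` mapping and the working directory come from the description above.

```yaml
# Sketch under stated assumptions; adapt to the real docker-compose file.
services:
  app:                        # hypothetical service name
    build: .                  # assumption: image is built from the repo's Dockerfile
    working_dir: /nmdc-schema
    volumes:
      # Existing mapping described above: the whole project directory is shared.
      - ./:/nmdc-schema
    environment:
      # Poetry reads settings from POETRY_-prefixed environment variables.
      # With in-project explicitly false, Poetry should skip the shared .venv/
      # and keep using its cache directory for the container's environment.
      POETRY_VIRTUALENVS_IN_PROJECT: "false"
```

With this in place, the host would keep its in-project `.venv/` while the container keeps its environment under `.cache/pypoetry/virtualenvs`, so the directory is still visible in the container but should no longer be selected.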

eecavanna commented 7 months ago

OOO today.

Suggestion: In the docker-compose.yml file, use a named or anonymous volume to omit the virtual environment directory from what gets mapped between the host and guest.
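A minimal sketch of that suggestion, using an anonymous volume mounted over the virtual environment path. The service name `app` and the `build` entry are assumptions; the `./:/nmdc-schema` mapping is the one quoted earlier in this thread. This has not been tested against the repository.

```yaml
# Sketch under stated assumptions; adapt to the real docker-compose file.
services:
  app:                        # hypothetical service name
    build: .                  # assumption: image is built from the repo's Dockerfile
    working_dir: /nmdc-schema
    volumes:
      # Bind mount: share the project directory between host and container.
      - ./:/nmdc-schema
      # Anonymous volume mounted at the .venv path. It takes precedence over
      # the bind mount for that subdirectory, so the container sees its own
      # directory there instead of the host's .venv/.
      - /nmdc-schema/.venv
```

The volume starts out empty unless the image already has content at that path, so the container would still need its own `poetry install`. How Poetry treats an empty `.venv/` directory may depend on its version and settings, which is worth checking before relying on this.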

I can help fix this once I'm back in.

If you make changes to the container-based development environment (in the GitHub repo), please update the documentation so it remains consistent.

pkalita-lbl commented 7 months ago

Looks like some good tips here on how to exclude a subfolder from a volume: https://stackoverflow.com/questions/29181032/add-a-volume-to-docker-but-exclude-a-sub-folder

eecavanna commented 6 months ago

@turbomam and I discussed this today. I'll remove @pkalita-lbl and @turbomam as assignees, leaving myself as the owner. My plan is to reconfigure the Docker stuff so that the virtual environment folder is not shared between the host and the guest/container.