andyneff opened 2 years ago
Expansion on the /venv changes:
- The pipenv cache is not needed in the final stage, as pipenv install should be happening in the pipenv stage.
- The only caching that may happen in the final stages is pip cache at run time, and there's no need to pre-copy the cache over for that case.
- We can stop `COPY`ing the venv dir to the final stage if:
  - during `just build` (well, at the end of it) we run alpine and rsync (with `--chown`) the `/venv` folder from the pipenv_cache stage to a volume
  - that volume is then mounted at `/venv` during `just run`
- `RUN --mount type=cache` will work here, because there would be no way to reuse this at `pipenv lock` time
  - pip command only
- This covers `just pipenv lock`, but not if `just shell` and then lock. No ideas on this currently (making the lock file read only sounds like a pain, and offers no explanation. I think it's best to just hope for the best here)
- This could be part of the `compose build`: a small image, like an alpine image, that rsyncs the contents over. This will hopefully be just as slow/fast the first time, and much faster on subsequent runs when the content of the virtualenv does not change (see the sketch at the end of this comment)
- Add default gpu flags, maybe a GPU question:
```bash
# NVIDIA GPU devices and capabilities
: ${${JUST_PROJECT_PREFIX}_NVIDIA_VISIBLE_DEVICES="all"}
: ${${JUST_PROJECT_PREFIX}_NVIDIA_DRIVER_CAPABILITIES="gpu"}
```
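Returning to the `/venv` rsync idea above, a rough sketch of what the helper could look like. Everything here except the `pipenv_cache` stage name is made up for illustration (stage name `venv_export`, the uid/gid, the volume name):

```Dockerfile
# syntax=docker/dockerfile:1
# Hypothetical helper stage: a tiny alpine image whose only job is to copy the
# venv built in the pipenv_cache stage into a volume at `just build` time.
FROM alpine:3.18 AS venv_export
RUN apk add --no-cache rsync
COPY --from=pipenv_cache /venv /venv_src

# Default command: sync the baked-in venv into whatever is mounted at /venv,
# fixing ownership along the way (1000:1000 is just an example).
CMD ["rsync", "-a", "--delete", "--chown=1000:1000", "/venv_src/", "/venv/"]
```

At the end of `just build`, this container would be run once with the named volume mounted at `/venv`, e.g. `docker run --rm -v myproject_venv:/venv myproject:venv_export` (names hypothetical).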
Abandoned Pip venv idea:
What if we never built the venv in the docker image at all, and instead built it like we do when we `just compile` source code?
The only way I could see this working is to add this as part of `just sync`, which led us to the following problems:

i. `just sync` would be slower, because all the packages would have to be downloaded by sync.
ii. The downloaded packages could also be lost to a `docker image prune` in the process.
iii. Any python packages that need compiling would have to be compiled again, even if nothing changed in the requirements files.

Bullet iii does affect the original plan to use `RUN --mount type=cache` mentioned in the previous post. Mainly, we should be more selective on what part of the pip cache we docker build/cache, although the http download cache should be safe to cache.
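For illustration, a hedged sketch of what "more selective" could look like, assuming pip's default cache layout under `~/.cache/pip` (where `http` holds downloaded responses and `wheels` holds locally built wheels):

```Dockerfile
# Somewhere in the venv build stage; requires buildkit.
# Only persist pip's http download cache between builds; locally built wheels
# stay out of the shared cache. Path assumes pip runs as root in this stage.
RUN --mount=type=cache,target=/root/.cache/pip/http \
    pip install -r requirements.txt
```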
Overall, it seems like continuing to use `docker build` to make our venvs is our best bet.
What would be the benefit(s) of not building the venv in the docker image? If we wanted to avoid the "fake_package" ugliness, could we somehow break the `requirements.in` (in the case of pip-tools) into two parts: one containing the "normal" dependencies that can be installed during the docker image build, and another containing just the editable dependencies that get added later once the source code volume is available?
I'm not sure if that's possible or not; if not, maybe some other method of breaking up the venv dependencies into "normal" and editable dependencies is possible?
We shouldn't do that, because we lose the ability to track the dependencies of our editables, which is something we really want.
I would have to test this, but one possibility would be to have a script that:

1. Takes the `requirements.txt` file, removes all editables, and syncs against that. This is pretty simple and should work for nearly all situations.
2. Installs the editables afterwards (as part of `just sync`).
3. Falls back to the full `requirements.txt` file if needed, but that would be the worst case scenario.
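A rough sketch of what such a script might do (file names and the `--no-deps` re-install are my own illustration, not a tested workflow):

```bash
#!/usr/bin/env bash
# Hypothetical helper for steps 1-2 above: sync everything except editables,
# then re-add the editable packages.
set -euo pipefail

# 1. Strip editable ("-e ...") lines out of the locked requirements file
grep -v '^-e ' requirements.txt > /tmp/requirements.no-editables.txt

# 2. Sync the venv against the non-editable requirements
pip-sync /tmp/requirements.no-editables.txt

# 3. Re-install the editable package(s); their pinned dependencies are already
#    satisfied by the sync above, so skip dependency resolution.
pip install --no-deps -e .
```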
Are there any advantages of that solution over fake_package? It doesn't sound any cleaner to me.

The advantages, if I can pull it off, should be that it's something `new_just` adds to the `Dockerfile`, and you don't need to update it for every editable package.

Our tests with pip-tools just showed:
- We can indeed use the `requirements.txt` files to install only a partial environment: `pip install -c requirements.txt numpy torch pip-tools`
- And we will probably be able to add the `--require-hashes` flag once we start using hashes.

This will give us the flexibility to easily install "install dependencies" for non-PEP-517 projects, and `pip-tools` itself so we can track and use `pip-tools`.
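For reference, a hedged sketch of what the hashes workflow could look like once we go that route (file names are just examples; `--generate-hashes` and `--require-hashes` are existing pip-tools/pip flags):

```bash
# Lock with a hash recorded for every pinned package
pip-compile --generate-hashes -o requirements.txt requirements.in

# Install in hash-checking mode; pip refuses anything not pinned with a hash
pip install --require-hashes -r requirements.txt
```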
It's time to add brand new 5 year old docker technology to `new_just`: buildkit.

- Replace `docker-compose build` with `docker buildx bake`
- Add the `# syntax` line to the Dockerfiles
- Bind `/venv/cache` to a local linked dir to speed up sync time for devs
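A rough sketch of the buildkit switch (file and service names are illustrative; `buildx bake` reading compose files directly is the assumption here):

```bash
# Enable buildkit for plain docker / docker-compose builds
export DOCKER_BUILDKIT=1 COMPOSE_DOCKER_CLI_BUILD=1

# Or build the compose file's services directly with buildx bake
docker buildx bake -f docker-compose.yml            # all services
docker buildx bake -f docker-compose.yml my_service # a single target
```

Each Dockerfile would also need the syntax directive (e.g. `# syntax=docker/dockerfile:1`) as its first line to opt into the newer frontend features like `RUN --mount`.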