cachix / devenv

Fast, Declarative, Reproducible, and Composable Developer Environments
https://devenv.sh
Apache License 2.0
3.56k stars 259 forks source link

Recreate Python venv and run pip install less often #912

Closed mcdonc closed 4 months ago

mcdonc commented 4 months ago

This does not take into account use of -r in requirements.txt and probably should.

This patch is against the python-rewrite branch.

Supersedes #904

mcdonc commented 4 months ago

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Looks like some flattening work was done in https://github.com/cachix/devenv/issues/889

domenkozar commented 4 months ago

Did the Python tests fail before? :thinking:

mcdonc commented 4 months ago

Did the Python tests fail before? 🤔

I'm glad there are so many tests to fail! I haven't looked at each and every one, but most seem unrelated, and similar to failings like the recent Rust PR.

Dumb question. How can I run a subset of the tests locally?

mcdonc commented 4 months ago

The failure https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548496?pr=912 is the issue https://github.com/cachix/devenv/issues/909

mcdonc commented 4 months ago

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Rather than using a Python lib to do this, we could parse the root requirements file and recursively concatenate all of the files referenced by -r into a single requirements.txt and recursively concatenate all the files referenced by -c into a single constraints.txt; this would negate the need to have a lib that was able to deal with envvars or -e or whatnot.

mcdonc commented 4 months ago

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Rather than using a Python lib to do this, we could parse the root requirements file and recursively concatenate all of the files referenced by -r into a single requirements.txt and recursively concatenate all the files referenced by -c into a single constraints.txt; this would negate the need to have a lib that was able to deal with envvars or -e or whatnot.

Like this: https://gist.github.com/mcdonc/2653bbfbd33032a12464c8f64c183aff

bobvanderlinden commented 4 months ago

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR:

https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'
mcdonc commented 4 months ago

From what I understand, the goals of the PR should already have been met:

Don't recreate Python venv unless Nix wrapper has changed.

Using the condition:

if [ "$(${readlink} "$VENV_PATH"/bin/python)" != "$(${readlink} ${package.interpreter}/bin/python)" ]

This test will always return false currently. See https://github.com/cachix/devenv/issues/920 .

Don't run pip install -r /nix/store/xxx-requirements.txt unless requirements have changed.

Using the condition:

[ "$(${readlink} "$VENV_PATH"/requirements.txt)" != "$(${readlink} ${if requirements != null then requirements else "$VENV_PATH/requirements.txt"})" ]

In what situation was this not working correctly? Do you have some reproducible steps to see the behaviour failing with the version on main?

There is never a "requirements.txt" within the venv. There might be one within $DEVENV_ROOT, but it's not guaranteed to be named "requirements.txt". We also shouldn't need to recreate the venv when the requirements change, we should just need to rerun pip install of the requirements.

The current behavior for either means the venv is recreated on every invocation of devenv shell. The reproducible step is to run it once, exit the shell, then run it again without any changes and see the venv get recreated. The PR is against python-rewrite, so the behavior applies only to it.

We have a few bugs interacting:

mcdonc commented 4 months ago

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR:

https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'

AFAICT, there is no guarantee that LD_LIBRARY_PATH will exist within the shell?

bobvanderlinden commented 4 months ago

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR: https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'

AFAICT, there is no guarantee that LD_LIBRARY_PATH will exist within the shell?

With the current way .so libraries are linked, it is needed for some of the python packages to operate correctly. There probably is going something wrong in the python-rewrite branch that doesn't set the envvar when using poetry.

mcdonc commented 4 months ago

I'm giving up on the -r/-c requirements.txt flattening due to impurity restrictions exercised by Nix. I've put all I discovered about it in #889. This means the PR stands as-is delta the venv-recreate on interpreter-checking code, which can be simplified when #920 is fixed.

mcdonc commented 4 months ago

With the current way .so libraries are linked, it is needed for some of the python packages to operate correctly. There probably is going something wrong in the python-rewrite branch that doesn't set the envvar when using poetry.

We have some longstanding bugs/assumptions and multiple branches interacting, I think, which makes this complex and hard to describe, but I don't think that the python-poetry example should be relying on LD_LIBRARY_PATH, even though it happens to work on main.

Assumptions:

Bugs:

It looks like the main branch does indeed set an LD_LIBRARY_PATH in some cases, but I think it's incidental. And on the python-rewrite branch, it might exist within the Python process, but if all goes right, there is at least one scenario where it won't.

Rationale:

The python-rewrite branch, unlike main, creates a wrapper script (https://github.com/cachix/devenv/blob/python-rewrite/src/modules/languages/python.nix#L13) for the Python defined in languages.python.version, and has two new additional options: languages.python.libraries and languages.python.manylinux, which are meant to work in concert with the wrapper. This is what sets LD_LIBRARY_PATH for the Python process. This doesn't work currently but if #920 is fixed it will (and it must be, if we want to use a wrapper, and the wrapper is kinda the raison-d'etre of the branch AFACIT).

You can of course set env.LD_LIBRARY_PATH yourself and ignore languages.python.libraries. This will indeed put an LD_LIBRARY_PATH in the shell environment, but this is the only time one would necessarily be present. But if there isn't anything in languages.python.libraries and languages.python.manylinux is false, there won't necessarily be an LD_LIBRARY_PATH in Python's os.environ. There will not necessarily be an LD_LIBRARY_PATH in the shell env because all library path setting is presumed to be handled by the Python wrapper binary.

So I think the right fix is to change the test, because I think it's relying on an artifact of how main is working (which I don't actually understand, but I am pretty confident that the shell, even on main, might not have an LD_LIBRARY_PATH in it unless you set one explcitly).

mcdonc commented 4 months ago

@bobvanderlinden maybe you can help me understand the test in examples/python-poetry/.test.sh (or really any of the example tests)... is its output checked by some validator or is the test just that all of these commands complete?

bobvanderlinden commented 4 months ago

This test will always return false currently. See https://github.com/cachix/devenv/issues/920 .

Hmm, this is probably a recent thing that broke it. It makes sense to fix it, but the current behaviour:

if [ "$(${readlink} "$VENV_PATH"/bin/python)" != "$(${readlink} ${package.interpreter}/bin/python)" ]

does still seem like it should work. Just that there is a bug in nixpkgs that results in a wrong "$VENV_PATH"/bin/python. A workaround in devenv makes sense while the issue is being resolved in nixpkgs.

There is never a "requirements.txt" within the venv. There might be one within $DEVENV_ROOT, but it's not guaranteed to be named "requirements.txt". We also shouldn't need to recreate the venv when the requirements change, we should just need to rerun pip install of the requirements.

Ah sorry, that line is indeed wrong. It should indeed be fixed.

Nothing whatsoever (except the user) sets an LD_LIBRARY_PATH for Python or the shell on the python-rewrite branch.

The main branch sets LD_LIBRARY_PATH for the entire shell environment. python-rewrite sets it inside the python wrapper. See https://github.com/cachix/devenv/pull/745/files#diff-f5bb89f776eda1a758db67c6c75b19ff1665b2f10998d17d1d98479835fc9e2eR16-R26. So if languages.python.libraries is set (which should be done for some of the python libraries that are being tested), LD_LIBRARY_PATH should be available inside python.

The .test.sh files merely need to succeed. The output does not matter, but the output can help quite a bit to get insight why a test fails. It was probably the reason to output LD_LIBRARY_PATH inside the test inside of python. It's likely that removing the line that is failing right now (printing LD_LIBRARY_PATH) will expose that loading zlib and pjsua will fail (because they aren't available in LD_LIBRARY_PATH). The test likely needs to move those packages from packages to languages.python.packages.

I'm not entirely sure why these tests aren't running on the python-rewrite branch. I'll try to create a PR that fixes the poetry test for the python-rewrite branch.

mcdonc commented 4 months ago

Nothing whatsoever (except the user) sets an LD_LIBRARY_PATH for Python or the shell on the python-rewrite branch.

The main branch sets LD_LIBRARY_PATH for the entire shell environment. python-rewrite sets it inside the python wrapper. See https://github.com/cachix/devenv/pull/745/files#diff-f5bb89f776eda1a758db67c6c75b19ff1665b2f10998d17d1d98479835fc9e2eR16-R26. So if languages.python.libraries is set (which should be done for some of the python libraries that are being tested), LD_LIBRARY_PATH should be available inside python.

920 is preventing it from working currently, but that's my understanding as well.

The .test.sh files merely need to succeed. The output does not matter, but the output can help quite a bit to get insight why a test fails. It was probably the reason to output LD_LIBRARY_PATH inside the test inside of python. It's likely that removing the line that is failing right now (printing LD_LIBRARY_PATH) will expose that loading zlib and pjsua will fail (because they aren't available in LD_LIBRARY_PATH). The test likely needs to move those packages from packages to languages.python.packages.

I'm not entirely sure why these tests aren't running on the python-rewrite branch. I'll try to create a PR that fixes the poetry test for the python-rewrite branch.

Cool!