nix-community / nix-direnv

A fast, persistent use_nix/use_flake implementation for direnv [maintainer=@Mic92 / @bbenne10]
MIT License

Infinite Looping when entering a flake that has a FHSenv. #496

Open Karidus-423 opened 1 month ago

Karidus-423 commented 1 month ago

flake.nix

{
  description = "Beehive Project Python Environment Flake";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        name = "Bee-Env";
        system = "x86_64-linux";
        pkgs = nixpkgs.legacyPackages.${system};
      in
      {
        devShells.default = (pkgs.buildFHSEnv {
          inherit name;

          targetPkgs = pkgs: (with pkgs; [
            python3
            python311Packages.pip
            nodePackages_latest.pyright
          ]);
          runScript = "zsh";
        }).env;
      });
}

.envrc

use flake

This causes the following loop of about 35 loads.

image

To reproduce, use the flake.nix above together with an .envrc containing use flake. This only happens with direnv; loading the shell manually doesn't cause any issues.

File Tree image

bbenne10 commented 1 month ago

I don't know that this causes problems, but you use flake-utils.lib.eachDefaultSystem only to override system later?

I suspect that the problem is in calling zsh as the runScript, as that will reload your shell as if you've cd'd into this directory and that will retrigger direnv.

I can't test these fixes at the moment as I am on a darwin host and buildFHSEnv seems to require glibc-nogcc which is not available on darwin.

Karidus-423 commented 1 month ago

Sadly, that still happens when I don't use zsh as the runScript. I also only added flake-utils.lib.eachDefaultSystem while trying to fix the issue. So it is probably mainly due to buildFHSEnv, since we enter an isolated filesystem. Now that I think about it, once we are inside it there might be no way for direnv to know that the environment has already been loaded.

bbenne10 commented 1 month ago

Well, if you are overriding system, eachDefaultSystem (system: can't be doing anything. Can you try runScript = "bash" (or runScript = "${pkgs.bashInteractive}/bin/bash")? If that shell isn't set up to run direnv, we should see that the shell is not relaunched?

I think we can probably provide and document a workaround for this particular problem, but I don't think that this ultimately boils down to a nix-direnv specific issue.

Karidus-423 commented 1 month ago

Tried runScript = "bash" and runScript = "${pkgs.bashInteractive}/bin/bash". The problem kept happening.

bbenne10 commented 1 month ago

So replacing runScript with echo "foo" causes evaluation to no longer infinitely recurse. However, I am not sure it's actually applying the environment at all? (Replacing it with echo $PATH causes my current $PATH to be echoed?).

nix print-dev-env . from a directory containing this flake prints out the important bits (and is what we run internally). There is a call to bubblewrap in there, which gives me pause as to whether we are going to get anything out of this non-trivially...

I'll have to come back in the morning and think about this a bit more.

sweetbbak commented 1 month ago

I'm having the same issue. I think it may be a combination of $SHELL being set to zsh and calling zsh in runScript or exec zsh in shellHook. If I make the shell bash and remove these lines, cd-ing into a directory with nix-direnv evaluates the flake/nix file as expected, but if I then run zsh to switch shells, it loops back to evaluating the flake/nix file and jumps back to bash. This happens with dead simple flakes and dead simple nix files. I will follow up if I find out more about it.

bbenne10 commented 1 month ago

This happens with dead simple flakes and dead simple nix files.

What does "dead simple flakes" mean here? Notably, is this a problem that shows up independent of calling buildFHSEnv? If so, can you share a flake that's causing you problems? My current best guess is that it is buildFHSEnv that's causing some looping, but if that's not the case, I most certainly want to know that!

bbenne10 commented 1 month ago

I think that the root of this issue is as follows:

This then repeats infinitely. I am not yet sure how to address this.

I will say that using an FHSEnv for just the simple task of getting python with pip is unnecessary. You can use shellHook to do the pip install and python3.withPackages to get python + pip easily. I'd like to figure out if there's a way to make this work, but I suspect that right now the answer is "don't use FHSEnv with nix-direnv until we maybe fix the issue"...
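
A minimal sketch of that suggestion, assuming a requirements.txt next to shell.nix and a project-local install directory (.pip-packages is an illustrative name, not something from the original flake). It is untested against the reporter's setup:

```nix
# shell.nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  packages = [
    # Python interpreter bundled with pip, both taken from nixpkgs.
    (pkgs.python3.withPackages (ps: [ ps.pip ]))
  ];

  shellHook = ''
    # Assumption: pip installs go into a directory inside the project
    # instead of the read-only Nix store.
    export PIP_PREFIX="$PWD/.pip-packages"
    export PYTHONPATH="$PIP_PREFIX/${pkgs.python3.sitePackages}''${PYTHONPATH:+:$PYTHONPATH}"
    export PATH="$PIP_PREFIX/bin:$PATH"

    # Assumption: dependencies are listed in requirements.txt.
    if [ -f requirements.txt ]; then
      pip install --quiet -r requirements.txt
    fi
  '';
}
```

Since this is a plain mkShell, no subshell or sandbox is involved, so an .envrc containing use nix should load it in a single pass.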

zeyugao commented 3 weeks ago

Getting rid of FHSEnv is not an easy task. Once conda and mamba are involved, the documentation requires using FHSEnv, which makes it very complex.

bbenne10 commented 3 weeks ago

This gets very close to a thing I have been thinking about outside of nix-direnv for a long while: Python packaging and nix.

Tl;dr: Don't use mamba or conda. Use nix. Make the ecosystem uniform and the problem goes away.

Python's packaging problems are many and varied. I won't even attempt to go over them here, but they're MAYBE starting to get solved with things like poetry and flit, but the problem is that there are TOO MANY third party solutions that are all 80% as good as the nix solutions we already have. Nix isn't perfect - far from it - but it is better in my experience than the ad-hoc solutions it is supplanting.

Conda and mamba are a first pass at this, but they're not as all-encompassing as nix and that makes them worse. If you can package your stuff as nix derivations (and with some effort, you can) you should do that rather than rely on conda/mamba/etc. to do the "external" packaging for you.

That being said: I don't know you, your organization, or your project, so I won't say that that's always possible or preferable to hacking nix into the project. But I can say that I don't know how nix-direnv fits in yet, because the FHSEnv approach spawns a new shell and sanitizes the environment at least partially and then relaunches direnv.

I haven't had the time to look at how to solve this problem yet, and I likely won't for a while. If you'd like to have a look over how to solve this, please feel free - I'm happy to review and merge PRs.

Additionally, I will note that that doesn't look like it requires FHSEnv at all. It uses it, but it doesn't look like it's completely necessary? I think you can do the same with the impure flag and a variable pointing at the repo root?

For instance, in .envrc:

export REPO_ROOT=$PWD
use flake . --impure

In your flake:

# ... initialization
pkgs.mkShell {
  packages = [
    pkgs.micromamba
  ];
  shellHook = ''
    set -e
    eval "$(micromamba shell hook --shell=posix)"
    export MAMBA_ROOT_PREFIX="$REPO_ROOT/.mamba"
    if ! test -d $MAMBA_ROOT_PREFIX/envs/my-mamba-environment; then
        micromamba create --yes -q -n my-mamba-environment
    fi
    micromamba activate my-mamba-environment
    micromamba install --yes -f conda-requirements.txt -c conda-forge
    set +e
  '';
};

We do something very similar to this to manage a virtualenv in a few projects at my job. It works just fine.

This isn't to say that I don't want to support FHSEnv. I just don't yet know how and given that there are many workarounds given some lateral thinking, I don't really have it as a priority right this moment.

Mic92 commented 3 weeks ago

If you have to use mamba or conda, here is another escape hatch: LD_LIBRARY_PATH. When I quickly have to get a virtualenv working, I use it like this:

mkShell {
  shellHook = ''
    export LD_LIBRARY_PATH=$NIX_LD_LIBRARY_PATH
  '';
}

I populated NIX_LD_LIBRARY_PATH like this: https://github.com/Mic92/dotfiles/blob/f478332ab110985d615a9a8bdc56dd50b53448c4/nixos/modules/nix-ld.nix#L7

Of course this is not great for working in a team... in this case I would create a wrapper around python that sets LD_LIBRARY_PATH just for the python executable.
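
A rough sketch of such a wrapper, assuming the needed libraries are known up front (the runtimeLibs list and the pythonWithLibs name below are illustrative placeholders, not taken from the linked dotfiles):

```nix
# shell.nix
{ pkgs ? import <nixpkgs> { } }:

let
  # Assumption: these are the shared libraries the prebuilt wheels need.
  runtimeLibs = pkgs.lib.makeLibraryPath [
    pkgs.stdenv.cc.cc.lib
    pkgs.zlib
  ];

  # Hypothetical wrapper: sets LD_LIBRARY_PATH for this one process tree
  # only, then hands over to the real interpreter.
  pythonWithLibs = pkgs.writeShellScriptBin "python" ''
    export LD_LIBRARY_PATH="${runtimeLibs}''${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    exec ${pkgs.python3}/bin/python "$@"
  '';
in
pkgs.mkShell { packages = [ pythonWithLibs ]; }
```

Only processes started through this wrapper see the variable, so the rest of the shell keeps its normal library lookup. A virtualenv's own bin/python may bypass the wrapper, in which case the wrapper would have to point at that interpreter instead.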

ierturk commented 3 weeks ago

The issue also occurs with the following simple nix-shell setup:

# shell.nix
{ pkgs ? import <nixpkgs> {} }:
(pkgs.buildFHSEnv {
  name = "test-env";
  targetPkgs = pkgs: (with pkgs; [
  ]);
  multiPkgs = pkgs: (with pkgs; [
  ]);
}).env

# .envrc
use nix

[me@samsung-lt:/Lab/WSs]$ cd nix
direnv: loading /Lab/WSs/nix/.envrc
direnv: using flake
path '/Lab/WSs/nix' does not contain a 'flake.nix', searching up
error: could not find a flake.nix file
direnv: nix-direnv: Evaluating current devShell failed. Falling back to previous environment!
direnv: export +NIX_DIRENV_DID_FALLBACK ~PATH

[me@samsung-lt:/Lab/WSs/nix]$ cd ..
direnv: unloading

[me@samsung-lt:/Lab/WSs]$ cd nix
direnv: error /Lab/WSs/nix/.envrc is blocked. Run direnv allow to approve its content

$ direnv allow
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: nix-direnv: Renewed cache
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: ([/nix/store/35klgzald67mkslqb9kkv01gn98zfbza-direnv-2.34.0/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
direnv: ([/nix/store/35klgzald67mkslqb9kkv01gn98zfbza-direnv-2.34.0/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: ([/nix/store/35klgzald67mkslqb9kkv01gn98zfbza-direnv-2.34.0/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: ([/nix/store/35klgzald67mkslqb9kkv01gn98zfbza-direnv-2.34.0/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
direnv: loading /Lab/WSs/nix/.envrc
direnv: using nix
direnv: ([/nix/store/35klgzald67mkslqb9kkv01gn98zfbza-direnv-2.34.0/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
...

bbenne10 commented 3 weeks ago

Yes. The issue is not the python environment, but rather the buildFHSEnv call, as I have tried to make clear.

buildFHSEnv ends up with a bwrap call in the print-dev-env output, which causes the environment to be cleared and a new shell to be spawned over and over again. I remain unsure of a solution for nix-direnv regardless of the tech wrapped inside the buildFHSEnv call.

Mic92 commented 3 weeks ago

Not just nix-direnv but direnv in general is not compatible with solutions that require subshells.

nick4f42 commented 3 weeks ago

In my case, I only wanted the FHSEnv for a single program, so I was able to do this:

# .envrc
use nix

# shell.nix
{
  pkgs ? import <nixpkgs> { },
}:

let
  julia-fhs = pkgs.buildFHSEnv {
    name = "julia";
    runScript = "${pkgs.julia-bin}/bin/julia";
  };
in
pkgs.mkShell { packages = [ julia-fhs ]; }
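
The same idea could probably be carried over to a flake like the one in the original report. The following is only a sketch under that assumption: python-fhs is a made-up name, and the package list is trimmed down from the reporter's flake.

```nix
# flake.nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";

  outputs = { self, nixpkgs, ... }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};

      # Hypothetical FHS-wrapped Python. The sandbox is entered only when
      # the python-fhs command is run, not when the dev shell is loaded.
      python-fhs = pkgs.buildFHSEnv {
        name = "python-fhs";
        targetPkgs = pkgs: [ pkgs.python3 pkgs.python3Packages.pip ];
        runScript = "python3";
      };
    in
    {
      # A plain mkShell, so direnv has no subshell to deal with.
      devShells.${system}.default = pkgs.mkShell {
        packages = [ python-fhs ];
      };
    };
}
```

With an .envrc containing use flake, the dev shell itself stays an ordinary environment, and the FHS wrapper is just another program on PATH.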