Finally handle wrappers correctly to make them invisible to applications

NixOS / nixpkgs

Nix Packages collection & NixOS

MIT License

17.4k stars 13.62k forks source link

Finally handle wrappers correctly to make them invisible to applications #150841

Open tobiasBora opened 2 years ago

tobiasBora commented 2 years ago

Describe the problem

In Nix, wrappers are required to properly configure paths and can cause dirty issues https://github.com/NixOS/nixpkgs/issues/60260. In this thread, I focus on one issue of wrappers.

So far, wrapping is done using exec -a "$0" application-unwrapped. When wrapping an executable (i.e. not a script), I think it works nicely as the program cannot really detect that it is inside a wrapper (if the -a is forgotten, it can indeed cause troubles as I just experimented it https://github.com/NixOS/nixpkgs/issues/60260). However, for scripts the $0 variable is not propagated (this seems to be a quite fundamental issue due to the fact that the interpreter of the scripts does not care about $0): as a consequence, the program thinks that it's name is program-unwrapped. Most of the time it is not a great problem (except that help page may sounds funny with unwrapped paths), but sometimes it can cause more troubles. This is the case notably when the program tries to use $0 to start a new instance of himself (at runtime, or to respawn later, why not in a cron or as a new application...). This is done in carla https://github.com/NixOS/nixpkgs/issues/117781 but also in chromium https://github.com/NixOS/nixpkgs/issues/150826 (chromium is a binary, so it's not too much of an issue...) or in Awesome window manager https://github.com/NixOS/nixpkgs/issues/60229. It requires additional patches, from inside Nixpkgs that may be hard to maintain later and cause additional debugging time from the maintainer.

Steps To Reproduce

Steps to reproduce the behavior:

Create a file myscript.sh:

#!/usr/bin/env bash
echo "My name is $0, and MYENV is $MYENV"

the file derivation.nix:

{ stdenv, makeWrapper }:
stdenv.mkDerivation rec {
  name = "-${version}";
  version = "";

  src = ./.;

  nativeBuildInputs = [ makeWrapper ];

  installPhase = ''
    mkdir -p $out/bin
    cp myscript.sh $out/bin
    chmod +x $out/bin/myscript.sh
    patchShebangs $out/bin/myscript.sh
    wrapProgram $out/bin/myscript.sh --set MYENV "WIN :)"
  '';
}

the file default.nix:

{ pkgs ? import <nixpkgs> {} }:
pkgs.callPackage ./derivation.nix {}

and run:

$ nix-build
$ ./result/bin/myscript.sh 
My name is /nix/store/yc5nf2cdbjzn35n6vqlsmbbi0f5vhymf--/bin/.myscript.sh-wrapped, and MYENV is WIN :)

Expected behavior

I'd expect to see:

My name is ./result/myscript.sh, and MYENV is WIN :)

Proposed solution

A solution would be not to add a wrapper replacing the script, but to put the wrapper code directly in the interpreter of the script as I proposed it here:

{ stdenv, runtimeShell, makeWrapper, writeShellScriptBin }:
let
  # TODO: A maybe cleaner solution would be directly copy the first line of the file, remove the #! and take it as interpreter.
  # In case a patchshebang is required, we can maybe do the patchshebang before the wrapping so that the new interpreter.
  # directly takes the appropriate patchshebang.
  # For this example we know we are running a bash shell, so we just use runtimeShell directly.
  # (for python scripts, replace runtimeShell with something like ${python3}/bin/python3 instead)
  myWrapper = writeShellScriptBin "myWrapper" ''
    # Setup environment variables...
    export MYENV="WIN :-)"
    # Call the original wrapper, and forward all arguments.
    exec ${runtimeShell} "$@"
  '';
in
stdenv.mkDerivation rec {
  name = "-${version}";
  version = "";

  src = ./.;

  nativeBuildInputs = [ makeWrapper ];

  installPhase = ''
    mkdir -p $out/bin
    cp myscript.sh $out/bin
    chmod +x $out/bin/myscript.sh
    # TODO: find a more elegant solution to replace the shebang. (see discussion above comment)
    substituteInPlace $out/bin/myscript.sh --replace '#!/usr/bin/env bash' '#!${myWrapper}/bin/myWrapper'
  '';
}

Result:

My name is ./result/myscript.sh, and MYENV is WIN :-)

Artturin commented 2 years ago

https://github.com/NixOS/nixpkgs/pull/124556 probably fixed this please test

tobiasBora commented 2 years ago

Thanks, but no it does not fixes it for the reasons I mentioned above (I even tried with the latest PR in https://github.com/NixOS/nixpkgs/pull/150079).

For reproducibility, here is my default.nix:

{ pkgs ? import (builtins.fetchTarball {url = "https://github.com/bergkvist/nixpkgs/archive/refs/heads/darwin-binary-wrapper-fixes.tar.gz";}) {} }:
pkgs.callPackage ./derivation.nix {}

my derivation.nix:

{ stdenv, makeBinaryWrapper }:
stdenv.mkDerivation rec {
  name = "-${version}";
  version = "";

  src = ./.;

  nativeBuildInputs = [ makeBinaryWrapper ];

  installPhase = ''
    mkdir -p $out/bin
    cp myscript.sh $out/bin
    chmod +x $out/bin/myscript.sh
    patchShebangs $out/bin/myscript.sh
    wrapProgram $out/bin/myscript.sh --set MYENV "WIN :)"
  '';
}

myscript.sh:

#!/usr/bin/env bash
echo "My name is $0, and MYENV is $MYENV"

output:

$ ./result/bin/myscript.sh 
My name is /nix/store/gml55hvxlrwbbsfd49xyrj440w9dnha7--/bin/.myscript.sh-wrapped, and MYENV is WIN :)

and I checked, the binary wrapper contains:

# ------------------------------------------------------------------------------------
# The C-code for this binary wrapper has been generated using the following command:

makeCWrapper /nix/store/gml55hvxlrwbbsfd49xyrj440w9dnha7--/bin/.myscript.sh-wrapped \
    --inherit-argv0 \
    --set 'MYENV' 'WIN :)'

pennae commented 2 years ago

scripts getting the wrong $0 is expected behaviour according to the bash manpage:

If arguments remain after option processing, and neither the -c nor the -s option has been supplied, the first argument is assumed to be the name of a file containing shell commands. If bash is invoked in this fashion, $0 is set to the name of the file, and the positional parameters are set to the remaining arguments.

for bash scripts this may be solvable by having the wrapper set BASH_ARGV0 and then sourcing the target script instead of exec'ing, but that only works for bash and feels excessively fragil.e

it may also be fundamentally unsolvable for scripts with a /usr/bin/env shebang since env doesn't pass on argv[0]. env even fails if its argv[0] isn't env since the coreutils consolidation into a single binary; unsplit coreutils do work but still eat argv[0].

tobiasBora commented 2 years ago

Thanks for your comment, but I'm not sure to get your point.

The manpage only says that $0 should point to a bash script right? Which is what my solution proposes (note that this is not the case of https://github.com/NixOS/nixpkgs/pull/124556 in that case the bash script is in fact an executable and will not run at all when typing bash myscript). Admittedly, running bash myscript with my proposed solution will not provide the appropriate variable environment (except by patching bash, by sourcing the target script as you mentioned, or simply to add the patching code at the beginning of the script), only ./myscript will do. However, I argue that in practice this is not too much of a problem: indeed, when we first start a program (say by clicking on its desktop icon), we typically run the executable. Now, if internally the program calls bash myscript, then the environment variable are already properly configured since it was already called with the appropriate environment before and environment variable propagates nicely. Actually, the proposed solution has some other advantages: it is for instance possible now to write python myscript.py (with the same consequences as the one we have for bash of course) while before this command would have failed.

Of course, the more robust solution would certainly to add the environment variable directly at the beginning of the script. But it means that we need to know for each scripting language a way to populate properly and robustly the environment variable, which may be hard to do reliably. But if I'm not wrong it's how the python wrapper already works right? After, another solution is imaginable: we can apply the robust solution which edit each script when the language is known, and if not fallback to a generic, language agnostic solution.

For /usr/bin/env, I've a few remarks:

first, when packaging, nix anyway tries to get rid of /usr/bin/env as it adds unpurity. This is what I did in my above solution using patchShebangs.
secondly, I'm not sure what you mean when you say that /usr/bin/env does not pass an argv[0], but for sure scripts calling /usr/bin/env can use $0. The above script is an example, and here is another:
```
$ cat test.py
#!/usr/bin/env python3
```

import sys print("Here is my name {}".format(sys.argv[0]))

$ ./test.py Here is my name ./test.py



If I did not get your point, maybe you could propose a precise scenario in which my proposed solution fails when packaging a program, and is a regression compared to the current system?

ncfavier commented 2 years ago

I like this solution, and am actually wondering if it could be extended to also work with binary programs, e.g. by replacing the interpreter segment of ELF files with a wrapped version. The downside is that wrapped binaries would be less discoverable.

pennae commented 2 years ago

@tobiasBora i should have been more clear about that, but you did get the gist: it seems very wrong that running a script plain (./result/bin/myscript.sh) vs running it with explicit interpreter (bash ./result/bin/myscript.sh) should produce different results. like you say this isn't a problem when "it's a bash script" is am implementation detail, but if the wrapped thing is intended to also be sourced as a library things get messy. currently that would fail loudly, with your proposed solution it could half work. admittedly this isn't a very common case, but it is worth thinking about.

re env: please ignore, that was just a red herring i slipped on.

ncfavier commented 2 years ago

I don't think that's a real problem: in the case you're describing, you probably wouldn't want to source the environment variables set by the wrapper. It's already quite common to have ./foo and bash foo behave differently, e.g. when using a nix-shell shebang.

tobiasBora commented 2 years ago

@pennae thanks for the clarification. And indeed, it's something to keep in mind that it might work, or semi-work while today it either works (bash) or crash completely. As far as I know I see only two solutions to this problem:

either prepend to each script a language-dependent code that setups the environment variables (and fallback when the language is not known to a generic method)
or to patch all interpreters bash/python/... to make sure that bash myprogram my arg internally calls ./myprogram my arg if ./myprogram is patched (or, simpler, if it is an executable). It may be easier to apply that solution since this patch would more or less be identical for all interpreters. It would also make the nix-shell shebang usable even with bash myfile. But I'm not sure if this can have other problems. In particular we need to make sure that the interpreter can detect if a patch was already applied (for instance using an environment variable) to avoid infinite loop where bash calls bash that calls bash...

But considering the above discussion it may also never cause any troubles...

tobiasBora commented 2 years ago

@ncfavier For elf files, your solution could work, but I'm not enough experimented to say more. But the advantage may be less clear for binaries since as far as I know exec -a correctly propagates $0. I also thought to patch directly the interpreter for the binaries ld, but there is no interpreter for statistically linked ld.

samueldr commented 2 years ago

Also note that /proc/$PID/exe is sometimes used in a similar fashion to argv[0].

roberth commented 2 years ago

Another failure mode we're not currently experiencing is where the wrapper behavior should not be repeated. Some changes to environment variables would be redundant and potentially harmful if they pile up after multiple self-execs. The same could apply to args, depending on how the wrapped program performs the self-exec.

samueldr commented 2 years ago

If we're airing out all things wrappers, which we probably shouldn't, environmental pollution is antithetical to the encapsulation and hermeticity promises of Nix.

Executing the result of a build from another result of a build may (will) give different results.

E.g. running a KDE app in the Plasma desktop vs. a "non-DE" WM setup.

tobiasBora commented 2 years ago

@samueldr And is any proposed solution better with respect to /proc/$PID/exe? I've the feeling that all solutions (the current one and the proposed one) should work fine with it since exec does not change the pid. Am I missing something?

@roberth The solution I can imagine is that the wrapper can also define a random name for an environment variable (why not based on the hash of its path in /nix/store to enforce determinism) say ABCD, and when the wrapper is called it can check if this variable ABCD is present in the environment, and if not it populates the environment and defines the ABCD variable. Should be like 3 lines to program, the only thing I'm not sure is how we could have access to the hash of the current derivation in nix to define ABCD at build time. But it's true it's something that can cause troubles when the wrapper is called so many times that the size of the PATH goes beyond the authorized limit. Do you have other examples of why pilling up exec may be harmfull?

@samueldr But environmental pollution is sometimes necessary right? For instance I guess that the user may want to define XDG_CONFIG or other things. Also, without environmental pollution it means that it would be impossible to run a program A from a program B if B did not explicitly specified it was allowed to run A. For instance image viewers would be unable to propose "open with" to open the image with the programs available on the computer. Am I missing something?

samueldr commented 2 years ago

But environmental pollution is sometimes necessary right? For instance I guess that the user may want to define XDG_CONFIG or other things.

Sure, if the user defines it in the user-controlled domain (e.g. their .profile) it's not pollution. It's control.

Also, without environmental pollution it means that it would be impossible to run a program A from a program B if B did not explicitly specified it was allowed to run A. For instance image viewers would be unable to propose "open with" to open the image with the programs available on the computer. Am I missing something?

The list of installed applications viewed by a process shouldn't depend on the parent processes' packaging. So I assume that you're probably missing something.

Let's present the issue with environmental pollution with a theoretical GUI/app toolkit [TTK] that needs apps to be wrapped, just like there are some right now in Nixpkgs.

"TTK" apps expect runtime libraries. here image loaders, to be provided with some conventions:

A single predefined "FHS" location /usr/lib/ttk/image-loaders/ (can't work with the nix store)
At locations defined in TTK_PLUGIN_PATH (scheme commonly used in Nixpkgs with wrappers)

Let's package "correctly" application A with the TTKWrapper and wrap it with the plugin path such that it can load additional image types, like .ico or .psd via additional libraries. TTK_PLUGIN_PATH=/nix/store/...ttk-psd/lib/ttk/:/nix/store/...ttk-winimages/lib/ttk/:"$TTK_PLUGIN_PATH".

Great, application A can now open ico and psd files.

Let's package "incorrectly" application B without the TTKWrapper entirely. The application may start correctly even without any environmental manipulation. Some toolkits are resilient about that. This one will only know about the image formats built into TTK. Fine.

Now, assume that application A can open application B as a child process. Application B will inherit A's TTK_PLUGIN_PATH. Under these circumstances, the application B behaviour will differ from when it's used directly; it will be able to load ico and psd files, which it normally wouldn't be able to.

It may not look like a big deal, until you think about further scenarios, like a complete desktop environment made with TTK which seeds the TTK_PLUGIN_PATH with a lot of "plugin paths". This means that an application will behave differently not because of user configuration or user settings, but because of a deficiency in the packaging and encapsulation. These are side-effects.

There are actual issues, other than "eww this is impure". Using a nix-build't or nix-shell program with mismatched TTK_PLUGIN_PATH ABIs will cause pain. These could also be nix-env -installed, or installed through another mechanism from another Nixpkgs commit; e.g. import an application from unstable on your stable system.

This is not the only scenario where it happens. If a programming language runtime uses the environment to resolve dependencies, it's going to fail the same ways (I think python does that?).

Any environment change done in a wrapper to "fix" things is, in my opinion, technical debt. Furthermore, more often than not this technical debt is moved from a central core location (e.g. the toolkit) and pushed onto the unsuspecting packagers.

The solution to this, sadly, will often be to hook a (hopefully small) code change where the toolkit, or language, reads the environment for such needs, and read it from a static well-known package-dependent location. For example, /nix/store/...application-B/nix-support/ttk-plugin-path. This way if application B is executed from application A, the environment is still clean, and only B's own dependencies will be loaded.

samueldr commented 2 years ago

@samueldr And is any proposed solution better with respect to /proc/$PID/exe? I've the feeling that all solutions (the current one and the proposed one) should work fine with it since exec does not change the pid. Am I missing something?

I think so. But AFAICT will scale to binaries; unless I'm missing something, it only concerns itself with #! scripts.

tobiasBora commented 2 years ago

Thanks a lot for the detailed example, it helps a lot. Concerning the problem with applications installed from unstable on a stable system, I also got a lot of issues due to that (that's part of the reason why I moved my whole system to unstable). Would your proposed solution also solve the problems related to errors about different glibc ABI versions? And could it also be extended to solve the issues related with incompatibilities between different versions of the drivers? (sorry if my last two sentences make no sense, but I think to remember that being able to properly handle drivers in a pure world can be quite challenging, and that opengl often suffers from that)

Concerning your proposed solution, I'm not sure to understand who would apply this code change and where ? I'm not sure to understand how it would differ from writing a wrapper that completely forgets all previous environment variables.

Also, I agree now that environment variables must sometimes be forgotten, but sometimes as we mentioned it before I think they should be propagated (like when the user configures it). One particular variable that I don't see how to handle properly is certainly the PATH: would you recommend to propagate it (and risk impurities), to not propagate it at all (and therefore lose important functionalities, like the "open with" functionality of image viewers), or propagate it partially (but then one risk to have both lose in functionalities and impurity). Or maybe we should have different behaviors depending on the environment variables (like environment variables configuring the list of libraries should be cleaned, and the one configuring executable or configuration should be kept untouched)?