NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
18.05k stars 14.04k forks source link

General problems with environment variable wrappers #60260

Open ambrop72 opened 5 years ago

ambrop72 commented 5 years ago

The use of wrapper scripts to set or change environment variables before running the real executable has become very widespread in nixpkgs. While they are important for making programs run out-of-the-box, wrappers have some general problems. Some of these could be fixed but some are perhaps unfixable. I'm writing this to generate some discussion to see if we could solve some of these issues or come up with an alternative solution.

  1. Environment variables are propagated to child processes. I believe this is the most problematic thing. The intent of the wrapper is to make the wrapped program run correctly, and the fact that child processes see the adjustment is unintended leakage of information/configuration. Any child process started by the wrapped program that also needs environment changes will generally have its own wrapper. This leakage could cause another application to run incorrectly or differently than if it was started "cleanly".

  2. Wrappers which add to list-type variables (typically colon-separated) will add the value unconditionally without checking if it is already in the list.

To observe (1) and (2) in action, open Konsole and run echo "$XDG_DATA_DIRS". Then type konsole in the same terminal and run the echo in the new terminal. You will see that the variable contains two entries for <konsole-store-path>/share. You can keep repeating this to get XDG_DATA_DIRS that grows without bounds.

  1. Wrappers obscure the process name. Just look at a process list and find many .something-wrapped processes.

  2. People forget to make wrappers and/or don't know which variables need to be set. For example, probably most GTK apps need to have GDK_PIXBUF_MODULE_FILE set to point to some file in the librsvg package. It is far from obvious to a random person packaging an app that this is necessary (see #60254 and #42562).

  3. Wrappers break software that opens the executable and expects it to be an ELF binary. I have encountered this with Wine but I have not yet filed an issue (it results in very suble but widespread breakage). But this is uncommon.

ambrop72 commented 5 years ago

I have an alternative design in mind, where we get rid of wrappers, instead packages declare in passthru what they need to work, and the environment ensures that variables are set globally (or just in the context of desktop sessions) such that declared requirements of all packages included in the environment are met. In particular, nixos/modules/programs/environment.nix would decide which variables to set to what based on the programs included in environment.systemPackages. Similar would need to be done to ensure variables from user environments are set.

But it does have the problem that when a profile is switched (nixos-rebuild switch or nix-env), associated variable changes would not affect running sessions. Maybe there is a way to address this but it probably cannot be simple (like, instead of setting variables in sessions, add wrappers to programs to source export files from all applicable profiles - but that brings back some of the general problems with wrappers).

matthewbauer commented 5 years ago

I have an alternative design in mind, where we get rid of wrappers, instead packages declare in passthru what they need to work, and the environment ensures that variables are set globally (or just in the context of desktop sessions) such that declared requirements of all packages included in the environment are met. In particular, nixos/modules/programs/environment.nix would decide which variables to set to what based on the programs included in environment.systemPackages. Similar would need to be done to ensure variables from user environments are set.

Remember we also need to support users outside of NixOS. They could be running the binary with any environment.

ambrop72 commented 5 years ago

Remember we also need to support users outside of NixOS. They could be running the binary with any environment.

Yeah, that is indeed one aspect of the problem. Wrappers do directly address this, but at the expense of creating a mess with environment variables, including when running in NixOS where this may be unneeded if suitable stuff is included in the profile. I'm wondering if we could, in the long term, drop support for running GUI apps "directly", outside of an environment, and provide a practical solution to encapsulate them in an environment. Like, the environment could be the one that generates correct wrappers, not the package itself.

ambrop72 commented 5 years ago

the environment could be the one that generates correct wrappers, not the package itself.

This would solve "3. Wrappers obscure the process name". It is also more flexible because all packages in the environment (such as whether you are running a specific desktop) could contribute to the variables set in other wrappers, e.g. including the KDE desktop in the environment could ensure that all other apps get support for SVG icons. Or special configuration options (e.g. support these GVFS plugins) could adjust some variable in all wrappers.

ambrop72 commented 5 years ago

By the way, "Wrappers obscure the process name" can already be solved relatively easily, by putting the executable into a subdirectory, e.g. something -> .wrapped/something.

matthewbauer commented 5 years ago

By the way, "Wrappers obscure the process name" can already be solved relatively easily, by putting the executable into a subdirectory, e.g. something -> .wrapped/something.

Might be worth trying, but there could be some issues with relative paths, symlinks, etc. To be clear, I 100% agree with the problems you are describing, I just know from experience that there's not a good alternative.

I think we could look into using a C-based wrapper instead to prevent the issues with script. This is a little bit less readable, but could solve 3 & 5. This is also an issue on macOS with python shebangs.

ambrop72 commented 5 years ago

I think we could look into using a C-based wrapper instead to prevent the issues with script. This is a little bit less readable, but could solve 3 & 5. This is also an issue on macOS with python shebangs.

Not sure about this. How does it help with (3)? And it also does not solve (5) in the sense that the program opening the executable will open the wrong executable (the wrapper) not the one it wanted to.

However, whatever we do with wrappers, we do not address "(1) Environment variables are propagated to child processes". We could look into more invasive approaches. For example, the wrapper could pass all such "variables" via a single specific environment variable (e.g. NIX_WRAPPER_VARS), and the binary would, very early at startup, read this environment variable, save its contents, and unset the original environment variable (so it will not be passed to child processes). The application code would then use a specific library function like get_nix_wrapper_var to read it (the code would be patched to call that instead of getenv), or getenv in the stdlib could be patched to make it look like these are passed as environment variables.

ambrop72 commented 5 years ago

How does it help with (3)?

Ah you probably mean the ability to specify argv[0] in execve.

The idea for a C-based wrapper seems good. It could also help solving (2) (not adding to list if already present) more elegantly. I will now try to implement this, probably with C++, to be used as an #! interpreter where the wrapper executable reads the remaining content that specifies the wrapped executable and which environment adjustments are needed.

ambrop72 commented 5 years ago

60229 is a particular recent manifestation of some of the problems outlined here - when the program restarts itself it calls the wrapper so the wrapper does all the adjustments after they were already done. In that specific case the problem is actually adding already-added command line arguments, but it must also be adding to environment variables GI_TYPELIB_PATH, LUA_PATH and LUA_CPATH redundantly.

I see that the current shell wrappers already set argv[0] to the proper program name (of the wrapper) trying to hide the existence of the wrapper, but that apparently does not affect the process name in the process list (I am guessing that is based on the name of the actual executable).

Maybe fiddling with argv[0] is a bad idea (considering this Awesome problem) and the wrapper should just call the executable with its nix store path (whether or not the wrapper is part of the package or made by the environment)? Is there a reason wrappers are currently setting argv[0]? Is it only so that the command line is preserved, if someone looks at it in the process list?

ambrop72 commented 5 years ago

Another problem with the current wrappers is that since it uses bash it cannot work with setuid. This would be solved by the use of a C-based wrapper.

I've made some progress making a C++ wrapper, but it cannot be a drop-in replacement because it's not bash. For example:

In general it seems like like can be done, but some cases would need to be handled differently, possibly by patching the application code or by sticking with the bash wrapper.

ambrop72 commented 5 years ago

Here's a prototype of a C++ based wrapper. There is one executable which is both the interpreter for wrappers and a tool to generate the wrapper file. The latter is so because escaping strings correctly in bash would be a pain. It works for a few test case packages. Prefix and suffix functionality work such that any previous occurrences of added elements are removed.

I don't think it makes sense to go replacing the current wrappers with this. It would be some work and not much benefit (which is the check for existing occurrences of elements, and perhaps a small performance improvement). Many use cases could not be handled because it doesn't let you run bash code. On the other hand I like this limitation because it enforces some sanity. This wrapper could be a good starting point for a system where buildEnv generates wrappers.

joakim1999 commented 5 years ago

About 5. I have encountered issues with wine due to this on several occasions. The more severe of them is that when you create a 64-bit wine prefix with the package wineWowPackages.full it will create a prefix with ONLY 64-bit dlls. The syswow64 folder is completely empty. This may cause alot of issues and needs to be dealt with.

If you want to see this yourself I have made a nix-shell command that reproduces and logs only 64-bit dlls being copied and the error where wine can't be loaded as an ELF executable: nix-shell -p wineWowPackages.full --run "WINEPREFIX=$HOME/.problematicwine WINEDEBUG=+wineboot,+setupapi wineboot 2>&1 | grep -e fakedlls -e ELF -A 5 -B 5"

FRidh commented 5 years ago

I would like to have a binary that reads a nix-support/wrappers.json file. The file contains a mapping where the keys are paths and the values are arguments. We can then improve our hooks to just update that JSON file during build-time, also improving the situation when dealing with multiple hooks, preventing nested wrappers. https://discourse.nixos.org/t/wrappers-and-hooks-do-not-invoke-wrapprogram-directly/3551

Sample:

{
  "bin/foo": { "executable":  "/nix/store/....python", "arguments": [ "PATH=/nix/store.../bin"] },
}

the problem, as usual, is finding the path to the file, although that one could be given as argument to the wrapper binary.

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.
nixos-discourse commented 4 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/include-custom-environment-variable-into-wrapper/8229/2

majkrzak commented 3 years ago

I would like to share my point of view which is a bit different than those presented here. What if instead of focusing on improving wrappers we try to get rid of the problem they are solving. For example wrapQtAppsHook won't be needed if Qt library itself will know where to search for modules by default. I think symilar solution can be applied to other cases like the one with GDK_PIXBUF_MODULE_FILE.

samueldr commented 3 years ago

@majkrzak I personally would agree, environmental pollution as a means to fix those goes against hermetic package management.

See https://github.com/NixOS/nixpkgs/pull/44047 for a past attempt at this for Qt.

But note that maintainers are, it seems, generally not interested in patching upstream software to deal with this. (I personally think it's wrong, since it ends up that every leaf package has to know and deal with intricacies of the packaging of the dependency.)

FRidh commented 3 years ago

I would like to share my point of view which is a bit different than those presented here. What if instead of focusing on improving wrappers we try to get rid of the problem they are solving. For example wrapQtAppsHook won't be needed if Qt library itself will know where to search for modules by default. I think symilar solution can be applied to other cases like the one with GDK_PIXBUF_MODULE_FILE.

Clearly, that is preferable. Except it then becomes a language/framework/package specific task, and still something is needed to put that together.

timor commented 3 years ago

I would like to share my point of view which is a bit different than those presented here. What if instead of focusing on improving wrappers we try to get rid of the problem they are solving. For example wrapQtAppsHook won't be needed if Qt library itself will know where to search for modules by default. I think symilar solution can be applied to other cases like the one with GDK_PIXBUF_MODULE_FILE.

I was bitten by this recently (Emacs propagated an incompatible GDK_PIXBUF_MODULE_FILE into child shells). How would you get rid of the GDK_PIXBUF_MODULE_FILE environment variable?

samueldr commented 3 years ago

Another fresh issue for me: /proc/$PID/exe will point to the wrapped path!

This is causing issues with at the very least KWin, but it is likely other KDE or Plasma components assume the path will be the actual binary.

This is a situation where even tampering with argv0 won't help.

See #116549 for a fix for KWin. I wouldn't be surprised if there were other similar issues, in KDE frameworks, Plasma, and elsewhere.

majkrzak commented 3 years ago

How would you get rid of the GDK_PIXBUF_MODULE_FILE environment variable?

@timor most likely by injecting it to: https://gitlab.gnome.org/GNOME/gdk-pixbuf/-/blob/master/gdk-pixbuf/queryloaders.c#L307 and https://gitlab.gnome.org/GNOME/gdk-pixbuf/-/blob/master/gdk-pixbuf/gdk-pixbuf-io.c#L376 or by playing with GDK_PIXBUF_LIBDIR

tobiasBora commented 3 years ago

Small question: concerning problem 1, can we always know when the environment variable should be propagated to the child process or not? I guess sometimes we do want to propagate them to the child process (for example in a shell).

I'm also hijacking into this issue because of this bug, which adds a 6th issue with wrappers:

  1. exec -a "$0" myscript does not propagate $0 when myscript is a script (i.e. not binary). This applies to bash scripts, python scripts, and more.

This is a problem because some programs (carla and NSM in my case) relies on the fact that $0 is the name of the executable => when NSM tries to restart carla later (NSM is a session manager that can save states of programs), it tries to run .carla-wrapped instead of carla, which fails because .carla-wrapped is not available in the PATH. We can solve that issue for scripts by asking to the wrapper script to directly concatenate the environment-changing part and the script, but it means that we need one wrapper per language (python does not change environment variables like bash), and it does not solve anyway the points 1-5.

If we can get rid of wrappers, it would be really great (not sure how it is realistic). If it's not possible, I'm thinking (but I'm not an expert at all, so it may be complete non-sense), that there may be another solution: instead of setting the environment variables in a wrapper script, one may try to associate to any executable mysoftware a text file, say .mysoftware_ENV (or we can maybe be even simpler, by setting one such file per software, located maybe in /nix/store/zzz-mysoftware-1.0/ENV). This file could look like:

MYKEY=myvalue

Then, the idea would be to change slightly GLIBC in order to take care of that file: when a new file (say /nix/store/zzz-mysoftware/bin/mysoftware) is getting executed, the interpreter (say /nix/store/aaa-glibc-bbb/lib/ld-linux-x86_64.so.2) would pick the file /nix/store/zzz-mysoftware/bin/.mysoftware_ENV and/or /nix/store/zzz-mysoftware-1.0/ENV, load all the environment variables defined there, and finally run the rest of the program.

Advantages: By writting our custom-GLIBC correctly, I guess we can basically solve problems 1,2,3,4 and 6:

Drawbacks: Now, we have the problem that one need to modify and keep a fork of GLIBC. I'm not sure how realistic this is (maybe nixos already have its own GLIBC?), and if it is not too dangerous (in term of security: I guess our GLIBC would be less tested than the official one). It may also be hard to maintain, I'm not sure how stable GLIBC is. Moreover, since this part would become essential in Nix, one really need to make sure that enough people can maintain that fork in case the person that wrote the code is gone. It also have the problem that it may be a bit harder to debug since people outside nix may not be aware of our custom GLIBC.

On the other side, maintaining wrappers is also quite a nightmare, so maybe spending a bit more time in GLIBC, but less time in the wrappers could make sense. And if we are ready to maintain a different GLIBC, maybe we could even avoid the need to patch proprietary executables (except for changing the interpreter path), by directly asking to GLIBC to find the libraries in the good place depending on our ENV file?

Mitigate the drawbacks: I'm also thinking that instead of writing a fork of GLIBC, one may start by just writing a simple custom ld-linux interpreter: this interpreter could just setup the environment variable accordingly, and then call the original ld-linux interpreter. I would say it is enough to solve 3,4,5,6 (and maybe 2). But this may not be enough to solve problem 1 since execv will propagate all the environment to the child. But it could be good enough to start.

Do you think it can work? Is it realistic, or just a nightmare to maintain? (either the fork, or the custom ld-linux)

-- EDIT -- Actually, I've a doubt: if the file is statistically linked, is does not even have any interpreter I guess... In that specific case I don't see how we can avoid an additional wrapper script. The advantage here however is that we can just use exec -a and it should work (but it won't solve point 1). Or it may be possible to turn a static elf into a dynamic elf to avoid wrappers ^^'

tobiasBora commented 3 years ago

Actually, to wrap scripts (and solve problem 6, and 3 for scripts, we still need to consider scripts and binaries differently), I found a trick to avoid the need to patch glibc and is not too hard to implement I guess: instead of externalizing the script, I externalize the wrapper by defining a custom (script) interpreter whose role is to first configure the environment variable (the language of this interpreter can be arbitrary, I just use bash), and then call the original interpreter (it could be any interpreter: bash, python...). This solution is interesting because it does not depend on the script used, and it does not change the value of $0 for scripts. If it's simpler, we can either ask the wrapper to take the same environment definition for a given package, or create a per-executable environment variable as we are doing now.

my_script.py:

#!./my_python_wrapper
# The above line can be replaced with any file like /nix/store/xxx-mysoftware/wrappers/my_python_wrapper
import sys
import os

print("Here is my name {}, the variable MYENV is {}".format(sys.argv[0], os.environ["MYENV"]))

my_python_wrapper:

#!/usr/bin/env bash
# Setup environment variables...
export MYENV="WIN :-)"
# Call the original wrapper, and forward all arguments, /usr/bin/env will be replaced in practice with the actual /nix/store/xxx-python3/bin/python interpreter.
exec /usr/bin/env python3 "$@"

Demo:

$ ./my_script.py 
Here is my name ./my_script.py, the variable MYENV is WIN :-)

PS: note that you do need to put an absolute path for the wrapper if you want to be able to call the executable from multiples folders... So it means that it will be hard in practice to change the location of the executable. However, in practice I guess it is not very important since anyway many path are hardcoded in the /nix/store folder.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity. → More info

tobiasBora commented 2 years ago

Still interested.

ncfavier commented 2 years ago

See #124556 and #150079

tobiasBora commented 2 years ago

Thanks. But will it only fix a MacOs-specific issue, or will this script actually think it is not wrapped? Like what happens if a wrapped script writes Hello $0? Will it display the name of the original script or the wrapped version? As it, I don't think it fixes the issues with scripts in general, but I may be wrong. In any case, I created this issue https://github.com/NixOS/nixpkgs/issues/150841 that we can close if #124556 solves it.

tobiasBora commented 2 years ago

I checked and the recent PR does not solve the issue I mentioned.

nixos-discourse commented 1 year ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/terminal-emulator-leaks-environment-variables-to-shell/33673/11