conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License
8.27k stars 981 forks source link

[bug] subsystem.Popen() should be called with a clean env #16118

Closed paulharris closed 6 months ago

paulharris commented 6 months ago

Describe the bug

OS: Windows Compiler: msvc-19 2022 Conan version: 2.2.3

Conan profile:

Profile host:
[settings]
arch=x86_64
build_type=Debug
compiler=msvc
compiler.cppstd=20
compiler.runtime=dynamic
compiler.runtime_type=Debug
compiler.version=193
os=Windows
[conf]
tools.cmake.cmaketoolchain:generator=Ninja

Profile build:
[settings]
arch=x86_64
build_type=Release
compiler=msvc
compiler.cppstd=20
compiler.runtime=dynamic
compiler.runtime_type=Release
compiler.version=193
os=Windows
[conf]
tools.cmake.cmaketoolchain:generator=Ninja

There are many problems caused by unclean environments during the build step. I get a variety of different behaviours depending on if I use CMD, or MSVC-CMD, or my own msys (git-sdk + pacman), or my msys was started after loading the MSVC vcvars environment.

This is especially visible when running autotools builds, which uses the msys environment during the build.

Example: ICU would not build in some of those environments: https://github.com/conan-io/conan-center-index/issues/23679

Moving straight on from that, a nicer example is libvpx, which loudly trips over the !ExitCode environment variable (making this easier to reason about).

When running conan from within my msvc+msys shell, the build ends up running in an environment that has many layers:

Crazy, but true. This is what (part of) the environment looks like at this point:

COMMONPROGRAMFILES(X86)=C:\Program Files (x86)\Common Files
!C:=C:\Program Files (x86)\Microsoft Visual Studio\Installer
!ExitCode=00000000
!EXITCODE=00000001
PROGRAMFILES(X86)=C:\Program Files (x86)
SHELL=/usr/bin/bash

Finally, conan can execute autotools make, which executes and then fails with:

     3>C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(75
       5,5): error MSB6001: Invalid command line switch for "CL.exe". System.ArgumentException: Item has already been a
       dded. Key in dictionary: '!EXITCODE'  Key being added: '!ExitCode' [M:\cocache4\cache\b\libvp414895aa790c4\b\bui
       ld-debug\vpxrc.vcxproj]

Note that this is a really nice example, but plenty of other packages will change behaviour (sometimes non-fatally) with unclean environments. For example, libvpx WILL build with a slightly different environment, but it won't produce debug libraries as requested! So the package installs with errors, only headers and no libs.

NOTE: To debug this, I added a breakpoint() to libvpx's recipe, before autotools.make(), then hand-edited the conan/bat files in the cache, adding a bunch of printouts of the environment.

How to reproduce it

memsharded commented 6 months ago

Hi @paulharris

I am not sure this is a bug, this is how Conan and recipes in ConanCenter are designed to work:

If this is not what you want, there are some ways to inhibit some of this behavior:

paulharris commented 6 months ago

I think the old conan1 had to be run in a vcvars shell, or at least that was how I was running it. I'm happy to avoid running in a vcvars shell going forward.

It seems to work fine from within msys shell. To be clear, I run in there so I can use the other msys tools. I don't want to use my msys as part of the build chain.

So I want conan2 to create its own little bubble of a tightly controlled and repeatable environment each time it runs. I thought that was what conanbuild.bat was for...

Otherwise, is everyone expected to run a CI build farm or build in docker or something? I haven't seen anything in the docs about how you should ensure your PATH and environment is clean before you run conan.

How are other people doing this? How exactly do you spin up a clean environment that doesn't involve building on a VM or container? And if this is what we should be doing as standard, shouldn't it be in the tutorials?

valgur commented 6 months ago

This would be a good opt-in conan conf option to have, I think.

memsharded commented 6 months ago

I think the old conan1 had to be run in a vcvars shell, or at least that was how I was running it. I'm happy to avoid running in a vcvars shell going forward.

I am a Windows user, and I didn't have to run in a vcvars shell in Conan 1 either. Most modern Conan generators in Conan 1 have been generating/calling VCVars to create the conan_vcvars environment activation when the build system is not enough (CMake with VS generators doesn't need it, but it does when using Ninja). In any case Conan can provide the activator when needed.

So I want conan2 to create its own little bubble of a tightly controlled and repeatable environment each time it runs. I thought that was what conanbuild.bat was for...

conanbuild.bat injects the environment from tool_requires for recipes at build time. But it is not a bubble, it doesn't provide full isolation of the system. It adds/appends/prepends/applies/unset the tool_requires definitions, and also those coming from profiles [buildenv].

It is the recipes or the profiles that can alter the environment the way they want, they can even unset defined environment variables. But it is not the responsibility of Conan to provide isolation from the system, because actually a lot of users rely on the system environment, and use it together with Conan. An extremely common situation is Conan adding tool_requires("cmake/..."), but having the compiler defined in the system, like with CC/CXX and other environment variables. If Conan would remove all environment this wouldn't work, and it will force users to always define everything in Conan profiles, and that is not the intention or the best default either.

How are other people doing this? How exactly do you spin up a clean environment that doesn't involve building on a VM or container? And if this is what we should be doing as standard, shouldn't it be in the tutorials?

No, it shouldn't be necessarily the default or standard Conan behavior. C++ devs have been working without isolated "virtualenvs" since always, and they are used to often rely, at least partially on the environment.

memsharded commented 6 months ago

This would be a good opt-in conan conf option to have, I think.

I am not sure this works as a "feature" to opt-in, and it sounds to me like the typical rabbit hole, where users need more and more configurability because the implemented behavior is not good as there are too many different use cases, like: "Hey, why are you unsetting the variable "Program Files", I need it for my recipes to work, can you add a new conf that allows to exclude the "Program Files" from the unsetting?"

I think having a tool-require that defines a buildenv_info that unsets everything you want to unset would be just a very few lines and if you tool_requires it before the other tool-requires, I think it can work.

paulharris commented 6 months ago

Firstly, thanks for engaging in this debate. I think it is important, at least if all it does is educate me.

I didn't run this through a chatgpt politeness filter, I'm just trying to lay out the arguments for and against so please just read it as the Arguments for the Opposed. I look forward to your rebuttal.


The tool-require approach feels like a hack for only me, when I'm sure other people have this same problem. And how can I guarantee it will run before anything else? I literally want it to run before any of the conan buildenv bat files run.


An extremely common situation is Conan adding tool_requires("cmake/..."), but having the compiler defined in the system, like with CC/CXX and other environment variables.

For this specific example, cmake should be defined in tool_requires (as you say), and CC/CXX should be defined by the profile... this is [conf] tools.build:compiler_executables

Any other environ variables should be in the profile, that is the job of the profile. If there are environ vars required for just one recipe, there is a place in the recipe for that.

And when I need the environ to be different for a release build, then that is why there is more than one profile.


I think the whole point of conan builds is to have repeatable builds. There was even a blog post about deterministic builds. https://blog.conan.io/2019/09/02/Deterministic-builds-with-C-C++.html At no point does it raise the problem of the shell environment that conan is running in. I had always assumed that when conan runs, it sets up the whole environment.

IMO it would be a lot more reliable and predictable for conan to start with a blank environment, and add things IN explicitly. I say this as a recipe author too. I don't want everything to work on my computer, and then build differently on the CI or someone else's computer. This is exactly what was happening to me with libvpx. Proof-of-conan built fine, but it wouldn't build on my computer, due to extra stuff in the environment.

I don't even use libvpx directly, but now I've had to spend time hacking on that recipe understanding what is going on, when it was something in the environment unrelated to conan or the recipe.

I also want repeatability. I don't want my builds to change just because this week I installed some bit of software and it added something into the environment. That way leads to insanity. Often the change in output won't become apparent until later (eg small change in the library behaviour), so it is hard to tell what (and when) it changed.


The PR of mine only has whitelisted items because I found nothing would work at all in a Popen call with a completely blank env. So I added what looked like the most basic environment elements that you would get in a blank machine.

And it worked brilliantly. If something extra is needed, I can add it in the profile or recipe.

I would go one step further and remove the added path to the python executable (my second commit in the PR).

# FROM MY PR #
        # Also get the path to conan's python, as some packages assume access to python
        # eg dav1d
        # This will also give access to ninja.exe and whatever else is installed in the devenv
        python_path = os.path.dirname(sys.executable)
        newenv['PATH'] = f'C:\\WINDOWS\\system32;C:\\WINDOWS;{python_path}'

Note that "giving access to ninja" is a negative, not a positive. The Ninja exe should be specified by conan, not whatever is in my PATH today.

The dav1d recipe requires python, and it got it out of the environment's PATH. I see that as a recipe bug. If a recipe requires python, then, WHICH python does it require? That should be a build-requirement for the recipe, right? Like cmake.


I didn't make the same changes for Linux (yet) because I'm currently in Windows mode. But I feel we should also clear out the environment for Linux and Mac too.

paulharris commented 6 months ago

I should mention that I don't think the PR is finished. eg running Git.clone() fails without the shell's env, but I would like to see a more controlled environment.

paulharris commented 6 months ago

In one of my recipes, I execute git and some other commands in the source() method. It uses self.run() and Git.clone(), which (according to the docs) will execute conanbuild.sh as well as any tool-requires scripts required.

However, source() is executed before there is a generator folder. That means we can't define the environment for source() to run commands, and we can't use tool-requires within source().

eg, I was not able to run envvars.save_script() within source().

paulharris commented 6 months ago

The with env.vars(self).apply(): pattern also won't work with this PR. We would have to clear the environment at the start, before running conan. I don't think this can be done via tool_requires.

paulharris commented 6 months ago

I've adjusted my approach and moved the environment-cleaner to the start of the conan process, otherwise EnvVars.apply() doesn't work - as @memsharded would've guessed.

To make it work with EnvVars.apply would likely cause too much change in the conan code, so now I'm looking at clearing the env while calling conan.

paulharris commented 6 months ago

I'm not sure this would deal with all the issues, ie shouldn't conanbuild.bat be clearing stuff out so that cmake-executions and ninja-builds will also get a clean environment?

memsharded commented 6 months ago

Thanks very much for your feedback.

I am closing https://github.com/conan-io/conan/pull/16119, based on the comments in https://github.com/conan-io/conan/pull/16119#pullrequestreview-2031291716. This is not considered a bug, but this is totally the intended and expected behavior of Conan. If users want to enforce a "curated" set of environment variables, this is a responsibility that is outside of Conan scope, and specific to every user, so this is not something that Conan can implement by default.

paulharris commented 6 months ago

I think the biggest problem was !ExitCode, which is later mangled into !EXITCODE by python's environment dict. When a second !ExitCode is added due to a second msys layer, MSVC's tools freak out.

I understand why it can't be merged, I'll see if I can find a workaround along the lines of what you've suggested above.

memsharded commented 6 months ago

I think the biggest problem was !ExitCode, which is later mangled into !EXITCODE by python's environment dict. When a second !ExitCode is added due to a second msys layer, MSVC's tools freak out.

There was a recent fix for Conan 2.3, that avoids some internal uppercasing that Conan was doing in Windows for all env-vars, check https://github.com/conan-io/conan/pull/16122, or try running from the develop2 branch to see if this improves.

paulharris commented 6 months ago

Closed in favour of #16268