conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

[feature] Ability to propagate a profile from a recipe for packaged toolchains and bootstrapping #13533

Open samuel-emrys opened 1 year ago

samuel-emrys commented 1 year ago

What is your suggestion?

Hi Team,

There are a few things that I would like to achieve with conan:

I've experimented with this a little with the existing gcc recipe, a glibc recipe (that I'm yet to upstream), with the lapack recipe that's currently in draft format (https://github.com/conan-io/conan-center-index/pull/15556), and a PR that's up to model the libraries that gcc exposes (https://github.com/conan-io/conan-center-index/pull/15128).

From this experimentation, I've identified a few requirements that we would have in this situation:

  1. Compiler libraries need to be exposed as a runtime requirement when the compiler is used in the build context. There is a need to be able to link against the libraries distributed with compiler packages, and this requirement needs to be propagated to consumers. For example, using gfortran from the gcc recipe requires any consumer of a library built with gfortran to link against libgfortran.{a,so}. This need can be mitigated when a shared lib is being created, since it can statically link libgfortran.a and consumers don't need to be aware of the libgfortran requirement, but this won't always be the case. This can be observed in the lapack PR above.
  2. Compiler packages need to be able to propagate a profile matching their own characteristics to libraries being built with them. If I specify that I want to use gcc/11.3.0 to build a package, but a pre-built variant of this doesn't exist, it should look to bootstrap itself by looking to previous versions of gcc. Let's say it finds a prebuilt package for gcc/10.2.0. That compiler and version should then be reflected in the metadata of the gcc/11.3.0 package, rather than what happens at the moment, which is that the metadata reflects whatever the user-defined profile is (which may even differ from gcc/11.3.0!). Likewise, the package I'm building with gcc/11.3.0 should have gcc 11.3.0 in its package metadata, not whatever the user-defined profile is. Similarly, I would expect a package compiled with gfortran to have settings.info.compiler=gfortran rather than settings.info.compiler=gcc in its package metadata. This is a simplistic representation of some of the things that might change. It's also useful to know the libc that the gcc package has been built against and the requirement that it will impart on libraries that it builds. This could be a step towards more meaningfully controlling library ABI. This would also be useful for building cross-compilers.
  3. Profiles need a placeholder template in which a recipe can be specified as the compiler, rather than relying on what's installed on the system. To facilitate (2) above, allowing the user to specify in their profile that they want metadata to be derived from a compiler package would be useful to avoid a multi-step environment setup. To illustrate, it would be good to be able to set a profile in a way similar to:
[settings]
arch=x86_64
os=Linux
compiler={{ package:gcc/11.3.0 }}
build_type=Release
# requirement on gcc/11.3.0 package to populate: 
# settings.compiler
# settings.compiler.version
# settings.compiler.libc
# settings.compiler.libcxx
# conf.tools.build:compiler_executables

In the above case, if conf.tools.build:compiler_executables=gfortran, then compiler={{ package:gcc/11.3.0 }} would evaluate to compiler=gfortran, compiler.version=11.3.0. I can see that it might be difficult to infer which executable is being used here, so an alternative might be to ensure that each compiler has its own recipe/package.
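To make the placeholder idea a bit more concrete, here is a rough sketch of how a profile loader might expand it. None of this is existing Conan behaviour: the {{ package:... }} syntax is only the proposal above, and COMPILER_METADATA, expand_compiler_placeholder and the metadata values are hypothetical names and data made up for illustration.

```python
import re

# Hypothetical metadata a compiler package could publish about itself.
# The values are invented for this sketch.
COMPILER_METADATA = {
    "gcc/11.3.0": {
        "compiler": "gcc",
        "compiler.version": "11.3.0",
        "compiler.libcxx": "libstdc++11",
    },
}

# Matches a line such as: compiler={{ package:gcc/11.3.0 }}
PLACEHOLDER = re.compile(r"^compiler=\{\{\s*package:(\S+)\s*\}\}$")


def expand_compiler_placeholder(profile_text, metadata=COMPILER_METADATA):
    """Replace `compiler={{ package:ref }}` with the settings published by ref."""
    out = []
    for line in profile_text.splitlines():
        match = PLACEHOLDER.match(line.strip())
        if match is None:
            out.append(line)  # ordinary profile line, keep as-is
            continue
        ref = match.group(1)
        if ref not in metadata:
            raise KeyError(f"no compiler metadata for {ref}")
        out.extend(f"{key}={value}" for key, value in metadata[ref].items())
    return "\n".join(out)


profile = """[settings]
arch=x86_64
os=Linux
compiler={{ package:gcc/11.3.0 }}
build_type=Release"""

print(expand_compiler_placeholder(profile))
```

The point of the sketch is only that the expansion has to happen before any recipe is evaluated, since recipes read settings.compiler directly.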

I think what's being highlighted here is a need for a new package type - it would not be desirable to have any package be able to override the user specified profile or act this way. Perhaps the definition of package_type could be extended to include a use case such as this rather than just traits propagation. I.e., package_type=compiler (noting that there is still a runtime requirement that compiler packages are likely to impart)

There's a lot of text here and I'm sure I've been unclear about some things. Please ask any questions that would help clarify the use case and proposed interface.

Relates to:


memsharded commented 1 year ago

Hi @samuel-emrys

Thanks for your detailed explanation and suggestions.

My first thought about this proposal is that it is not a new package-type. It is probably not even a regular package with a qualifier. It seems this would be a completely new concept, flow and infrastructure.

I think so because if it were a normal package type, it could be required by a normal library package mypkg with self.requires("specialtoolchain/version"). But at that point mypkg is already fully evaluated against both the host and build profiles, because it needs to be fully evaluated, as it can easily contain conditionals like:

def requirements(self):
    if self.settings.compiler == "gcc":
        self.requires("...")

So it would be a chicken and egg problem.

This new package concept seems to make sense only for profiles, and it belongs more to the "setup" problem than to the "dependency" resolution problem. We don't want for this kind of package concept every package in the graph define its own tool_requires, but it is something quite orthogonal to it. tool_requires() in recipes make sense to require something like meson that only a specific package might use, or a specific version of cmake that is not installed in the system.

Please let me know if this makes sense.

This sounds like a very major feature. I am not saying that we cannot approach and consider it, actually the opposite, I have always been interested in this problem. But also, it could mean that it will take time. Stabilization of 2.0, migrating and helping the community to upgrade, and some other very important roadmap things are pending.

samuel-emrys commented 1 year ago

We don't want for this kind of package concept every package in the graph define its own tool_requires, but it is something quite orthogonal to it.

Yes, I agree with this. I've never particularly liked the pattern of having tool_requires("gcc/11.3.0") in regular packages because this ignores the user specified compiler and locks a consumer in to being only built with gcc. It would be much better if compilers were not specified as tool_requires, and were instead specified via the profile.

Having said that, I would expect a compiler recipe to have a tool requires for itself to enable a bootstrap, something akin to:


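from conan import ConanFile


class GCC(ConanFile):
    name = "gcc"
    version = "11.3.0"

    def build_requirements(self):
        # Bootstrap: build this gcc with an already-available gcc >= 5.0.0
        self.tool_requires("gcc/[>=5.0.0]")

    def requirements(self):
        # x.y.z are placeholders, not real versions
        self.requires("glibc/x.y.z")
        self.requires("linux-headers/x.y.z")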
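As a rough illustration of the bootstrap lookup this implies, the sketch below picks the newest prebuilt compiler that is older than the one being built. It is not how Conan resolves version ranges: pick_bootstrap_compiler, parse_version and the list of available versions are hypothetical, and the sketch excludes the target version itself on the assumption that a compiler should not be its own bootstrap.

```python
def parse_version(text):
    """Turn '11.3.0' into (11, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_bootstrap_compiler(target, prebuilt, minimum="5.0.0"):
    """Return the newest prebuilt version in [minimum, target), or None."""
    candidates = [
        v for v in prebuilt
        if parse_version(minimum) <= parse_version(v) < parse_version(target)
    ]
    return max(candidates, key=parse_version, default=None)


# Hypothetical list of gcc versions that already have prebuilt binaries.
available = ["4.9.4", "9.5.0", "10.2.0", "12.2.0"]
print(pick_bootstrap_compiler("11.3.0", available))  # prints 10.2.0
```

In this scenario the metadata of the resulting gcc/11.3.0 package would record gcc 10.2.0 as the compiler that built it, which is the behaviour asked for in (2).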

tool_requires() in recipes make sense to require something like meson that only a specific package might use, or a specific version of cmake that is not installed in the system.

Yep - 100% agree with your analysis here.

Having said that, it's not easy to completely extricate the compiler from the runtime requirements of the application/library it's building. As mentioned, compiling a library with gfortran will introduce a requirement on libgfortran, which needs to be propagated to consumers of the built-library, so there is some requirement for the libraries that a compiler is using to link to be captured in the dependency tree. This also applies to glibc and libstdc++.

So, the compiler package needs a way of injecting libraries that need to be linked against in the downstream requires, and also providing them. What this currently looks like:


from conan import ConanFile


class LapackConan(ConanFile):
    name = "lapack"

    def requirements(self):
        # gcc as a regular requirement, so consumers can link libgfortran
        self.requires("gcc/12.2.0")

    def build_requirements(self):
        # gcc as a tool requirement, to compile with gfortran
        self.tool_requires("gcc/12.2.0")

    def package_info(self):
        ...
        self.cpp_info.components["blas"].set_property("cmake_target_name", "BLAS::BLAS")
        self.cpp_info.components["blas"].libs = [self.get_lib_name("blas")]
        self.cpp_info.components["blas"].requires = ["gcc::gfortran"]

        self.cpp_info.components["lapack"].set_property("cmake_target_name", "LAPACK::LAPACK")
        self.cpp_info.components["lapack"].libs = [self.get_lib_name("lapack")]
        self.cpp_info.components["lapack"].requires = ["blas", "gcc::gfortran"]

Obviously this could be improved to some degree with specification in the profile, but the same requirement to model the library dependencies exists either way.

But also, it could mean that it will take time

Understood, I just wanted to start a conversation about this use case :)

sykhro commented 1 year ago

Does this mean that, at the moment, it's impossible to use application-type packages in tool_requires that implement a compiler?

If I have

[tool_requires]
!some_compiler*: some_compiler/version

I get this

ERROR: There is a cycle/loop in the graph:
    Initial ancestor: some_compiler/version
    Require: some_compiler/version
    Dependency: some_compiler/version

samuel-emrys commented 1 year ago

It might be possible in constrained situations to coax something functional for compilers, but if it were possible at the moment you'd probably have to suffer incorrect metadata.

The way to do it currently would be to

conan install --requires gcc/11.2.0 --build=missing

And then populate the path to the compiler in your profile

[settings]
arch=x86_64
build_type=Release
compiler=gcc
compiler.cppstd=20
compiler.libcxx=libstdc++11
compiler.version=11
os=Linux
[conf]
tools.build:compiler_executables={'c': '/home/user/.conan2/p/gcc9f370ca971ddf181/p/bin/gcc', 'cpp': '/home/user/.conan2/p/gcc9f370ca971ddf181/p/bin/g++'}

Then use this profile to build your packages
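Since the package folder contains a hash that differs between machines, it may be easier to generate that profile than to edit it by hand. A minimal sketch, assuming you already know the package folder; render_profile and the example path are hypothetical, and the settings simply mirror the profile above.

```python
from pathlib import PurePosixPath


def render_profile(package_folder, version="11"):
    """Render a profile that points at the compilers inside package_folder."""
    bindir = PurePosixPath(package_folder) / "bin"
    executables = {"c": str(bindir / "gcc"), "cpp": str(bindir / "g++")}
    lines = [
        "[settings]",
        "arch=x86_64",
        "build_type=Release",
        "compiler=gcc",
        "compiler.cppstd=20",
        "compiler.libcxx=libstdc++11",
        f"compiler.version={version}",
        "os=Linux",
        "[conf]",
        # repr() of the dict gives the same literal form used in the profile above
        f"tools.build:compiler_executables={executables!r}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical folder; substitute the one reported on your machine.
print(render_profile("/path/to/gcc/package"))
```

This only automates the manual step. The package metadata still comes from the hand-written settings rather than from the gcc package, which is the limitation this issue is about.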

memsharded commented 1 year ago

Also @sykhro it is important to be in 2.0, or if you are in 1.X, then use the 2 profiles -pr:h host-profile -pr:b build-profile. Then, it is important to put the profile [tool_requires] mostly in the "host" profile to avoid issues, and a tool_requires for a specific package, such as one that bootstraps itself with a previous version, can be added in the recipe itself in the build_requirements() method.

kammce commented 1 year ago

I've found great success with doing this:

[settings]
build_type=MinSizeRel
compiler=gcc
compiler.cppstd=20
compiler.libcxx=libstdc++
compiler.version=12.2
os=baremetal

[tool_requires]
arm-gnu-toolchain/12.2.1

[conf]
{# conf stuff goes here #}

My arm-gnu-toolchain tool package adds its compiler to the compiler list, removing the need for the user to do this themselves. It also propagates to all downstream dependencies of the application, recompiling them as needed.

samuel-emrys commented 8 months ago

@memsharded we're starting to see some maturity and consolidation of features in conan 2.x - do you see scope for closer consideration of this to be given some effort soon?

memsharded commented 8 months ago

Hi @samuel-emrys

I am afraid this is not yet under consideration. Our backlog is already full of tons of other higher priorities, from the new CI for ConanCenter to things like workspaces, which were removed in 2.0 but for which demand is still very high.

samuel-emrys commented 8 months ago

Is there anything I can do to mature this idea from an implementation perspective while your team is out of resources? If I were to come up with a proof of concept for this, are you able to provide suggestions on where this would best be injected into the codebase for consideration?

boldbyteboss commented 1 month ago

We are just beginning a journey to migrate a very large family of products to a new build environment based around Conan, and we're facing questions about managing our toolchain similar to those @samuel-emrys described in the original suggestion here.

We build products using a "customized" version of GCC. Specifically, we generally need a newer version of GCC than what is included in many Linux distributions as the default, so we configure and build our own. Being newer than the underlying system defaults means that our binaries need to link against the GCC runtime libraries (e.g. libgcc_s and libstdc++) that are in our custom GCC build, and these libraries also need to be present in order to run our binaries.

We've been studying the Conan examples for packaging a cross-compiler toolchain (e.g. https://docs.conan.io/2/examples/cross_build/toolchain_packages.html ) which seems to address many of the issues of making a different compiler available. But the example does not help with the runtime issues similar to (1) in @samuel-emrys 's original description. Treating our compiler as a cross-compiler and using tool_requires with the host profile results in our binaries being correctly compiled, but unit tests and test packages do not work because e.g. the correct libstdc++ does not automatically wind up on LD_LIBRARY_PATH.
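For what it's worth, the effect we are missing can be sketched outside of Conan as follows. This is only an illustration of the desired runtime environment, not a Conan feature: runtime_env and the /opt/custom-gcc/lib64 path are hypothetical.

```python
import os


def runtime_env(compiler_libdirs, base_env=None):
    """Return a copy of the environment with compiler_libdirs first on LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    # Drop empty entries so we never emit a leading or trailing ':'
    existing = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    env["LD_LIBRARY_PATH"] = ":".join(list(compiler_libdirs) + existing)
    return env


# Hypothetical location of the custom GCC's runtime libraries.
env = runtime_env(["/opt/custom-gcc/lib64"], base_env={"LD_LIBRARY_PATH": "/usr/lib"})
print(env["LD_LIBRARY_PATH"])  # prints /opt/custom-gcc/lib64:/usr/lib
```

A test runner could pass such an env to subprocess.run(..., env=env). Inside a recipe, the corresponding declaration would presumably be made through self.runenv_info.prepend_path("LD_LIBRARY_PATH", ...) in the toolchain's package_info(); whether that reaches consumers when the toolchain is only a tool_requires is exactly what we have not been able to get working.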

The analysis provided by @memsharded here https://github.com/conan-io/conan/issues/13533#issuecomment-1483796679 largely squares with how we would like to manage our toolchains. We want to specify our custom compiler once in a profile and then have that propagate across all of our product code and yield the required runtime dependencies. In theory this would then let us migrate to a newer version of our compiler by updating just the host profile and sitting back as everything cascades through.

We're sufficiently motivated to get to a world like this that we may be willing to help contribute with the right guidance.