conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

Are reproducible builds possible using conaninfo.txt? #2750

Closed progician closed 1 year ago

progician commented 6 years ago

The title should say it all.

Basically, what I can't figure out is when the conaninfo.txt is really useful in the build process. When there are dependencies referenced by version ranges, the [full_requires] section shows which version was picked, along with the hash.

Version ranges work like this: any time I install the dependencies of a package, Conan takes the latest version within the specified range, updating to a newer package if one exists. When you have a botched build containing a bug, you might want to roll back and see what broke it. The issue is that, with version ranges, I can't reliably reproduce the same build on my machine, because resolution has already moved on to the latest version.
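A minimal sketch of the drift described above, assuming versions are plain dotted numbers and a range is a lower bound plus an exclusive upper bound. The `resolve` helper and the version lists are hypothetical illustrations, not Conan's actual resolver.

```python
# Hypothetical sketch: not Conan's resolver, just "pick the newest version in range".

def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(available, lower, upper):
    """Return the highest available version with lower <= v < upper, or None."""
    lo, hi = parse_version(lower), parse_version(upper)
    candidates = [v for v in available if lo <= parse_version(v) < hi]
    return max(candidates, key=parse_version) if candidates else None


# First install: the remote holds these versions of a dependency.
remote = ["1.0.0", "1.1.0", "2.0.0"]
first_install = resolve(remote, "1.0.0", "2.0.0")

# Later, somebody uploads 1.2.0; the same request now resolves differently.
remote.append("1.2.0")
second_install = resolve(remote, "1.0.0", "2.0.0")

print(first_install, second_install)
```

Nothing in the request changed between the two calls, yet the resolved version did, which is why the earlier build cannot be recreated from the range alone.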

Of course I can lock down the versions of the different packages one by one, according to the conaninfo.txt's [full_requires] section, but this is a tedious and not very reliable process. Conan already has all the required information in a machine-readable format, so I wonder if there's a way to use the conaninfo.txt to lock down the version numbers.
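As a rough illustration of how that information could be read back, the sketch below extracts the entries of a [full_requires] section. The sample text, package references and hashes are made up, and the layout (indented `reference:package_id` lines under bracketed section headers) is an assumption about the file format rather than a specification.

```python
# Hypothetical conaninfo.txt content; the references and hashes are invented.
SAMPLE_CONANINFO = """\
[settings]
    os=Linux

[full_requires]
    OpenSSL/1.0.2o@conan/stable:0123abc
    Poco/1.9.0@pocoproject/stable:4567def
    zlib/1.2.11@conan/stable:89abcde
"""


def read_full_requires(text):
    """Map package name -> (pinned reference, package id) from [full_requires]."""
    pinned = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new section header: only [full_requires] is of interest.
            in_section = (line == "[full_requires]")
            continue
        if not in_section or not line:
            continue
        # Assumed entry shape: name/version@user/channel:package_id
        reference, _, package_id = line.partition(":")
        name = reference.split("/", 1)[0]
        pinned[name] = (reference, package_id)
    return pinned


pinned = read_full_requires(SAMPLE_CONANINFO)
for reference, _ in pinned.values():
    print(reference)
```

The printed references are exact versions, so they could be pasted into a [requires] section in place of the ranges. That is the manual pinning described above, just automated.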

Basically, I'm asking if the conaninfo.txt could be used in a way similar to Cargo.lock.

memsharded commented 6 years ago

Yes, the title is perfect.

Partially related to:

This is a feature that we have been considering for some time, but we still haven't found a correct way to do it without breaking existing behavior. It is high in our priorities. We call it: "the conaninfo is the profile".

Thanks for the feedback and opening the issue (I am a bit surprised we didn't open it yet). This will help to move it forward.

RobvH commented 6 years ago

This was one of the most important features in Composer (https://getcomposer.org) and one of the two most important features in Yarn (https://yarnpkg.com/) that was quickly brought into npm, because the inability to lock package versions meant builds for QA and Prod could have different dependency versions and thus different bugs.

In the case of Composer, dependencies are defined in a composer.json, and composer update generates a composer.lock file (textual, ultimately containing the hashes of the used dependency branches from GitHub). Running composer install will not update the composer.lock file and will install the exact dependency versions recorded in the .lock file. This makes reproducible builds very easy.
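The update/install split can be sketched as follows. This is a deliberately simplified model: the `update` and `install` functions, the "same major version" constraint and the package names are hypothetical, and Composer's real constraint language and lock format are much richer.

```python
import json


def version_key(text):
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def update(wanted_major, registry):
    """Like an 'update' step: resolve afresh and return new lock-file text."""
    lock = {}
    for name, major in wanted_major.items():
        matching = [v for v in registry[name] if version_key(v)[0] == major]
        lock[name] = max(matching, key=version_key)
    return json.dumps(lock, indent=2, sort_keys=True)


def install(lock_text):
    """Like an 'install' step: trust the lock file, never re-resolve."""
    return json.loads(lock_text)


# Hypothetical registry and constraints.
registry = {"http-client": ["1.0.0", "1.4.2"], "logger": ["2.0.1"]}
wanted = {"http-client": 1, "logger": 2}

lock_text = update(wanted, registry)      # run once by a developer, then committed
registry["http-client"].append("1.5.0")   # a newer release appears afterwards

qa_build = install(lock_text)
prod_build = install(lock_text)
print(qa_build == prod_build)
```

Because `install` never consults the registry, QA and Prod get identical versions until somebody deliberately runs `update` again.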

memsharded commented 6 years ago

@RobvH I totally agree about the importance of this feature. Enabling it is one of the reasons we are making huge changes to the internals of Conan. The thing is that C and C++ are a bit more complex than PHP and npm, so to achieve reproducibility it is not enough to have a lock file with the versions pinned; a lot of information about other inputs (like options: shared, fPIC, enabled features, etc.) is also necessary.

It will not be quick, because it will require a few iterations, but we are on it. Thanks for the feedback!

RobvH commented 6 years ago

@memsharded thank you for such a thoughtful reply. I appreciate the challenges and will do my best to follow along and contribute where I can.

It would be fantastic when starting a new C++ project if it were as easy as conan new someproject Framwork/version@repo && cd someproject && make build. I would expect that to ask me which generator [make, cmake, etc], which test framework [boost, google-test, catch], create the project folder and subfolders, and initialize the base boilerplate.

To add deps, I'd love to be able to type conan require --dev catch2/2.2.2@bincrafters/stable or conan require Poco/1.9.0@pocoproject/stable -- having those look up the package, do the install, and add the correct text under [requires].

memsharded commented 6 years ago

> It would be fantastic when starting a new C++ project if it were as easy as conan new someproject Framwork/version@repo && cd someproject && make build. I would expect that to ask me which generator [make, cmake, etc], which test framework [boost, google-test, catch], create the project folder and subfolders, and initialize the base boilerplate.
>
> To add deps, I'd love to be able to type conan require --dev catch2/2.2.2@bincrafters/stable or conan require Poco/1.9.0@pocoproject/stable -- having those look up the package, do the install, and add the correct text under [requires].

Thanks very much for the feedback. The truth is that we have already evaluated this idea. It turns out that knowing the conanfile (.txt, or .py if creating packages) is quite necessary, so adding another layer through a new command-line syntax adds more complexity than value. In short, it seems most users find it simpler to edit the file than to learn the commands (note that you might need add/delete/update/insert commands too, for each of the sections: requires/build_requires/options/generators...).

Also, the command conan new is already used to generate some example conanfiles (together with some test_package testing code (-t), or with some example project code (-s)). The command conan new Hello/0.1 -s -t creates a fully working package and test_package, which can be created with conan create or modified locally with conan install + conan build.

andoks commented 6 years ago

Regarding supporting reproducible builds using a lock-file or similar, Golang recently (in v1.11) got (experimental) official support for versioned dependency management, maybe some inspiration can be found in their documentation about modules

memsharded commented 6 years ago

Thanks @andoks for the pointer. Still, the C++ case looks more complex, because of the transitivity of configuration and the way that nodes in the graph can change their configuration (like being shared/static). It is not enough to lock versions and revisions; some information about each node's configuration should also be stored, and thus storing the graph itself becomes necessary. I am working on a proof of concept of this feature at the moment :)

jrickman commented 5 years ago

Hi @memsharded any follow-up on the proof of concept? Very interested in this idea and would be happy to help out.

memsharded commented 5 years ago

Sure, current ongoing efforts are here: https://github.com/memsharded/conan/tree/feature/graph_lock. Some basics seem to start to work, but now I am struggling with how to "lock" the build_requires, because the way they are resolved (recursively) makes it a bit complicated. BTW, I am trying these days to focus on this work and move it forward.

jrickman commented 5 years ago

I now understand what you meant by complexity. Certain settings/options may affect the actual nodes present in the lock graph.

Because package_id is included in the graph, I'm guessing a user will need to maintain a lock file for every profile they build with? I think this will be a big turn-off for some users.

What if the current settings/options are used to make the dependency graph, but the lock file is not rendered with the specific package_ids? That way, the same lock file can be used for multiple configurations as long as those configurations do not alter the dependency tree itself.

memsharded commented 5 years ago

> Certain settings/options may affect the actual nodes present in the lock graph.

Exactly, the dependency graph can be (and many times it is) different for each different configuration.

> Because package_id is included in the graph, I'm guessing a user will need to maintain a lock file for every profile they build with? I think this will be a big turn-off for some users.
>
> What if the current settings/options are used to make the dependency graph, but the lock file is not rendered with the specific package_ids?

I think this is not a viable approach. The stored binary IDs are not really being used right now, but a lock file, by definition, should render to the same binary IDs. And not only that, but to the same recipe revision and the same binary revision of the given binary ID. The binary IDs will be used to verify that the resolved graph is indeed correct.

> That way, the same lock file can be used for multiple configurations as long as those configurations do not alter the dependency tree itself.

That is the key. It happens very often that the settings and/or options actually alter the dependency graph. That use case has to be a first-class citizen, and thus the default; otherwise it will be very confusing.
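A small sketch of why one lock file cannot always serve several configurations: here a hypothetical recipe adds requirements depending on the OS setting and on an option. The `requirements` function, the `with_ssl` option and all package references are invented for illustration and do not come from Conan or from any real recipe.

```python
def requirements(settings, options):
    """Hypothetical recipe logic: declare requirements based on configuration."""
    reqs = ["foo/1.2.0@cmake/stable", "bar/1.3.1@cmake/stable"]
    if settings["os"] == "Windows":
        # Only pulled in when building for Windows.
        reqs.append("win32_helpers/1.1.0@cmake/stable")
    if options.get("with_ssl"):
        # Only pulled in when the (hypothetical) with_ssl option is enabled.
        reqs.append("openssl/1.1.1@cmake/stable")
    return reqs


linux = requirements({"os": "Linux"}, {"with_ssl": True})
windows = requirements({"os": "Windows"}, {"with_ssl": False})

# Nodes that exist in one configuration's graph but not in the other's.
only_linux = sorted(set(linux) - set(windows))
only_windows = sorted(set(windows) - set(linux))
print(only_linux, only_windows)
```

Each configuration yields a different set of nodes, so a lock recorded for one of them would either miss a node or pin one that does not exist in the other.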

The approach I am considering is to actually capture the profile (the applied one, which can be a merge of a profile and command-line settings) in the lock file, which avoids inconsistencies. In that way, all the flows just replace the input profiles with lock files, which can be (and by default are) named after the profiles, in all commands: use conan install --lock-file=mygcc6_release.lock instead of conan install --profile=mygcc_release. I find there is a nice symmetry there that would help users understand what is happening.
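One way to picture this proposal is a lock that embeds the applied profile next to the resolved nodes and takes its file name from the profile. The JSON layout, the `build_lock` helper, the node fields and the package ID below are assumptions made for illustration; they are not the format of the proof of concept.

```python
import json


def lock_name_for(profile_name):
    """Name the lock file after the profile it was generated from."""
    return profile_name + ".lock"


def build_lock(profile_name, profile_settings, cli_settings, nodes):
    """Return (file name, lock text) with the applied profile embedded."""
    applied = dict(profile_settings)
    applied.update(cli_settings)  # command-line settings override the profile
    body = {"profile": applied, "nodes": nodes}
    return lock_name_for(profile_name), json.dumps(body, indent=2, sort_keys=True)


# Hypothetical inputs: a profile, one command-line override, one resolved node.
name, text = build_lock(
    "mygcc_release",
    {"compiler": "gcc", "compiler.version": "6", "build_type": "Debug"},
    {"build_type": "Release"},
    [{"ref": "foo/1.2.0@cmake/stable", "package_id": "0123abc", "options": {"shared": False}}],
)
print(name)
print(text)
```

Since the lock carries the merged settings itself, a later install that reads it needs no separate profile argument, which is the symmetry described above.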

siu commented 5 years ago

Hi, we have been using Conan for the last few years with great success. Thanks for developing and maintaining it!

The most important issue we have at the moment is related to this topic. We want to have reproducible builds at all times and we have been using conan with specific dependency versions until now. In addition to the library version we add a suffix with the recipe version to the conan recipes. For example the boost version looks like this: 1.67.0-3, where 3 is the recipe version. This allows us to version the recipe itself, which we bump if the code of the recipe has changed or one of its dependencies has changed.

On the other hand this becomes very hard to manage because every time there is a version bump in one of the lower level libraries (in the library or in the recipe) we have to bump the recipe version of all dependencies that use it. This propagates up until the change reaches the application.
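The propagation described above amounts to walking the dependency graph upwards from the changed package. The sketch below uses a made-up graph and two hypothetical helpers: `needs_bump` finds every direct or transitive dependent, and `bump_recipe` increments the recipe suffix of a version written in the `1.67.0-3` style.

```python
from collections import deque

# Hypothetical graph: package -> packages it requires.
REQUIRES = {
    "app": ["net", "gui"],
    "net": ["boost", "zlib"],
    "gui": ["zlib"],
    "boost": [],
    "zlib": [],
}


def needs_bump(changed, requires):
    """Return every package that directly or transitively requires `changed`."""
    # Invert the edges so we can walk from a package to its dependents.
    dependents = {name: [] for name in requires}
    for name, deps in requires.items():
        for dep in deps:
            dependents[dep].append(name)
    seen, queue = set(), deque([changed])
    while queue:
        for parent in dependents[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def bump_recipe(version):
    """'1.67.0-3' -> '1.67.0-4': keep the library version, bump the recipe suffix."""
    library, _, recipe = version.rpartition("-")
    return f"{library}-{int(recipe) + 1}"


print(sorted(needs_bump("zlib", REQUIRES)))
print(bump_recipe("1.67.0-3"))
```

Even in this tiny graph a change to one low-level library forces a bump in every package above it, which is the manual cost being described.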

To improve this very heavy workflow we would like to use version ranges, but we don't want to give up on build reproducibility! We believe that a lock file like the one discussed in this topic would enable that: libraries can specify their dependencies using version ranges, and we can lock the concrete versions being used at the application level (where dependency resolution, options, etc. are evaluated).

To conclude: we are very interested in this feature. Is it being actively developed? Would you accept contributions? Where should we start (assuming we need to get familiar with the code, etc.)?

jrickman commented 5 years ago

@siu What my group ended up doing was to develop a separate tool. It reads the range specifications from a manifest file, runs conan install, and generates a conanfile.txt with the resolved versions. Both the range manifest file and the generated conanfile.txt are checked into the repo.

Since some projects use the conan/cmake integration script, the tool supports writing CMake-compatible output instead of conanfile.txt. It also supports python-compatible output.

Simple example:

conandeps.py

manifest(
    'conanfile',
    generator='cmake',
    imports=['bin, *.dll -> ./bin'])
requires('foo/[^1.0.0]@cmake/stable')
requires('bar/[>1.0]@cmake/stable')

This will generate a conanfile.txt with the expected sections for generator, imports, and the specific versions resolved by the range constraints given.
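The generation step could look roughly like the sketch below, which renders already-resolved references into the sections named above. The `render_conanfile_txt` function and the resolved version numbers are hypothetical; this is not the tool's actual code, and the real tool obtains the versions by running conan install.

```python
def render_conanfile_txt(requires, generators, imports):
    """Render the given entries as conanfile.txt text, one section per list."""
    sections = [("requires", requires), ("generators", generators), ("imports", imports)]
    lines = []
    for title, entries in sections:
        if not entries:
            continue  # leave out sections that have no entries
        lines.append(f"[{title}]")
        lines.extend(entries)
        lines.append("")  # blank line between sections
    return "\n".join(lines)


# Hypothetical result of resolving foo/[^1.0.0] and bar/[>1.0].
text = render_conanfile_txt(
    requires=["foo/1.4.2@cmake/stable", "bar/1.3.0@cmake/stable"],
    generators=["cmake"],
    imports=["bin, *.dll -> ./bin"],
)
print(text)
```

Checking the generated file in next to the manifest means the exact versions in use are visible in version control history.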

Complex example:

conandeps.py

manifest('conanfile')
group('linux', profile='gcc-5')
group('windows', profile='msvc14')
requires('foo/[^1.0.0]@cmake/stable')
requires('bar/[>1.0]@cmake/stable')
requires('win32_helpers/1.1.0@cmake/stable', group='windows')

This will generate two files named conanfile-linux.txt and conanfile-windows.txt with the appropriate resolved versions.

memsharded commented 5 years ago

Hi @siu , @jrickman

Just FYI, this is a hot topic right now, and we are working on this feature. It is a very challenging one, and coming up with something general enough for the wider community is difficult. But at least you know that it is a high priority for us. It might take some time, but we are on it. Stay tuned :)

Thanks very much for your feedback!

progician commented 5 years ago

Hopefully it won't be long. Recently I had to remove all version ranges from our system because of problems with reproducing builds. However, without version ranges, ABI backward compatibility is becoming an issue (see my other ticket). Please let us know how it is coming along!

dobrypd commented 5 years ago

Hi @progician

You may be interested in my PR. It may be helpful. https://github.com/conan-io/conan/pull/4513

memsharded commented 1 year ago

conaninfo.txt evolved in 2.0 to support the minimal information necessary to compute the package_id, but no more. So reproducible builds require lockfiles (which have also been greatly simplified and improved in Conan 2.0). Closing this ticket, feedback (as new tickets preferably) welcome.