conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

Package revisions #798

Closed annulen closed 5 years ago

annulen commented 7 years ago

Most package managers have a concept of package revision, i.e. an additional version number that reflects changes in packaging scripts or applied patches while the "main" version number of the packaged software remains the same.

It would be great if Conan added support for revisions too. This would make package updates more transparent ("updated from vX.Y.Z-r1 to vX.Y.Z-r2"). There could also be a policy that the "stable" channel can never change the conanfile or binaries without bumping the revision, to prevent accidental changes in packages used in CI with manifest verification.

It would be great if it was possible to keep binary packages for previous revisions, so that a CI system with manifest checking does not break when a new revision is uploaded without committing new reference manifests.

It was previously briefly discussed at https://github.com/conan-io/conan/issues/480#issuecomment-247545547

memsharded commented 6 years ago

There is something in @sztomi's approach that I like: being very explicit, and using automation to support being explicit. Full reproducibility, always.

Regarding your questions above, I think going with 1.2.3-something is not the way to go. As specified by semver, those are pre-releases, which are used in quite the opposite way to revisions: pre-releases are only used when requested explicitly, while revisions are meant to be picked up by default.

Leaving the revision out of the package reference is not possible if we want to satisfy the requirement of being able to explicitly opt in to a specific revision. Using version ranges by default, so that Pkg/1.2.3@user/testing automatically resolves to the latest revision, is not feasible: it doesn't scale, and the performance penalty is not acceptable for most users.

So, if anything, maybe the way to go would be to force semver, and to use revisions as the fourth version element X.Y.Z.R and consume it with something like Pkg/[X.Y.Z]@user/channel. I don't know to which extent this is possible without having to rebuild the node-semver library or without breaking existing behavior.

sztomi commented 6 years ago

maybe the way to go would be to force semver

I disagree for a variety of reasons, but it all boils down to real-world use-cases.

In my experience, semver is one of those things that is designed and works well in theory, but in practice it's a false sense of security. Some projects that claim to use semver fail to deliver on the stability promises. Other projects just use a version that looks like semver, but isn't. You might not be able to add the revision as the fourth version element because a particular package might already have a fourth element in its version. It's all a crazy mess and establishing a standard on the package maintainer level causes more harm than good, imo. Libraries might also decide to change their versioning scheme from one version to another.

In our implementation we actually have a separate attribute for the revision number, but it is ultimately appended to the version number for conan (by our conanfile base class).
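
A minimal sketch of the idea described above, assuming a hypothetical base class that keeps the upstream version and the packaging revision as separate attributes and joins them only when producing the version string handed to Conan. The names `RevisionedRecipe`, `upstream_version` and `revision`, and the `-r` separator, are illustrative assumptions; they are not part of Conan and not the commenter's actual code.

```python
class RevisionedRecipe:
    """Hypothetical base class; a real recipe would also derive from ConanFile."""

    upstream_version = None  # version of the packaged software, e.g. "1.2.11"
    revision = 1             # packaging revision, bumped when the recipe or patches change

    @property
    def version(self):
        # The revision is appended only here, so the upstream version stays
        # untouched whatever versioning scheme the library uses.
        return "{}-r{}".format(self.upstream_version, self.revision)


class ZlibRecipe(RevisionedRecipe):
    upstream_version = "1.2.11"
    revision = 3


print(ZlibRecipe().version)  # 1.2.11-r3
```

Keeping the two values apart means a script can bump `revision` without having to understand how the upstream version is structured.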

mmatrosov commented 6 years ago

@sztomi thanks for sharing your workflow! We wanted to use version ranges in recipes, but always specify concrete versions in conanfile. This way we won't need to change recipes when we introduce a new revision or version, but will still have perfect reproducibility.

@memsharded did you think of expanding reference to something like Pkg/1.2.3/4@user/channel, or Pkg/1.2.3#4@user/channel, where only the part 1.2.3 is considered to be semver, while 4 is treated as revision?

memsharded commented 6 years ago

We have done some preliminary analysis of a server-side solution for this feature, which could be interesting. However, it requires development in all servers simultaneously. Moving this to 1.3; for that release we hopefully will have a full proposal of how it will work in the servers, but unfortunately it won't be possible to have a full implementation.

DavidZemon commented 6 years ago

Glad to see this is being discussed and maybe even worked on. This has been plaguing me ever since I first started using Conan and it only just now occurred to me to search for an open issue. For reference, I like the Pkg/1.2.3#1@foo/bar syntax the most, with another / instead of the # being the only other option I see as reasonable. I think manually specifying the package revision should be optional, but that all packages should have it. Recipes in a cache/server generated prior to this feature will be treated as package revision of 0 and new recipes that don't provide a revision = ... attribute will default to 1. For the sake of speed, specifying requires = 'pkg/1.2.3@foo/bar' should resolve as revision 0 but throw a LOUD warning that this is deprecated and that a revision number should be added. Conan v2.0 can remove support for requirements w/o revision (unless range is used of course).

I would go for the name "release" or "revision", though prefixing either with "package_" would be reasonable too given it's far more obvious naming.
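
The defaulting rules proposed in this comment could be sketched as follows. This is only an illustration of the proposal, not Conan behavior: the `name/version#revision@user/channel` syntax is the one suggested above, and the helper names `required_revision` and `recipe_revision` are hypothetical.

```python
import re
import warnings

# Hypothetical syntax from the comment above: name/version[#revision]@user/channel
_REF = re.compile(
    r"^(?P<name>[^/@#]+)/(?P<version>[^/@#]+)(?:#(?P<revision>\d+))?"
    r"@(?P<user>[^/@#]+)/(?P<channel>[^/@#]+)$"
)


def required_revision(reference):
    """Return the package revision a requirement string resolves to."""
    match = _REF.match(reference)
    if match is None:
        raise ValueError("not a valid reference: {!r}".format(reference))
    if match.group("revision") is None:
        # Proposed rule: a requirement without a revision means revision 0,
        # accompanied by a loud deprecation warning.
        warnings.warn(
            "{} has no package revision; assuming 0".format(reference),
            DeprecationWarning,
        )
        return 0
    return int(match.group("revision"))


def recipe_revision(declared=None, predates_feature=False):
    """Revision of a recipe: 0 if it predates the feature, else declared or 1."""
    if predates_feature:
        return 0
    return 1 if declared is None else declared
```

With these rules, `required_revision("pkg/1.2.3#1@foo/bar")` returns 1, while `required_revision("pkg/1.2.3@foo/bar")` returns 0 and emits the warning.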

ringgelerch commented 6 years ago

Are there any updates on the schedule for when this issue will be solved? Revision support is an important feature for us as well.

danimtb commented 6 years ago

Hi @ringgelerch,

There is a huge WIP #3055 and we have been internally discussing all the use cases of this feature. We know it is an important feature in our roadmap but cannot commit to a release date yet.

Thanks for your patience and stay tuned.

DavidZemon commented 6 years ago

Excellent news! Thank you for the update!

bradenmcd commented 6 years ago

I'll add my voice to those who've expressed a need for this feature.

We build several open source packages as dependencies for software we write. These need to be built with particular flags/options that are important to us; and frequently the upstream source must be patched. (We build dependencies as static libraries; but that's not terribly important for this discussion.) The particular revision of the build scripts and patch set needs to be expressed in the identity of the binary package; and it doesn't make sense to use the package's version for this (since we don't "own" that).

This scenario is very nearly the same as that for creating packages for Linux distributions; and, accordingly, we need a release number for pretty much the same reasons.

I'll also add that it's useful to have that release number—or, more accurately, release version—follow the expression [0-9](\.[0-9])*, rather than use a single integer. This can come in handy should the need arise to branch the package build scripts. E.g., I've created package 1.1.2 release 4 and the current release of my software uses it; but a previous release of my software using 1.1.2 release 3 needs a fix to the dependency that would not be satisfied by upgrading to 1.1.2 release 4 (or upgrading to that release is otherwise problematic). It makes the most sense to branch from the 1.1.2 release 3; but, of course, this new package release needs a new release number. I don't want to call it 5; because this should not be seen as an "upgrade" for users of 1.1.2 release 4. Instead, I want to call it 1.1.2 release 3.1.
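
The ordering this example relies on (release 3 < release 3.1 < release 4) can be sketched by comparing dotted release strings as tuples of integers. The helper name `release_key` is hypothetical, and the pattern below assumes that each component may have more than one digit, which is slightly broader than the expression quoted above.

```python
import re

# Assumption: each component may have several digits, e.g. "3.10".
_RELEASE = re.compile(r"^[0-9]+(\.[0-9]+)*$")


def release_key(release):
    """Turn a dotted release string into a tuple that sorts numerically."""
    if not _RELEASE.match(release):
        raise ValueError("invalid release version: {!r}".format(release))
    return tuple(int(part) for part in release.split("."))


releases = ["4", "3", "3.1"]
print(sorted(releases, key=release_key))  # ['3', '3.1', '4']
```

Because tuples compare element by element, a branched release such as 3.1 sorts after 3 but before 4, so it is never offered as an upgrade to users of release 4.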

DavidZemon commented 6 years ago

@bradenmcd, I would actually argue that the revision needs to be either fully semantic or just a single integer (and personally, I'd prefer a single integer). Your use case is just one tiny step away from someone else saying "I'd like to fully express major/minor/patch updates."

bradenmcd commented 6 years ago

Your use case is just one tiny step away from someone else saying "I'd like to fully express major/minor/patch updates."

@DavidZemon, it's really not. As my example demonstrates, it's about identifying a new release that falls logically between existing fielded releases. That is all it's about; but, you cannot do that with a single integer.

Semantic versioning is about what increments to particular version components mean for a software interface. That's a huge leap (requiring gobs more logic to understand) relative to what I'm asking for. I'd also suggest that "semantic versioning" as it might apply to a binary package release version (separate from a software package version) is not something that's at all well-defined; thus a request for it in this context is not meaningful without providing that definition.

lasote commented 6 years ago

With the current Conan/revisions model, if you change the "version" field you have a different package, with its own revisions. But I think that, in the scenario you described, this is correct: it has to be a different package (1.1.2 release 3.1 or whatever) because users will depend explicitly on that version and have to perceive the change. The revisions mechanism is mostly a way to guarantee reproducibility and resolving the latest, the current POC (more than a POC now) is based on:

mmatrosov commented 6 years ago

The "revision number" is based on a commit hash of the code or the hash of the files when the SCM is missing. So you keep a direct correlation between your SCM and your package revisions.

That's interesting. According to this POC, will the revision be a part of a reference? And if yes, could you please give an example on how such reference might look like?

lasote commented 6 years ago

According to this POC, will the revision be a part of a reference? And if yes, could you please give an example on how such reference might look like?

Yes, it will be part of a reference, but usually it won't be specified in a requirement; instead it will be part of a frozen reference in a lock file, although you could specify it. The revision will be a commit hash:

lib/1.0@conan/stable#d763a453f0940f06f85102fbcc519c3c694caf8f

You could "go back in time" a revision by looking at the git log of your scm.
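
For the fallback mentioned earlier in the thread (a hash of the files when no SCM is available), a content-based revision could be derived roughly as follows. The thread does not describe the hashing scheme Conan actually uses, so the choice of SHA-1, the way paths and contents are combined, and the function name `content_revision` are all assumptions of this sketch.

```python
import hashlib


def content_revision(files):
    """Derive a revision ID from file contents.

    `files` maps relative paths to their bytes. Paths are visited in sorted
    order so the result does not depend on dictionary ordering.
    """
    digest = hashlib.sha1()
    for path in sorted(files):
        file_hash = hashlib.sha1(files[path]).hexdigest()
        # One line per file: "<path>\0<hash of its content>\n"
        digest.update("{}\0{}\n".format(path, file_hash).encode("utf-8"))
    return digest.hexdigest()


rev = content_revision({"conanfile.py": b"print('recipe v1')\n"})
print(len(rev))  # 40
```

Any change to a file's name or content yields a different ID, while re-exporting identical files yields the same one.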

groovyd commented 5 years ago

I am very encouraged to see how much consideration of this issue has already happened and would only like to cheer for the conan devs in hopefully releasing something that solves the issue soon. Just wondering what is the plan for roll-out of this or is it still very much a design effort?

groovyd commented 5 years ago

According to this POC, will the revision be a part of a reference? And if yes, could you please give an example on how such reference might look like?

Yes, it will be part of a reference, but usually it won't be specified in a requirement; instead it will be part of a frozen reference in a lock file, although you could specify it. The revision will be a commit hash:

lib/1.0@conan/stable#d763a453f0940f06f85102fbcc519c3c694caf8f

You could "go back in time" a revision by looking at the git log of your scm.

while this appears very clever it doesn't seem incrementally intuitive for a user? for example just staring at some hash values i have no idea if it is more recent than any other unless i have access to the scm. in the case of no scm the file hash tells me nothing about which one is more recent. it is also in practice hard to have quick discussions about packages and local version history if you need to remember hashes. something like 'hey are you using revision 1 or 2 because i heard 3 fixes both issues.'

DavidZemon commented 5 years ago

That's a pretty annoying way to do it... I think I understand why... but I don't think I like it. But maybe I'm just being averse to change here.

But just like groovyd said, this is really verbose and does not work well for humans talking to each other about what version to use.

Is there a way to force a specific revision number in the recipe, instead of letting Conan auto-compute the hash? I certainly like the idea of the hash as a fallback if no other revision was provided - it's brilliant. But I don't like it as a forced option that can't be overridden.

memsharded commented 5 years ago

@groovyd The work on revisions is quite advanced; some parts, like API v2, have already been merged. We are investing a lot of time in this, as it is a high priority for us. Now we are researching the model and flows for how to actually consume those revisions and use them in CI and "lockfiles" to have reproducible builds. A challenging problem, but also a high priority for us.

@DavidZemon I am proposing the possibility of letting users define their own revision numbers, both for recipes and binaries (like the build ID of the CI build, for example). It still needs to be discussed, but I have it in mind.

Thanks both for the feedback!

groovyd commented 5 years ago

cheers and thanks for listening.

earonesty commented 5 years ago

Just an FYI (and maybe i'm doing something wrong). I use the commit hash as a channel in the ci/cd to deal with this issue. Seems to solve (so far)

memsharded commented 5 years ago

Hi @earonesty

No, that is not a bad approach, and it can work. The main issue is that it might require a lot of work on the CI side when you want to propagate that channel down the dependency graph so that consumers start to use that specific commit of the dependency. Could you share a little more about how you apply it in the consumers of those packages? Thanks!

earonesty commented 5 years ago

We just stick: vapilib/1.0@vida/4e95ec56ced10b333647b69dcfae60fc789bc209 in the conanfile.txt of dependent modules. Then those modules consume the correct commit and we don't have to worry about new versions of vapilib breaking things on a daily basis.

Later, if someone wants to get the most recent vapilib, they can just git ls-remote --refs git@gitlab.com:vidaid/vapilib.git stable to get the latest ref to use.. which we have enshrined in a "update-conanfile [tag] [libname1...] " script that walks through and updates refs (by default "everything/stable")

Of course if we have two dependent modules set to consume different refs, things can get wacky.... but that's the same as versions (only worse). Seems ok (to me) for "new projects" or fast-moving things that don't have deep dependency trees.
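
The kind of rewrite such an update script performs could look like the sketch below, which swaps the channel part of a requirement line for a new ref. The function name `repin` and its behavior are hypothetical; this is not the commenter's actual "update-conanfile" script.

```python
import re


def repin(conanfile_txt, libname, new_channel):
    """Replace the channel of every `libname/version@user/channel` line."""
    pattern = re.compile(
        r"^(?P<head>[ \t]*{}/[^@\s]+@[^/\s]+/)\S+".format(re.escape(libname)),
        re.MULTILINE,
    )
    # A function is used as the replacement so that a hash starting with
    # digits is never misread as a group reference.
    return pattern.sub(lambda m: m.group("head") + new_channel, conanfile_txt)


text = (
    "[requires]\n"
    "vapilib/1.0@vida/stable\n"
    "otherlib/2.1@vida/stable\n"
)
print(repin(text, "vapilib", "4e95ec56ced10b333647b69dcfae60fc789bc209"))
```

Only lines for the named library are touched, so other requirements keep whatever channel they already had.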

lasote commented 5 years ago

The package revisions feature has been deployed in Conan 1.10 in the client and conan_server but still locked. If anyone is interested in giving it a try ping me, I can tell you how to unlock the feature and play with it. Thanks!

DavidZemon commented 5 years ago

This guy is interested! (Email removed for privacy)

mmatrosov commented 5 years ago

@lasote where can we read about chosen design? Cannot see anything on the docs.

lasote commented 5 years ago

Hi, @mmatrosov sorry for the delay. I created a wiki page here: https://github.com/conan-io/conan/wiki/Revisions-experimental-feature explaining what it is, how to try it and what is the release plan. Thanks for your interest.

iiknd commented 5 years ago

I found this ticket accidentally and it appears to solve the "issue" I've tried to describe here: https://github.com/conan-io/conan/issues/4569

As far as I've understood, this feature could be used to "pin" the latest package, including the dependencies. The documentation at https://github.com/conan-io/conan/wiki/Revisions-experimental-feature was a bit unclear to me.

How would it look like if I wanted to package pre-built binaries from producer and consumer point of view?

project/1.0@user/rc1 # nightly snapshot, depends on A/1.0 and B/1.0
A/1.0@user/rc1 # nightly snapshot
B/1.0@user/rc1 # nightly snapshot

Like this?

Producer: project/1.0@user/rc1#1.0-1234 requires: "A/1.0@user/rc1", "B/1.0@user/rc1" # should pick the latest revision?

Consumer:

conan install project/1.0@user/rc1 # should pick the latest revision?

danimtb commented 5 years ago

hi @unzap,

Yes, this feature will fit the use case you described in #4569.

The idea is that every time you create a new package with the same reference (lib/version@user/channel), Conan will create a new revision from a hash of the recipe content or from the git/svn commit (lib/version@user/channel#recipe_revision).

This also works for package references (lib/version@user/channel#recipe_revision:package_id). Even if the recipe does not change but you change the binary in the package (you make changes in the source code of the library), Conan will also create a new package revision (lib/version@user/channel#recipe_revision:package_id#package_revision).
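
To make the reference layout concrete, here is a small parser for the notation used in this comment (`lib/version@user/channel#recipe_revision:package_id#package_revision`). It is an illustrative sketch: `FullRef` and `parse_reference` are hypothetical names, and the regular expression is an assumption based only on the examples in this thread, not Conan's own parsing code.

```python
import re
from collections import namedtuple

FullRef = namedtuple(
    "FullRef",
    "name version user channel recipe_revision package_id package_revision",
)

_FULL_REF = re.compile(
    r"^(?P<name>[^/@#:]+)/(?P<version>[^/@#:]+)"
    r"@(?P<user>[^/@#:]+)/(?P<channel>[^/@#:]+)"
    r"(?:#(?P<recipe_revision>[^/@#:]+))?"
    r"(?::(?P<package_id>[^/@#:]+)"
    r"(?:#(?P<package_revision>[^/@#:]+))?)?$"
)


def parse_reference(text):
    """Split a reference into its parts; missing parts are None."""
    match = _FULL_REF.match(text)
    if match is None:
        raise ValueError("not a valid reference: {!r}".format(text))
    return FullRef(**match.groupdict())


ref = parse_reference(
    "lib/version@user/channel#recipe_revision2:package_id1#package_revision3"
)
print(ref.recipe_revision, ref.package_id, ref.package_revision)
# recipe_revision2 package_id1 package_revision3
```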

So you would have this kind of structure if you upload it to the remote:

lib/version@user/channel
  - recipe_revision1
    - package_id1
       - package_revision1
       - package_revision2
    - package_id2
       - package_revision1
  - recipe_revision2
    - package_id1
       - package_revision1
       - package_revision2
       - package_revision3

The interesting part here is that consumers wouldn't have to change the reference being installed.

If they install the normal reference they will get the latest recipe revision and the latest package revision suitable for their configuration:

In the above example, a conan install lib/version@user/channel command (let's suppose we are looking for a package with the settings hashed in package_id1) will download the following full reference: lib/version@user/channel#recipe_revision2:package_id1#package_revision3
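
The resolution described here can be sketched against the tree shown above. The data layout and the function `resolve_latest` are hypothetical, and the sketch assumes revisions are stored oldest first, so "latest" is simply the last entry.

```python
# Revisions are listed oldest first; this ordering is an assumption of the sketch.
REMOTE = {
    "lib/version@user/channel": [
        ("recipe_revision1", {
            "package_id1": ["package_revision1", "package_revision2"],
            "package_id2": ["package_revision1"],
        }),
        ("recipe_revision2", {
            "package_id1": [
                "package_revision1", "package_revision2", "package_revision3",
            ],
        }),
    ],
}


def resolve_latest(remote, reference, package_id):
    """Pick the latest recipe revision, then the latest matching package revision."""
    recipe_revision, packages = remote[reference][-1]
    if package_id not in packages:
        raise LookupError(
            "{}#{} has no binary for {}".format(reference, recipe_revision, package_id)
        )
    package_revision = packages[package_id][-1]
    return "{}#{}:{}#{}".format(
        reference, recipe_revision, package_id, package_revision
    )


print(resolve_latest(REMOTE, "lib/version@user/channel", "package_id1"))
# lib/version@user/channel#recipe_revision2:package_id1#package_revision3
```

Note that in this sketch a request for package_id2 fails rather than falling back to recipe_revision1; whether Conan falls back is not stated in the thread.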

Hope it helps to clarify the idea on this new feature and that you find it useful! 😄

iiknd commented 5 years ago

hi @danimtb,

Hope it helps to clarify the idea on this new feature and that you find it useful!

Yes this feature is very welcome!

A couple questions still :)

Correct me if I understood it wrongly but it looks fully automatic from consumer and producer perspective? Or do I (producer) need to add the revision, e.g. append the nightly CI snapshot number after the channel like "project/1.0@user/rc1#1.0-1234"? If the revision feature is automatic what is the mechanism to remove (clean) old revisions from server, how to find/reference those in the "remove" command?

Does this work along with the "alias" command? If I want to e.g. flag "rc" as "final", will the "final" alias point to the latest revision automatically (my assumption, but just checking)?

danimtb commented 5 years ago

Yes, from the producer side it is currently automated by Conan and no input revision can be given, but it is something we are considering to implement. However, I think the commit hash will be more useful to track the changes as you can make a direct association source code change <> new conan package.

Regarding the commands to work with revisions, if you don't specify it manually you will get the latest resolved. However you would be able to reference the recipe revision in the installation/requirements: conan install lib/version@user/channel#recipe_revision2

The same but including packages can be done for removal/deletion: conan remove lib/version@user/channel#recipe_revision2:package_id1#package_revision3 --remote remote1 although this is something that you won't be doing often as it could affect you regarding traceability of package generation.

There are also commands for searching recipe and package revisions too! conan search lib/version@user/channel --revisions --remote remote1 and conan search lib/version@user/channel#recipe_revision2:package_id1 --revisions --remote remote1

Regarding alias, I am not sure about their behavior with revisions, but I'd say you could point them to any reference, with or without a recipe revision.

mmatrosov commented 5 years ago

Allow me to play a little pessimistic chord here. The whole design looks a little over-generalized to me. Especially the decision to use hashes as revisions. Let's consider your example from above (https://github.com/conan-io/conan/issues/798#issuecomment-467797171). Substituted with real values it would look like this:

lib/version@user/channel
  - recipe_revision_10cfc30b3a761f7052a545d22d0d52953c636800
    - package_id_1ce3a9dce5ee5f7adb42f16ea1e812d187502583
       - package_revision_f1e462b1d8d9cffdad98b1c373113671621dd88e
       - package_revision_d7c16de57903c4a94e17a480a9b1552d40df13a9
    - package_id_00b17467a14c6546026c59bbc3724e711d9f199d
       - package_revision_2941f23f8ae5f96996d70d11dc029b644e18492d
  - recipe_revision_8e8b05bbaf9d18277ee41c052cb2cbc44080cd0b
    - package_id_c2053771cd94bcb3078d620d01e11e5cf97726b8
       - package_revision_a48c9ba329b778e36dd6d261fadbc3ff67068277
       - package_revision_9ab2e86421bdd7e90c17c715923353634c482b8c
       - package_revision_ca586123ce701b4bc3d88d74f5606d28c8e965c0

Simple question: how do I order these things? We have already had hashes as packages ids from the very birth of conan, but this is ok, since order on packages does not make any sense. But it does for revisions.

Looking at your implementation, I feel like we will continue using our custom homebrew approach, see https://github.com/conan-io/conan/issues/3158#issuecomment-402129907.

iiknd commented 5 years ago

Yes, from the producer side it is currently automated by Conan

Good, this should make it easier from producer and consumer point of view

although this is something that you won't be doing often as it could affect you regarding traceability of package generation.

@danimtb In our case probably not daily but weekly, in the "worst" case one week can produce roughly 50G-100G worth of snapshot binaries so we want to clean up old unused revisions.

Ps. it would be a handy feature to have it work something like this:

conan remove --follow-dependencies --clean-old-revisions=<some rules here, e.g. keep the latest only etc.>

@mmatrosov

I feel like we will continue using our custom homebrew approach

Afaik changing the version number in "reference" will produce new package repositories, how the consumers can update the packages as the reference has changed? Or are you using the alias mechanism here?

Edit: Is the work flow (consumer) below actually possible?

> conan install lib/1.0-1@user/channel
> conan install --update lib/1.0-2@user/channel  # revision changed

I.e. can you update a package from a different reference?

mmatrosov commented 5 years ago

@unzap

Afaik changing the version number in "reference" will produce new package repositories,

If by "new package repositories" you mean "new almost completely unrelated packages", then yes, you are correct.

how the consumers can update the packages as the reference has changed?

Manually. The whole approach was invented to ensure reproducibility of our builds. I.e. when someone checks out particular git revision in our source code and builds the solution, they will obtain exactly the same packages, no matter when they do it. Even if there are some more modern revisions uploaded on the server.

This means that when we prepare a new revision of a package, no one gets it automatically. We have a special file called CurrentReferences.yaml, stored and updated alongside our code, which contains, well, the currently used references for all libraries that we use. Thus, if you want your new revision to actually participate in the build, you update the corresponding reference in CurrentReferences.yaml. Simple, traceable, reliable, reproducible.

Same for recipes that reference updated recipe. Want to reference latest revision? Go and manually update requires field. By "manually" I mean it could easily be done with an automated script, but anyway it will be directly reflected in the code of the recipe. And yes, updating requires of a recipe means it gains new revision. Thus, when we update a revision for a particular recipe, it should be propagated downstream, updating revisions for all referencing recipes. All these updates should be reflected in CurrentReferences.yaml.

If something goes out of sync, you might have different libraries referencing different versions (meaning different revisions) of a single library. Surprisingly, conan is ok with this. That's why I created https://github.com/conan-io/conan/issues/2800. Please go and cast a vote :)
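
A check for the out-of-sync situation described here could be sketched as follows: collect every reference requested for each library and report the libraries requested under more than one reference. The function name `find_conflicts`, the input layout, and the example references are all hypothetical.

```python
from collections import defaultdict


def find_conflicts(requirements):
    """Return {library name: set of references} for libraries that are
    requested under more than one reference.

    `requirements` maps a consumer to the list of references it requires.
    """
    seen = defaultdict(set)
    for references in requirements.values():
        for reference in references:
            name = reference.split("/", 1)[0]
            seen[name].add(reference)
    return {name: refs for name, refs in seen.items() if len(refs) > 1}


requirements = {
    "app": ["zlib/1.2.11-2@team/stable", "png/1.6.37-1@team/stable"],
    "png/1.6.37-1@team/stable": ["zlib/1.2.11-1@team/stable"],
}
print(find_conflicts(requirements))
```

Here `app` and `png` disagree about which revision of `zlib` to use, so `zlib` is reported with both references.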

Or are you using the alias mechanism here?

No, we do not.

Edit: Is the work flow (consumer) below actually possible? "conan install --update ..."

I believe not, because we actually have different references for every revision.

danimtb commented 5 years ago

@mmatrosov

Simple question: how do I order these things? We have already had hashes as packages ids from the very birth of conan, but this is ok, since order on packages does not make any sense. But it does for revisions.

Actually new revisions will get a timestamp in the server and you would be able to see the chronological order of recipe revisions in the output of conan search <reference> --revisions (same concept for package revisions).
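
The point about timestamps can be illustrated with a short sketch: the hashes themselves carry no order, but pairing each one with its upload time makes the sequence recoverable. The function name `chronological` is hypothetical, the hashes are the ones from the example earlier in the thread, and the upload times are made up for illustration.

```python
from datetime import datetime


def chronological(revisions):
    """Return revision IDs oldest first, given (revision_id, upload_time) pairs."""
    return [rev for rev, _ in sorted(revisions, key=lambda item: item[1])]


# Hashes come from the example earlier in the thread; the times are invented.
revisions = [
    ("8e8b05bbaf9d18277ee41c052cb2cbc44080cd0b", datetime(2019, 2, 20)),
    ("10cfc30b3a761f7052a545d22d0d52953c636800", datetime(2019, 2, 18)),
]

ordered = chronological(revisions)
print(ordered[-1])  # the newest revision comes last
```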

Your approach above seems reasonable if you are controlling the automatic bumping of requirements in a recipe when you want to update them, but you have to rely on something external, probably automated in your CI or a git hook.

Regarding reproducibility, we are working on the concept of "graph locks" to get files that can be used to reproduce all the dependencies and their relations used in a conan install so you can get the exact same output of a CI build for example without forcing to have "pinned" references.
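
The thread gives no details on what a graph lock contains, so the following is only a rough illustration of the general idea: record the fully resolved reference for each requested one, then replay that record later instead of resolving "latest" again. The JSON layout and the names `write_lock` and `locked_reference` are assumptions of this sketch and not Conan's lock file format.

```python
import json


def write_lock(resolved):
    """Serialize {requested reference: fully resolved reference} deterministically."""
    return json.dumps(resolved, indent=2, sort_keys=True)


def locked_reference(lock_text, requested):
    """Return the pinned reference for `requested`, failing if it is not locked."""
    locked = json.loads(lock_text)
    if requested not in locked:
        raise LookupError("{} is not in the lock".format(requested))
    return locked[requested]


lock = write_lock({
    "lib/version@user/channel":
        "lib/version@user/channel#recipe_revision2:package_id1#package_revision3",
})
print(locked_reference(lock, "lib/version@user/channel"))
```

Recipes keep the plain reference, while the lock stored next to the build records exactly which revisions were used.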

Finally, I forgot to say that the revisions feature would be something opt-in via conan.conf/env var for the moment and you would still be able to use the same mode without revisions.

mmatrosov commented 5 years ago

we are working on the concept of "graph locks"

This looks intriguing. Could you please provide some details?

the revisions feature would be something opt-in

I'm not a fan of "opt-in" features. Roughly speaking, if a feature is good, make it available for everyone. If it is not, why it is needed at all? Do you plan to make it available by default eventually?

kenfred commented 5 years ago

I proposed the concept of the "graph lock" in #101. It is similar to the concept used in npm or yarn. It didn't seem to get any traction with the team. I think the lack of this is a critical hole in conan and an invitation for build reproducibility problems.

DavidZemon commented 5 years ago

@kenfred,

I think it would be a great idea if you opened a new ticket with your lock file idea. I would certainly vote for it. My team was planning to implement our own version of a lock file because we absolutely need that level of reproducibility for a given build of a project.

danimtb commented 5 years ago

@mmatrosov

I'm not a fan of "opt-in" features. Roughly speaking, if a feature is good, make it available for everyone. If it is not, why it is needed at all? Do you plan to make it available by default eventually?

The revisions feature changes the behavior of Conan in some flows, and we cannot break existing behavior. Revisions will probably become the default in Conan 2.0.

kenfred commented 5 years ago

@DavidZemon While opening an issue, I came across #1042. I'm going to add some comments there.