[feature] conan graph explain should mention if a missing prebuilt binary exists for a different recipe revision

conan-io / conan

Conan - The open-source C and C++ package manager

https://conan.io

MIT License

8.22k stars 979 forks source link

[feature] conan graph explain should mention if a missing prebuilt binary exists for a different recipe revision #16111

Closed iaavo closed 6 months ago

iaavo commented 6 months ago

What is your suggestion?

conan graph explain is an amazing feature and helps a lot in debugging, thanks for adding that! Unfortunately I came across one situation where it didn't properly explain.

The problem stems from the confusion introduced by some users running conan install . while others use conan install . --update. Even when both commands resolve to the same version, they can still resolve to different recipe revisions.

This can happen in the case when there are two remotes enabled wherein they contain different recipe revisions. Lets take a concrete example:

remotes:

conan-prod-local
- contains:
- pkg/1.0#rrev1:pkgid1
- pkg/1.0#rrev1:pkgid2
conan-dev-local
- contains:,
- pkg/1.0#rrev1:pkgid1
- pkg/1.0#rrev1:pkgid2
- pkg/1.0#rrev2:pkgid1

Consider pkg/1.0#rrev2:pkgid1 having been accidentally uploaded by a dev. Now if we have a different pkg app/1.0 requires pkg/1.0 and having an empty local conan cache:

conan create app/1.0 success, since the conan-prod-local is the first remote to be resolved
conan create app/1.0 --update fail -> missing prebuilt binary for pkg/1.0#rrev2:pkgid2

To be clear, it is an oversight in our build environment that it came to this issue in the first place. But debugging it was quite difficult. Which is why I'm suggesting that conan graph explain should mention that the current recipe revision is missing the requested prebuilt binary but a fitting prebuilt binary exists for a different recipe revision.

Have you read the CONTRIBUTING guide?

[X] I've read the CONTRIBUTING guide

memsharded commented 6 months ago

Hi @iaavo

Thanks for your suggestion.

However I am afraid that this is an intractable problem. Please note that every revision of every recipe creates a totally different dependency graph, as they can use different versions of the dependencies, for example. In order to be able to evaluate things correctly, the dependency graph that conan graph explain is expanding should evaluate all possibilities for every recipe in revision, by fetching all revisions, evaluating each revision in a new hypothesis of the dependency graph. This can takes days to evaluate, not even hours.

Even if we could try to optimize the problem and start from the missing binary, still the problem is highly combinatorial, and as you move down the dependency graph, you might need to do backtracking and re-evaluate hypothesis for other packages as well, as everything is affected when you change revisions. This would make the problem also intractable.

iaavo commented 6 months ago

Thanks for the quick reply,

I see that dealing with such a big search space could be difficult. But that's exactly why automating it would be so beneficial. It would save a ton of time compared to human troubleshooting. Since conan graph explain is for diagnosing, performance might not be the top priority. Though we can still get a reasonable execution time by limiting the hypothesis space to a certain default search depth. I think that would still find 99% of problems, as cases where different recipe revisions causing a combinatorial explosion are probably quite rare. Though I can be wrong here.

I want to stress once more, how great of a feature the addition of conan graph explain is. It really makes diagnosing a lot easier and I would love to see it get improved even more in the future.

memsharded commented 6 months ago

I see that dealing with such a big search space could be difficult. But that's exactly why automating it would be so beneficial. It would save a ton of time compared to human troubleshooting.

No, it doesn't. If you type conan .... and it takes hours to evaluate, the chances that it will not be killed are 100%. And users will complain that it is very slow and must be optimized. But it cannot, it is a combinatoric problem in nature, and evaluating each hypothesis in that space is not cheap, because it involves downloading, unzipping, python loading and evaluating it, it is super expensive.

Though we can still get a reasonable execution time by limiting the hypothesis space to a certain default search depth. I think that would still find 99% of problems, as cases where different recipe revisions causing a combinatorial explosion are probably quite rare. Though I can be wrong here.

The thing is that regarding revisions, this is similar to backtracking to previous versions searching for a binary: it is almost never the right solution. Is like going to an older commit in source and using it in production without even making it the HEAD or the tagged release one. If there are new recipe revisions, those are in the vast majority of cases the ones that must be used, and if a binary is missing for that new revision, it should be built, but not trying to use an older revision just because the binary is there.

A different use case is using lockfiles, you might want, and it is a very common use case, to be able to reproduce an exact dependency graph later in time, for reproducibility. This is not an issue, the lockfile explicitly tells the revisions to use. You can use conan graph explain and provide a lockfile.

In any case, the problem is still intractable both in development time and in execution time. The complexity that such an algorithm will add to Conan is huge, really huge, and it will result in a command that will in tons of cases take way more (hours, days) to evaluate, which is a really no-go, even if automated, it will create more pain both on users and on us than not providing it.

I understand your view, and we also love conan graph explain and we are willing to keep improving it for any case that is reasonable, but I am afraid that unfortunately this one is really impossible at the moment.

iaavo commented 6 months ago

No, it doesn't. If you type conan .... and it takes hours to evaluate, the chances that it will not be killed are 100%. And users will complain that it is very slow and must be optimized. But it cannot, it is a combinatoric problem in nature, and evaluating each hypothesis in that space is not cheap, because it involves downloading, unzipping, python loading and evaluating it, it is super expensive.

You got me convinced, I concede my point :)

The thing is that regarding revisions, this is similar to backtracking to previous versions searching for a binary: it is almost never the right solution. Is like going to an older commit in source and using it in production without even making it the HEAD or the tagged release one. If there are new recipe revisions, those are in the vast majority of cases the ones that must be used, and if a binary is missing for that new revision, it should be built, but not trying to use an older revision just because the binary is there.

Well, the point is to detect when something like this happens and then correct the mistake. Maybe there is another way to go about it? If it would have been a different version that is getting unexpectedly resolved, it can be detected quite easily. Though I think with revisions there is a vital difference in that they are not human readable. It is difficult to detect when an unexpected revision is getting resolved. (half serious: it would be much easier if instead of a long hash it would be a list of unicode emoticons)

In any case you got me convinced that increasing the complexity of conan graph explain is maybe not the right approach. As such, this issue can be closed.

memsharded commented 6 months ago

Thanks for your understanding and your feedback!