conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

[question] How to improve conan performance? #16652

Open DoDoENT opened 2 months ago

DoDoENT commented 2 months ago

What is your question?

When running conan install of an app with a large number of dependencies, it takes a very long time to calculate the dependencies and generate the output. Furthermore, since we need to do this three times (Release, RelWithDebInfo, and Debug), it takes even longer.

Here is the timing output of one such installation, while having everything already in cache (i.e. no downloads were needed from the Artifactory server):

$ time conan install . -pr emscripten-3.1.57-advanced -pr:b macos-ninja-arm64-clang-15.0.0 -s build_type=Release

real    0m38.607s
user    0m37.488s
sys     0m0.542s

Now, do this three times (Release, RelWithDebInfo, Debug), and it takes:

$ time conan project ninja -pr emscripten-3.1.57-advanced --install-only

real    2m11.346s
user    2m7.001s
sys     0m2.133s

(side note: conan project is our custom command that installs all three build types and automatically detects the correct build profile from the active environment - but it basically invokes conan install three times, using the Conan Python API)
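The custom command itself is not shown in the thread. As a rough sketch only, the "install three times" loop it describes could be approximated by building one `conan install` command line per build type, using the same flags as the timed command above. The function name `build_install_commands` is hypothetical, and this uses plain command lines rather than the Conan Python API that the real command uses.

```python
from typing import List, Sequence

# The three build types mentioned in the question.
BUILD_TYPES = ("Release", "RelWithDebInfo", "Debug")


def build_install_commands(
    host_profile: str,
    build_profile: str,
    build_types: Sequence[str] = BUILD_TYPES,
    path: str = ".",
) -> List[List[str]]:
    """Return one `conan install` argv list per build type (hypothetical helper)."""
    return [
        [
            "conan", "install", path,
            "-pr", host_profile,       # host profile
            "-pr:b", build_profile,    # build profile
            "-s", f"build_type={build_type}",
        ]
        for build_type in build_types
    ]


if __name__ == "__main__":
    commands = build_install_commands(
        "emscripten-3.1.57-advanced", "macos-ninja-arm64-clang-15.0.0"
    )
    for command in commands:
        # Only printed here; pass each list to subprocess.run(command, check=True)
        # to actually execute it. Every run recomputes the dependency graph.
        print(" ".join(command))
```

Each of the three invocations resolves the dependency graph from scratch, which is why the total time is roughly three times that of a single install.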

This is a very long wait, especially on a CI system that needs to do this on every build... The project in the example has 73 dependencies and 61 build dependencies listed in its lockfile.

Most of the time is spent after the message "Computing dependency graph" is printed, and I see a single CPU core hogged at 100%.

My questions are:

I'm currently using conan v2.5.0.


memsharded commented 2 months ago

Hi @DoDoENT

Quick question: if all the dependencies are already in the local cache, can you please try adding --no-remote to the commands above and report the results?

I guess we are talking about latest Conan 2.5, is this correct?

DoDoENT commented 2 months ago

Yes, Conan v2.5.0.

Here is the timed Release-only install with --no-remote added:

$ time conan install . -pr emscripten-3.1.57-advanced -pr:b macos-ninja-arm64-clang-15.0.0 -s build_type=Release --no-remote

real    0m39.635s
user    0m38.148s
sys     0m0.659s

DoDoENT commented 2 months ago

I made sure that I had all dependencies downloaded and built before filing this question/perf-bug report. Of course, when Conan needs to download and/or build dependencies, it takes even longer, but that is expected...

memsharded commented 2 months ago

Can you please run the install with --format=json > graph.json and share the output, so we can see the size of the dependency graph being resolved?

DoDoENT commented 2 months ago

graph.json.zip

I've zipped the JSON because the raw file is 71 MB.
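A quick way to gauge the size of such a graph is to count its nodes. The sketch below assumes the JSON keeps its nodes under `graph` → `nodes` (either as a mapping keyed by node id or as a list) and that each node carries a `context` field such as `host` or `build`. Those layout details are assumptions about the file, not something shown in this thread, and `summarize_graph` is a hypothetical helper name.

```python
import json
from collections import Counter
from typing import Any, Dict


def summarize_graph(data: Dict[str, Any]) -> Dict[str, Any]:
    """Count nodes in a parsed graph JSON (layout is an assumption, see above)."""
    nodes = data.get("graph", {}).get("nodes", {})
    # Accept both a mapping of id -> node and a plain list of nodes.
    node_list = list(nodes.values()) if isinstance(nodes, dict) else list(nodes)
    by_context = Counter(node.get("context", "unknown") for node in node_list)
    return {"total": len(node_list), "by_context": dict(by_context)}


def summarize_file(path: str) -> Dict[str, Any]:
    """Load a graph JSON file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_graph(json.load(handle))


if __name__ == "__main__":
    # Tiny synthetic graph standing in for the real 71 MB file.
    sample = {
        "graph": {
            "nodes": {
                "0": {"ref": "app/1.0", "context": "host"},
                "1": {"ref": "zlib/1.3", "context": "host"},
                "2": {"ref": "cmake/3.29", "context": "build"},
            }
        }
    }
    print(summarize_graph(sample))
```

For the real file, `summarize_file("graph.json")` would be the entry point after unzipping.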

memsharded commented 2 months ago

The dependency graph you are resolving contains around 13450 nodes. With 40 seconds of resolution, that averages about 3 ms to evaluate each node in the graph. Given that there are quite costly operations involved, including loading and parsing Python files, loading conandata.yml files, propagating dependencies through the dependency graph, etc., this is not bad.
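The per-node figure follows directly from the two numbers quoted in this thread (about 13450 nodes, about 40 seconds). A minimal check of the arithmetic, with `ms_per_node` as an illustrative helper name:

```python
def ms_per_node(total_seconds: float, node_count: int) -> float:
    """Average time per graph node, in milliseconds."""
    return total_seconds / node_count * 1000.0


if __name__ == "__main__":
    # Rounded figures from the comment above: ~40 s for ~13450 nodes.
    print(f"{ms_per_node(40.0, 13450):.2f} ms per node")
    # Using the measured wall-clock time of the Release install instead.
    print(f"{ms_per_node(38.607, 13450):.2f} ms per node")
```

Both variants land just under 3 ms per node, consistent with the estimate above.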

It is important to note that parallelizing the graph resolution is quite challenging and introduces fragility and issues. We are continuously trying to improve the performance of Conan, but this goes slowly, step by step. The last attempt to parallelize some things had to be reverted because it was problematic.

I'd say there might be ways to reduce the size of the graph. For example, it seems that the explosion of the graph size is due to recursive tool/test requires. Maybe doing something like https://blog.conan.io/2024/07/09/Introducing-vendoring-packages.html can greatly reduce the size of the graph.

DoDoENT commented 2 months ago

OK, I thought so...

I'll see if there is a way to help myself with vendoring...

And what about this question:

Is there a way to install all of the Release, RelWithDebInfo, and Debug versions of packages "in a single run", thus saving the time required to compute the graph?

DoDoENT commented 2 months ago

Since graph computation is expensive, maybe the total project initialization could be much faster if the graph were computed a single time instead of being repeated for each build type?

memsharded commented 2 months ago

This is not possible: requirements can be conditional on the build_type (and other settings), so the dependency graph is often different for different settings, which makes caching and reusing graphs unfeasible.

DoDoENT commented 2 months ago

I know it's possible to have different dependencies based on the build_type, but assuming this rarely happens, only a small number of nodes would expand differently for different build types. Most of the nodes would be the same for all build types, so couldn't they be reused during graph computation to save time?

Or is this difficult to achieve with the current Conan architecture?
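The reuse idea in this last comment can be illustrated with a toy model. Everything in the sketch is hypothetical: the `RECIPES` table, the notion that a recipe declares which settings its requirements depend on, and the `resolve` function are all invented for illustration. It does not reflect how Conan resolves graphs internally and says nothing about whether such caching is feasible there.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Settings = Dict[str, str]
# Toy recipe: (settings its requirements depend on, function returning requirement names).
Recipe = Tuple[FrozenSet[str], Callable[[Settings], List[str]]]

RECIPES: Dict[str, Recipe] = {
    "app": (frozenset(), lambda s: ["liba", "libb"]),
    "liba": (frozenset(), lambda s: ["zlib"]),
    # Only this recipe has a requirement that is conditional on build_type.
    "libb": (
        frozenset({"build_type"}),
        lambda s: ["zlib"] + (["debugtools"] if s["build_type"] == "Debug" else []),
    ),
    "zlib": (frozenset(), lambda s: []),
    "debugtools": (frozenset(), lambda s: []),
}


def resolve(root: str, settings: Settings, cache: dict) -> int:
    """Walk the toy graph; return how many recipe evaluations were cache misses."""
    evaluations = 0
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        used_settings, requires_fn = RECIPES[name]
        # Cache key: recipe name plus only the settings its requirements read.
        key = (name, tuple(sorted((k, settings[k]) for k in used_settings)))
        if key not in cache:
            cache[key] = requires_fn(settings)
            evaluations += 1
        stack.extend(cache[key])
    return evaluations


if __name__ == "__main__":
    build_types = ["Release", "RelWithDebInfo", "Debug"]
    separate = sum(resolve("app", {"build_type": bt}, {}) for bt in build_types)
    shared_cache: dict = {}
    shared = sum(resolve("app", {"build_type": bt}, shared_cache) for bt in build_types)
    print(f"evaluations without sharing: {separate}, with a shared cache: {shared}")
```

In this toy model the saving depends entirely on knowing in advance which settings each recipe's requirements read. A real recipe is arbitrary Python and does not declare that.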