conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License
8.14k stars 970 forks

[Feature] Conan 2 cache concurrency #15840

Open memsharded opened 6 months ago

memsharded commented 6 months ago

This ticket is to gather and centralize all the related tickets. There are 3 different aspects to make the cache concurrent:

Out of the scope:

Conan config install concurrent:

Package concurrency

darakelian commented 4 months ago

I was told to comment here for posterity. My team has a large repo that consists of a "core" package and hundreds of components, each packaged as an individual Conan package with a dependency on the "core" package. Previously, in Conan 1, our build script that called conan create in parallel for all of the component packages worked flawlessly, with build times of around 3 minutes on our CI boxes. After switching to Conan 2, I noticed that the conan create command would often fail because of the cache concurrency issues. To work around this I have had to run our script in serial mode, which inflates our build time to approximately 25 minutes. (These packages are header-only, so in theory we are IO-bound anyway, but it appears we still benefit massively from parallelization.) Since building multiple Conan packages in parallel seemed to work in Conan 1, it would be great if the cache concurrency could be improved in Conan 2.

memsharded commented 4 months ago

Thanks for the feedback @darakelian, this is something that is planned, but it might take a little bit.

In the meantime, you might want to try other approaches to speed things up. For example, you can run the jobs that are independent of each other in different CONAN_HOME folders, then accumulate the packages in a single one with conan cache save/restore. Extra downloads can be avoided by sharing the "download cache folder", which is safe for concurrent use and can be shared among the parallel Conan home folders. For building a dependency graph in parallel, conan graph build-order gives a list of lists of things that can safely be built in parallel.
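A minimal sketch of the separate-home workaround described above, not a tested pipeline. The job names, the homes/ and archives/ folders, and the "*:*" save pattern are assumptions made for illustration. Each fresh Conan home would also need its own profiles, remotes, and any packages the job depends on (such as the "core" package) before conan create can succeed; that setup is not shown. The shared download cache is, as far as I know, configured through the core.download:download_cache conf and is left out here as well.

```shell
# Build one recipe inside its own Conan home, then export what it produced.
# The function body is a subshell, so CONAN_HOME never leaks to the caller.
build_isolated() (
    job="$1"       # hypothetical job name, used for the home folder and archive
    recipe="$2"    # path to the recipe folder passed to "conan create"
    export CONAN_HOME="$PWD/homes/$job"
    mkdir -p "$CONAN_HOME" "$PWD/archives" || exit 1
    conan create "$recipe" &&
        conan cache save "*:*" --file="$PWD/archives/$job.tgz"
)

# Import every saved archive into the Conan home active for the caller.
merge_archives() {
    for archive in "$PWD"/archives/*.tgz; do
        [ -e "$archive" ] || continue   # no archives: the glob stays unexpanded
        conan cache restore "$archive" || return 1
    done
}

# Possible usage (component paths are hypothetical):
#   build_isolated comp_a components/a &
#   build_isolated comp_b components/b &
#   wait
#   merge_archives
```

Because every job writes to a different home, the jobs never touch the same cache concurrently; only the final restore step, which runs serially, writes to the shared one.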

nextsilicon-itay-bookstein commented 1 month ago

Since the build phases themselves are often already parallel, to me it seems that the main utility of this is to parallelize across the configure steps of all packages, which are always completely serial and therefore under-utilize the machine. I'm also interested, since I want to build a not-insignificant matrix of packages × profiles × build_type.

ViliusSutkus89 commented 1 month ago

Hello,

Not opening a new issue, since I assume the best place for my bug is this issue.

Parallel conan install runs always fail on an empty Conan package cache. Steps to reproduce:

rm -r ~/.conan2/p
conan install . --output-folder=build-a &
conan install . --output-folder=build-b &

I get errors like:

ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists.

I've set up a small test case to illustrate. Here's the CI run.

I care about this because in my workflow I need to run four conan installs, one for each of Android's architectures. As a workaround, I run one of them, wait for it to finish, and then run the remaining three in parallel. It just takes longer to complete. Since I have a workaround, this is more of a performance improvement than a correctness issue.
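The workaround described above could be scripted roughly as follows. This is a sketch under assumptions: the profile names and the build-<profile> output folders are hypothetical, and the concurrent runs may still collide if they share a dependency that the first run did not already place in the cache.

```shell
# Run the first profile alone so the cache gets populated,
# then run the remaining profiles concurrently.
install_warm_then_parallel() {
    first="$1"
    shift
    conan install . --profile="$first" --output-folder="build-$first" || return 1

    pids=""
    for profile in "$@"; do
        conan install . --profile="$profile" --output-folder="build-$profile" &
        pids="$pids $!"
    done

    # Wait for each background install and remember whether any of them failed.
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Possible usage (profile names are hypothetical):
#   install_warm_then_parallel android-armv8 android-armv7 android-x86 android-x86_64
```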

memsharded commented 1 month ago

Thanks for the feedback @ViliusSutkus89, and yes, this thread is the right place for it.

Another possible workaround is to fetch the recipes first (with something like one conan graph info ...), which will be faster than waiting for one full configuration to install, and then launch the builds for the different architectures in parallel. As the race condition seems to happen during the download of the zlib recipe, this will likely work better and be faster.
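A sketch of that suggestion, assuming hypothetical profile names and output folders. It runs a single graph computation using the first profile before launching anything in parallel; whether this is enough to avoid the race for every dependency is not confirmed here.

```shell
# Resolve the graph once so the recipes are fetched, then install in parallel.
prefetch_then_install() {
    # The graph report itself is not needed, only its side effect on the cache.
    conan graph info . --profile="$1" > /dev/null || return 1

    for profile in "$@"; do
        conan install . --profile="$profile" --output-folder="build-$profile" &
    done

    # A bare "wait" blocks until all background jobs end; it does not report
    # their exit codes, so failures have to be checked separately.
    wait
}

# Possible usage (profile names are hypothetical):
#   prefetch_then_install android-armv8 android-armv7
```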

ViliusSutkus89 commented 1 month ago

I tried graph generation. It could work, but there's a caveat: the different builds can't share any dependencies between them. Previously the error was:

ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists

With the graph pre-generated and with the revisions known, the concurrent write error is triggered a bit later:

ERROR: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497:b647c43bfefae3f830561ca202b6cfd935b56205#6b307bbcbae23635c4006543ffdbf3ef%1708593932.513' already exists

In my test case both builds use the same zlib, because both builds are actually for the same arch. But I've checked, and this also happens with different profiles that share the same arch-agnostic dependency.

memsharded commented 1 month ago

I see, yes, the conan graph info approach helps in the case of building the same graph in parallel, but there might still be issues with other, different parallel jobs. If they share the same cache, all the conan graph info + failed conan install for all profiles should be launched sequentially first, before launching the parallel jobs. But it seems this would result in a "dirtier" pipeline, so it sounds like your original approach would be better here.