memsharded opened this issue 8 months ago
I was told to comment here for posterity:

My team has a large repo consisting of a "core" package and hundreds of components that are packaged as individual Conan packages (each depending on the "core" package). Previously, in Conan 1, our build script that called `conan create` in parallel for all of the component packages worked flawlessly (build times of around 3 minutes on our CI boxes). After switching to Conan 2, however, the `conan create` command would often error due to cache concurrency issues. To work around this I have had to run our script serially, which massively inflates our build time to approximately 25 minutes. (These packages are header-only, so in theory we are I/O-bound anyway, but it appears we still benefit massively from parallelization.) Given that building multiple Conan packages in parallel worked in Conan 1, it would be great if cache concurrency could be improved in Conan 2.
Thanks for the feedback @darakelian, this is something that is planned, but it might take a little while.

In the meantime, you might want to try other approaches to speed things up. For example, you can run the parallelizable jobs in different `CONAN_HOME` folders and then accumulate the packages in a single home with `conan cache save`/`conan cache restore`. Extra downloads can be avoided by sharing the "download cache" folder, which is concurrency-safe and can be shared among the parallel Conan home folders. For building a dependency graph in parallel, `conan graph build-order` will give a list of lists of things that can be safely built in parallel too.
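The separate-home approach described above could be scripted roughly as follows. This is a minimal sketch, not an official recipe: the package names and tarball paths are hypothetical, the exact `conan cache save` pattern/flags may differ in your Conan version, and the `run` callable is injectable so the command plan can be inspected without a Conan installation.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

def conan_cmd(home, args, run=subprocess.check_call):
    """Run one conan command against a dedicated CONAN_HOME."""
    env = dict(os.environ, CONAN_HOME=home)
    run(["conan", *args], env=env)

def parallel_create(packages, main_home, run=subprocess.check_call):
    """Build each package in its own Conan home in parallel, then merge the
    results into main_home with `conan cache save` / `conan cache restore`."""
    def build(pkg):
        home = f"/tmp/conan-home-{pkg}"          # hypothetical per-job home
        conan_cmd(home, ["create", f"components/{pkg}"], run)
        tgz = f"/tmp/{pkg}.tgz"
        conan_cmd(home, ["cache", "save", "*", "--file", tgz], run)
        return tgz

    with ThreadPoolExecutor() as pool:           # creates run concurrently
        tarballs = list(pool.map(build, packages))

    for tgz in tarballs:                         # merging must stay serial
        conan_cmd(main_home, ["cache", "restore", tgz], run)
```

Only the final restore loop touches the shared home, so the concurrency-sensitive part of the cache is never written to from two processes at once.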
Since the build phases themselves are often already parallel, the main utility of this, as I see it, is parallelizing across the configure steps of all the packages, which today are completely serial and therefore under-utilize the machine. I'm also interested, as I want to build a not-insignificant matrix of packages × profiles × build_types.
Hello,
Not opening a new issue, since I assume this issue is the best place for my bug.

Parallel `conan install`s always fail on an empty Conan package cache. Steps to reproduce:
rm -r ~/.conan2/p
conan install . --output-folder=build-a &
conan install . --output-folder=build-b &
I get errors like:
ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists.
I've set up a small test case to illustrate. Here's the CI run.
I care about this because my workflow needs to run four `conan install`s - one for each of Android's architectures. As a workaround, I run one of them, wait for it to finish, and then run the three remaining in parallel; it just takes longer to complete. Since I have a workaround, this is more of a performance improvement than a correctness issue.
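The staged workaround described above (one install to warm the cache, then the rest in parallel) could be sketched like this. The profile names are made up for illustration, and `run` is injectable so the sequencing can be checked without Conan installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def install_cmd(arch):
    # hypothetical per-architecture Android profile names
    return ["conan", "install", ".", "-pr", f"android-{arch}",
            "--output-folder", f"build-{arch}"]

def staged_installs(archs, run=subprocess.check_call):
    """Warm the shared cache with one serial install, then run the
    remaining architectures in parallel."""
    run(install_cmd(archs[0]))                   # serial: populates the cache
    with ThreadPoolExecutor() as pool:           # the rest can now race safely
        list(pool.map(run, [install_cmd(a) for a in archs[1:]]))
```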
Thanks for the feedback @ViliusSutkus89; yes, this thread is the right place.
Another possible workaround is to first fetch the recipes (something like one `conan graph info ...`), which will be faster than having to wait for one full configuration, and then launch the builds for the different architectures in parallel. As the race condition seems to happen during the download of the `zlib` recipe, this will likely work better and be faster.
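The prefetch idea can be expressed as a small command plan: one `conan graph info` up front, then the per-profile installs, which may be launched in parallel once the first command has finished. The profile names and output folders are hypothetical.

```python
def prefetch_plan(profiles):
    """Yield the command sequence for the prefetch workaround: the first
    command fetches the recipes into the cache serially; every command
    after it can run in parallel."""
    yield ["conan", "graph", "info", ".", "-pr", profiles[0]]
    for p in profiles:
        yield ["conan", "install", ".", "-pr", p,
               "--output-folder", f"build-{p}"]
```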
I tried graph generation. It could work, but there is a caveat: the different builds can't share any dependencies between them. Previously the error was:
ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists
With the graph pre-generated and with the revisions known, the concurrent write error is triggered a bit later:
ERROR: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497:b647c43bfefae3f830561ca202b6cfd935b56205#6b307bbcbae23635c4006543ffdbf3ef%1708593932.513' already exists
In my test case both builds use the same zlib, because both builds are actually for the same arch. But I've checked, and this also happens with different profiles that share the same arch-agnostic dependency.
I see, yes, the `conan graph info` approach helps in the case of parallel builds of the same graph, but there might still be issues with other, different parallel jobs. If sharing the same cache, all the `conan graph info` + failed `conan install` invocations for all profiles should be launched sequentially before launching the parallelism. But it seems this would result in a "dirtier" pipeline, so it sounds like your original approach would be better here.
Hi, I noticed #11480, but I would like to do something similar yet slightly different, and I am now wondering if it is a good idea.

In my setup I would like to export many different recipes in parallel. In my first local tests I did not see any issues, but as concurrency problems tend to show up non-reproducibly, I would like to know if something like this (pseudocode) is officially possible without potential concurrency issues (latest Conan version):
import subprocess
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as pool:
    pool.map(lambda p: subprocess.check_call(["conan", "export", p]),
             set(recipe_paths))
Thanks in advance!
Hi @marlamb, https://github.com/conan-io/conan/issues/11480 was pre-Conan 2.0.

With the new Conan 2 PythonAPI, exporting recipes is very fast, so you might not even need concurrency at all. I'd recommend writing a custom command that iterates over the recipes and calls the export API. Please try that and open new tickets for any further questions about it.
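As a sketch of that recommendation, the loop itself could look like the following. `export_fn` is injected here: in a real Conan 2 custom command it would wrap the PythonAPI's export call, whose exact signature is not shown in this thread, so this only illustrates the sequential-iteration idea.

```python
def export_recipes(recipe_paths, export_fn):
    """Iterate unique recipe paths and export each one sequentially;
    with the fast Conan 2 export this is often quick enough that no
    parallelism (and no cache contention) is needed.
    export_fn stands in for the PythonAPI export call."""
    return [export_fn(path) for path in sorted(set(recipe_paths))]
```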
This ticket is to gather and centralize all the related tickets. There are 3 different aspects to making the cache concurrent:

- Package concurrency
- `conan config install` and other commands that can change the Conan home configuration concurrently

Out of the scope:

Conan config install concurrent:

Package concurrency: