Global lock/mutex to prevent multiple MSBuild instances building the same project concurrently

KirillOsenkov commented 9 months ago

Suppose you have more than one MSBuild invocations (either both command line, or a build from an IDE such as VS that is concurrent with a command-line build).

Currently we have no defense mechanisms against concurrent builds originating from different processes.

Let's brainstorm about perhaps taking a global mutex around the proj file being built (at per-project granularity). It's not clear at all what granularity should be used (around a project, around a solution?). Also the same project can be built multiple times with different targets, global properties, etc. Some of these builds have no side-effects, but some of them do. Do we only lock around builds that have side-effects?

Need to be careful here to allow multiple MSBuild.exe nodes belonging to the same build invocation to bypass the locking.

I understand it's a very fuzzy feature request, and it's going to be hard to get right, but I feel like we need some long term investment in this space. Otherwise if someone accidentally starts a build from the IDE while a command-line build is running, we can run into corruption and race conditions that are very hard to diagnose.

cc @xoofx

rainersigwald commented 9 months ago

What concrete problems have you observed with the racing builds? I have seen transient file-in-use errors, which are resolved by a subsequent build. In my experience I haven't seen this be a problem that seems to need solving so I'd need some data to convince me otherwise.

baronfel commented 9 months ago

It's pretty common when you have multiple components that are all orchestrating builds to hit this situation IMO. Imagine a VSCode user that has:

Dev Kit for C#
Ionide for F#
some build/test tasks that drive MSBuild commands

All of those components are interacting with the build in different ways, and some at the same time! Because of the lack of a shared project system between the language extensions you can get this kind of stomping. And at any time a user can trigger a custom build step in MSBuild that can interact poorly with what the language servers are doing. I expect we will see this more in a VSCode world because of how simple it is to configure these out-of-band tasks compared to VS.

rainersigwald commented 9 months ago

Tell me more about the "stomping" though. Is it just "when you do a bunch of builds in parallel they can transiently fail and the fix is to do one at a time"? Because that doesn't seem worth a bunch of engineering effort to fix.

baronfel commented 9 months ago

Yeah, that's the failure mode. More broadly there's no shared central coordinator (though MSBuild Server could potentially serve this role going forward) and it causes some non-zero amount of user pain. I'm happy if the longer-term answer here is 'move clients over to MSBuild Server and consolidate access through that' though :)

KirillOsenkov commented 9 months ago

Let's keep this as a long term issue, I marked it as distant-future. Over the years I found myself wishing for this occasionally, but I thought the same thing you did: "lots of hard engineering for transient benefit" and didn't bother filing a bug.

However there's been just too many occasions where I'd wished for this, and I wish I filed this earlier and was using this issue to gather all that data. Definitely agree we need to gather data before committing any serious resources to this.

What tipped the scale yesterday was an external third-party editor competing with VS in terms of who does the build, and they could run concurrently without any mechanism to synchronize. If there was a way for the editor to wait until the IDE finishes the build (either design-time or real), it would help avoid various race conditions. I imagine if you open the same solution in both VS and VS Code C# Dev Kit you might run into the same set of issues.

Last time I wished for this feature a solution was open in the IDE, and I triggered a command-line build. Since the IDE is watching the file system for file changes, some files were changed on disk and that triggered a design-time build in the IDE. The design-time build ran concurrently with the real build and picked up some transient state before some generated files were written by the real build. It failed with an obscure symptom that took a while to diagnose.

Quite often I see people triggering a command-line build while the solution is open in the IDE, so the design-time build runs concurrently with the real build. Off the top of my head the problems I've seen here are:

generated .baml.cs files being deleted and recreated (resulting in broken WPF)
errors with missing references (System.Object not found)
generated AssemblyInfo.cs issues flip-flopping between debug and release (if Debug is open in VS and Release is built from command line)
NuGet restore issues (can't remember now!)

All of these were hard for me to investigate (especially since getting binlogs out of VS is an ordeal and you can't do that after the fact). Imagine regular users facing these types of issues.

KalleOlaviNiemitalo commented 9 months ago

generated AssemblyInfo.cs issues flip-flopping between debug and release (if Debug is open in VS and Release is built from command line)

Shouldn't separate configurations generate AssemblyInfo.cs files into separate directories anyway…? For the sake of incremental builds too, not just for parallel.

KirillOsenkov commented 9 months ago

My bad, AssemblyInfo.cs is already per-configuration. I must be misremembering. I think long back it was generated into a temp directory and shared across configurations?? I don't remember any more.

KirillOsenkov commented 9 months ago

I'm actually surprised that the 5 NuGet-related files such as project.assets.json, the dgspec.json, the cache, the props and the targets are not per-configuration. So if you have PackageReferences conditional on the configuration it seems like they may overwrite each other?

KalleOlaviNiemitalo commented 9 months ago

Right, $(Configuration) isn't supported in PackageReference conditions. It's documented at https://learn.microsoft.com/en-us/nuget/consume-packages/package-references-in-project-files#adding-a-packagereference-condition:

You can use a condition to control whether a package is included, where conditions can use any MSBuild variable or a variable defined in the targets or props file. However, at presently, only the TargetFramework variable is supported.

In my experience, NuGet seems to enumerate the TargetFrameworks and then get the PackageReference items for each TargetFramework. So the conditions can use custom MSBuild properties whose values depend on TargetFramework. A condition that depends on the version number of .NET SDK (NETCoreSdkVersion or NETCoreAppMaximumVersion) also mostly works but it makes package lock files not portable across SDK versions.

(For some development-dependency packages though, it is possible to work around the $(Configuration) restriction by referencing the package unconditionally but setting properties that affect what the props and targets of the package do in each configuration.)

KalleOlaviNiemitalo commented 9 months ago

I think long back it was generated into a temp directory and shared across configurations??

TargetFrameworkAttribute was written to temp before https://github.com/dotnet/msbuild/pull/5101.

timcassell commented 5 months ago

There is an issue in BenchmarkDotNet with building the project with multiple configurations in parallel. https://github.com/dotnet/BenchmarkDotNet/issues/2425

I wasn't able to pinpoint the cause, but it sounds like this could fix that.

elisauhura commented 1 month ago

I have a use case that also would be greatly improved by having some mechanism of global coordination. Here is a quick description:

We have a monorepo and we want it to be buildable by a sequence of tool restore, restore, and build for a series of design decisions.
Some of the internal projects are used to generate source files.
Those projects executables needs to be available before compilation of projects that depend on it.
Generation is managed with custom target files that invoke MSBuild and then the code generators.
We also need to support a solution file that builds the entire monorepo.

While this strategy worked fairly well with projects only targeting a single framework, supporting multiple frameworks lead race conditions. We should be able to coordinate tasks that manipulate shared resources with a named locks at the target level.

The locks should work across processes and have timeouts to prevent dangling locks from creating a frustrating experience to the user in case of failures.

This also allows for synchronization across multiple targets.

rainersigwald commented 1 month ago

While this strategy worked fairly well with projects only targeting a single framework, supporting multiple frameworks lead race conditions.

This should be fixable; specifically the source-generator projects should be referenced in such a way that they only build for the one relevant TF. Can you give an example of how it's going wrong for you @elisauhura?

elisauhura commented 1 month ago

Sorry for the formatting, I'm writing this from my phone.

Source generation in our project is done by running an executable invoked directly by ms build, we don't use source-generators (but this is in our backlog).

The ms build target runs before "BeforeBuild" and our main pain point is that with multiple frameworks, the .net 6 and the .net 8 build race to read the files used to generate.

We need access to properties generated from the package references tagged with GeneratePathProperty and make sure the target being executed for .net6 and .net8 do not race against each other. There is no difference in the generated files for both target frameworks.

rainersigwald commented 1 month ago

The ms build target runs before "BeforeBuild" and our main pain point is that with multiple frameworks, the .net 6 and the .net 8 build race to read the files used to generate.

We need access to properties generated from the package references tagged with GeneratePathProperty and make sure the target being executed for .net6 and .net8 do not race against each other. There is no difference in the generated files for both target frameworks.

This is what I'd like to explore. What's the race? Is it

a. building the generator tool b. Executing the generator tool, while reading project stuff c. Executing the generator tool, while writing output

?

elisauhura commented 1 month ago

Executing the generator tool to write the files @rainersigwald.

Reached the conclusion on my side to force the build of projects that target multiple frameworks to be executed sequentially.

KalleOlaviNiemitalo commented 1 month ago

For a similar generate-source-files scenario, I set up a separate project file that only runs the tool; other projects then use the MSBuild task to run that. The intention was that each task execution would use the same global property values, and MSBuild would build the project just once and reuse cached results for subsequent task executions. IIRC, this solved the file-in-use errors that we had had in multitargeting builds before. I don't remember whether I used Project/@TreatAsLocalProperty or MSBuild/@RemoveProperties or both.

I'm not sure whether this kind of thing will be compatible with graph build.

elisauhura commented 1 month ago

I'm experimenting today with migrating the source generators to IIncrementalSource generators. My conclusion is that with the current design of the toolchain they will provide a long term solution for my problem.

It is not viable to expect the build system to enable all possible scenarios and use cases, so a little refactoring seems valid on my side.

On the subject of the issue. I've prototyped a small program to simulate spinlocks with lock files and the results were acceptable as a mechanism to provide synchronization across different executions of MSBuild.

But it might make more sense to fingerprint projects so msbuild can generate warnings to the user suggesting there are multiple instances of msbuild running instead of having a proper lock system in place.

In that case the design can be simplified to something like:

write fingerprint file at beginning of build
validated the file at the end of build
if the fingerprint does not match, warn user about the possibility of concurrent builds in place.

rainersigwald commented 1 month ago

It is not viable to expect the build system to enable all possible scenarios and use cases, so a little refactoring seems valid on my side.

MSBuild should (and does, in many repos) handle this situation just fine; something in the details of your implementation is biting you.

The usual approaches are the separate-project approach that @KalleOlaviNiemitalo described, or writing distinct outputs from your generator, for instance to $(IntermediateOutputPath) or a subfolder.

elisauhura commented 1 month ago

MSBuild should (and does, in many repos) handle this situation just fine; something in the details of your implementation is biting you.

MSBuild works as expected from the discussion, my case is a combo of legacy source generators and using multiple target frameworks.

Adding the locks mechanism to ensure the target execution in each TF build does not compete with each other does remove the errors but the fault is on how my source generators work in the end.

Moving them to IIncrementalGenerators seems like a better approach since there will be no custom logic at the project level anymore. Most of the work is already finished and the generated files between the different targets will not compete as they are written to different folders by design.

rainersigwald commented 1 month ago

Moving to formal Roslyn source generators will give you a better IDE experience and is a reasonable move, don't let me talk you out of that. But I still think there's likely a small tweak to the existing tooling that would resolve your races.

elisauhura commented 1 month ago

Yes, the following approaches worked fairly well:

Moving the target to before dispatch to inner build.
Disabling parallel build.
Implementing a locking system.

Moving to the generators just solves the problem with more benefits 🙂.

dotnet / msbuild

Global lock/mutex to prevent multiple MSBuild instances building the same project concurrently #9462