Open Danielku15 opened 1 year ago
Good find. We should probably have some locking here at the layer-download-from-registry level. Using a temp file as a lock to ensure only one process is downloading is a well-known technique, and is possibly how we should handle this, since we can't control people publishing in parallel.
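For illustration, here is a minimal sketch of the lock-file idea in Python. It is not the SDK's code. The names `lock_file` and `ensure_downloaded`, the `.lock` suffix, and the timeout values are all hypothetical, and it assumes every process cooperates by using the same lock path.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(path, timeout=30.0, poll=0.05):
    """Hold an exclusive lock represented by the existence of `path`."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process at a time can succeed in creating the lock file.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release the lock by closing and deleting the lock file.
        os.close(fd)
        os.remove(path)


def ensure_downloaded(final_path, download):
    """Call `download(final_path)` only if no other process has done so."""
    with lock_file(final_path + ".lock"):
        # Re-check under the lock: another process may have finished the
        # download while we were waiting.
        if not os.path.exists(final_path):
            download(final_path)
    return final_path
```

One known weakness of this approach is that a process that crashes while holding the lock leaves a stale lock file behind, so a real implementation would need some recovery strategy for that case.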
On that note, I thought newer 7 and 8 SDKs made solution level publish an error?
Good point, I should actually call publish on the correct "own" project file currently being built and not just dotnet publish. If you look closely you will notice that I am triggering a custom target on the solution, which should then initiate a publish on each individual project. But the bug remains valid with this adaptation.
@Danielku15 @baronfel do you plan to work on this? Alternatively, I can look into it. Thanks.
I won't have time for a bit, so as far as I'm concerned it's up for grabs.
I didn't plan to focus on it directly. I would first need to check some references on how to work reliably with such lock files to avoid race conditions. If any of you folks have a good reference, I could see if I can quickly spin up a PR addressing this problem. It might take a few days, as things are quite busy on my side currently.
I think an easy way to coordinate this synchronization might be by a named mutex - the shared resources in question are
When we tried to download each layer/manifest, the MSBuild Node executing the Task would
@rainersigwald / @MichalPavlik thoughts here? Are there other patterns we should consider? We can't directly rely on MSBuild engine caching for synchronization because the net472 pathway would be in a console app, not in an MSBuild Task.
Problem description:
There seems to be a problem with downloading blobs from the registry when building container images for a solution that has multiple projects and publishing all of them in parallel.
Publishing container images for all projects at the same time will also trigger a parallel download of the base images and descriptors. As the download path is fixed to
<temp>\Containers\Content\<contenthash>.<extension>
all projects will try to access the same file at the same time, which leads to an exception like this:
This is very unfortunate because it means you cannot have multiple container-enabled projects in one solution and publish them in one go.
Example setup:
Our setup looks similar to this (I simplified it here; the real setup covers some more aspects):
Directory.Build.targets:
And we trigger the build like:
dotnet msbuild -target:ContainerPublish Solution.sln
The temp path is overridden in our CI system, so the download will always happen.
Potential fix: The download already happens to a random temp path: https://github.com/dotnet/sdk-container-builds/blob/54e9e6180f2360a9a10b38de3eb7cc3763e91152/Microsoft.NET.Build.Containers/Registry.cs#L239
This move operation likely needs some more intelligence: if the move fails due to a System.UnauthorizedAccessException, we could check again, as in line 217, whether the file already exists, and if it does we're happy. It may even need to check whether the file is already available for reading (it might be in the middle of being moved and not yet finished).
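To make the idea concrete, here is a rough sketch in Python rather than the SDK's C#. The function names `wait_until_readable` and `move_into_cache`, the retry count, and the delay are hypothetical. Python's `PermissionError` stands in for `System.UnauthorizedAccessException`, which is an assumption about how the failure would surface. It also assumes the file name is a content hash, so any copy that another process put in place is interchangeable with ours.

```python
import os
import time


def wait_until_readable(path, retries=20, delay=0.05):
    """Return True once `path` can be opened for reading, else False."""
    for _ in range(retries):
        try:
            with open(path, "rb") as f:
                f.read(1)
            return True
        except OSError:
            # Missing, locked, or still being moved: wait and try again.
            time.sleep(delay)
    return False


def move_into_cache(temp_path, final_path):
    """Move a finished download into place, tolerating a concurrent winner."""
    if os.path.exists(final_path):
        # Another process already provided the file; drop our copy.
        os.remove(temp_path)
        return final_path
    try:
        os.replace(temp_path, final_path)
    except PermissionError:
        # The destination is probably held by another process that is
        # moving the same content into place. Accept its copy if it
        # becomes readable; otherwise surface the original error.
        if not wait_until_readable(final_path):
            raise
        os.remove(temp_path)
    return final_path
```

This keeps the existing download-to-temp-path step unchanged and only makes the final move tolerant of losing the race, so it would not need any cross-process lock.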