Open joperator opened 9 months ago
@joperator, what does typelib ID {5477469e-83b1-11d2-8b49-00a0c9b7c9c4} resolve to on the problematic machine? Do any of the projects you're building write the file or the relevant registration?
Perhaps the image file is locked by an MSBuild process while another one is trying to find its resource section.
I've tried locking the .tlb and 1) its location makes it hard to lock it for writing, 2) locking it for reading does not seem to be causing the exception you're seeing.
According to the .ResolveComReference.cache file, the typelib ID resolves to C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscoree.tlb, which is also the same path that is in the registry. The projects don't write the file (date modified: 08.05.2021) or the relevant registration. The COM reference is only used to create an instance of the CorRuntimeHost coclass from the mscoree namespace to call the GetDefaultDomain method.
Another detail that might be relevant: The projects that are built are located on a different drive (D:) than the type library mscoree.tlb (C:).
It looks like ResolveComReference.Execute passes only the Exception.Message to the logging function, so the MSBUILDDIAGNOSTICS environment variable won't make it show the stack trace from which the exception was thrown.
Are multiple agents using the same TEMP directory in the same computer? Perhaps the interop assembly is generated there and parallel accesses cause a conflict.
All agents are running under the same user account, so I assume they are using the same TEMP directory. If they want to create an interop assembly in the same location at the same time, it's likely that they cause a conflict.
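One way to rule out a shared TEMP directory is to give each agent its own. The following is a minimal sketch, assuming the build is launched from a Python script as described in the issue. The helper name `isolated_temp_env` and the base directory are hypothetical. `AGENT_NAME` is assumed to be the environment variable that Azure Pipelines exposes for the agent name.

```python
import os
from pathlib import Path


def isolated_temp_env(agent_name, base_dir, environ=None):
    """Return a copy of the environment with TEMP/TMP set to a per-agent folder."""
    # Copy so the parent process environment is left untouched.
    env = dict(os.environ if environ is None else environ)
    temp_dir = Path(base_dir) / agent_name
    temp_dir.mkdir(parents=True, exist_ok=True)
    # Windows tools consult both variables, so set them to the same folder.
    env["TEMP"] = str(temp_dir)
    env["TMP"] = str(temp_dir)
    return env
```

The returned mapping could then be passed as `env=` to `subprocess.run` when starting MSBuild, for example `isolated_temp_env(os.environ.get("AGENT_NAME", "default"), r"D:\agent-temp")`.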
I believe that interop assemblies should be created in the per-project/config/TFM intermediate directory. Do you think you can configure the builds to produce binlogs (/bl) to analyze the builds next time this happens?
Sure, if it helps to analyze the issue, I'll give binlogs a try...
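A sketch of how the Python build script might add the /bl switch so that every build leaves a uniquely named binlog behind. The actual invocation is not shown in this thread, so the command line, the function name `msbuild_command`, and the log directory layout are assumptions.

```python
from datetime import datetime
from pathlib import Path


def msbuild_command(solution, log_dir, agent_name, now=None):
    """Build an MSBuild argument list that writes a per-agent, timestamped binlog."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    # Include the agent name so concurrent builds never share a log file.
    binlog = Path(log_dir) / f"{agent_name}-{stamp}.binlog"
    return ["msbuild", str(solution), f"/bl:{binlog}"]
```

Naming the file per agent and per run avoids introducing another shared file between the concurrent builds, and makes it easy to pick out the binlog of a failed run afterwards.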
@ladipro
I now have a binlog from a failed build. For privacy reasons, I had to copy the subtree of the failed ResolveComReferences target and replace all private information. The failing project that has the COM reference to mscoree is now called MyProject in the MyProject.log.
Thank you. The error is thrown after all the log lines of the form
Processing COM reference "mscoree" from path "C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscoree.tlb". Type '<typename>' imported.
which confirms that the task was able to read the .tlb and that the error is really related to the interop assembly (the file written by the task). I guess this brings us back to Kalle's suspicion that multiple builds are racing to write the same file. Where is the interop assembly generated when the build succeeds?
Process Monitor could be helpful for logging any STATUS_SHARING_VIOLATION or STATUS_ACCESS_DENIED errors during the build.
Where is the interop assembly generated when the build succeeds?
When the build succeeds, the binlog contains the following lines instead:
...
Processing COM reference "mscoree" from path "C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscoree.tlb". Type 'TypeNameFactory' imported.
Resolved COM reference for item "mscoree": "obj\Release\net48\Interop.mscoree.dll".
Assuming this is really a project-relative directory and there's no way multiple agents can access the same path, I'm afraid this will require some instrumentation to figure out what's holding the file locked. Anti-virus software tends to be problematic so maybe one random idea is to try disabling it if present.
Assuming this is really a project-relative directory and there's no way multiple agents can access the same path, ...
Yes, I also assume that.
Anti-virus software tends to be problematic so maybe one random idea is to try disabling it if present.
No anti-virus software is present on the affected system, or I don't have sufficient permissions to see it, but I really don't think there is any anti-virus software other than the default Windows security tools installed. So follow Kalle's advice and give Process Monitor a try?
So follow Kalle's advice and give Process Monitor a try?
Yes, that's probably the easiest thing to do now.
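Once a Process Monitor trace has been captured, it can be exported to CSV and filtered for the relevant failures. The sketch below assumes the default export columns (including "Process Name", "Operation", "Path" and "Result") and that Process Monitor reports the two statuses mentioned above as "SHARING VIOLATION" and "ACCESS DENIED". The function name and the default path fragment are hypothetical.

```python
import csv

# Result strings assumed to correspond to STATUS_SHARING_VIOLATION
# and STATUS_ACCESS_DENIED in a Process Monitor CSV export.
SUSPICIOUS_RESULTS = {"SHARING VIOLATION", "ACCESS DENIED"}


def find_lock_conflicts(csv_file, path_fragment="Interop.mscoree.dll"):
    """Return (process, operation, path, result) for failed accesses to matching paths."""
    hits = []
    for row in csv.DictReader(csv_file):
        result = (row.get("Result") or "").upper()
        path = row.get("Path") or ""
        # Case-insensitive match because Windows paths are case-insensitive.
        if result in SUSPICIOUS_RESULTS and path_fragment.lower() in path.lower():
            hits.append((row.get("Process Name"), row.get("Operation"), path, result))
    return hits
```

When reading a real export, opening the file with `encoding="utf-8-sig"` and `newline=""` is probably necessary, since the export is assumed to start with a byte order mark.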
Issue Description
I have a solution with more than 40 .NET projects built for different target frameworks such as netstandard2.0, net472, net48, net6.0, and net7.0. One of the .csproj files contains the following reference:
The solution is built in an Azure DevOps Pipeline on self-hosted Windows agents with MSBuild. The Windows machine hosts multiple Azure Pipelines Agents. If only one or two of them are enabled, the build always succeeds. However, if all six of them are enabled and used concurrently, the build regularly fails with the following error:
The error message isn't useful because the specified mscoree image file hasn't changed between a successful and an unsuccessful build, so it should contain the required resource section.
Steps to Reproduce
The solution is built with the following invocation from a Python script:
Expected Behavior
The builds should succeed regardless of how many agents are enabled on the Windows machine.
Actual Behavior
With too many agents, e.g. six, enabled and used concurrently on the Windows machine, the build regularly fails with error MSB3303.
Analysis
The error message that the specified mscoree image file did not contain a resource section indicates that MSBuild selects the wrong image file or is unable to determine whether it contains a resource section when multiple builds are running concurrently on the same Windows machine. Perhaps the image file is locked by an MSBuild process while another one is trying to find its resource section. The resulting IOException could then be caught to prevent an abort and replaced with an error message stating that the image file did not contain a resource section, although it was just not possible to determine if this was the case.
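The suspected pattern can be illustrated with a small Python analogue. This is not MSBuild's code and does not claim to reflect its implementation. It only shows how catching an I/O failure and reporting a different condition hides the real cause, and how chaining the exception would keep it visible. All names here are hypothetical.

```python
class MissingResourceSection(Exception):
    """Hypothetical stand-in for the 'no resource section' build error."""


def load_masking(path):
    """Report every I/O failure as a missing resource section (cause is lost)."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError:
        # The real failure (for example a sharing violation) is discarded here.
        raise MissingResourceSection(
            f"{path} did not contain a resource section"
        ) from None


def load_preserving(path):
    """Report the I/O failure itself and keep the original exception attached."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        raise MissingResourceSection(f"could not read {path}: {exc}") from exc
```

With the second variant, the original `OSError` stays available as `__cause__`, so a log would show that the file could not be read rather than suggesting its contents were wrong.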
Versions & Configurations
Visual Studio Enterprise 2022 version 17.8.3 is installed on the Windows machine. The Windows edition is Windows Server 2022 Standard version 21H2.
dotnet is also installed: