Closed jachin84 closed 5 years ago
Can you capture diagnostic-level logs of both build scenarios and post the performance summary portions, especially Project Performance Summary
?
Is there a private mechanism by which I can share the logs?
Yes: file Visual Studio feedback and add attachments there. When I do so, I see this text:
Your attachments will be private to Microsoft. Your user name, title, and description will be public on Developer Community. For more on privacy, see the Microsoft Privacy Statement.
(and then come back here and link to the feedback item so we don't have to wait for routing, please)
I added binlogs for both the slow and fast run.
Thank you! From that log, I can see the problem: you're setting _PackageTempDir
as a global property for only a subset of projects. Those projects have ProjectReference
s that point to other projects, and when those references are followed, the _PackageTempDir
global property is inherited. The MSBuild engine treats projects with distinct global properties as separate things (this is so you can for instance build Debug and Release of the same project simultaneously), which causes many/most of your projects to be built multiple times.
There are a few ways to avoid this:
_PackageTempDir
within individual projects, rather than as a global property for those projects._PackageTempDir
property uniformly so that all projects have the same value for it.msbuild.exe /p:_PackageTempDir=whatever
ProjectReference
s, by setting the UndefineProperties
metadata to _PackageTempDir
_for all ProjectReference
items in your tree (easiest via an ItemDefinitionGroup
in a Directory.Build.props
).Also, side note: this build looks big enough that it would be worth spending some time making sure multiproc builds work for you; that could speed up builds significantly.
Thanks! I really appreciate your feedback. Let me see if I understand you correctly. By setting _PackageTempDir
via the AdditionalProperties
metadata those properties become global properties which are in turn inherited by any referenced projects.
I am building Web, Console, Service, Desktop and Test projects all in one go. Most of my project will at some point reference a Global project which I'll use as an example. If I build Web projects with a certain set of AdditionalProperties
and then Console projects with a slightly different set of AdditionalProperties
this will eventually lead to the Global project being built many times. I am correct in say this is referred to batching?
What I'm finding a little confusing is my Web projects always have a different set of global properties but it's only when the _PackageTempDir
property is set that things start slowing down.
All projects that get built get a standard set of metadata like so:
<ItemDefinitionGroup>
<ProjectsToBuild>
<Configuration>$(Configuration)</Configuration>
<Platform>$(Platform)</Platform>
<TargetToBuild>Build</TargetToBuild>
</ProjectsToBuild>
</ItemDefinitionGroup>
Web projects are update with a different Target/AdditionalProperties
.
<ItemGroup>
<ProjectsToBuild Include="@(WebProjectsToBuild)">
<TargetToBuild>PipelinePreDeployCopyAllFilesToOneFolder</TargetToBuild>
<AdditionalProperties>AutoParameterizationWebConfigConnectionStrings=false;</AdditionalProperties>
</ProjectsToBuild>
<ProjectsToBuild Include="@(ConsoleProjectToBuild);@(ServicesToBuild)" />
</ItemGroup>
In any case _PackageTempDir
is only relevant for web projects so I suppose these is no harm setting it globally. I've also got a mechanism to add it only to the web projects via an extension point that will import ProjectName.target
in my build folder automatically. I'll experiment with what works best.
What is UndefineProperties metadata? I can't find any documentation on it anywhere.
If I build Web projects with a certain set of
AdditionalProperties
and then Console projects with a slightly different set ofAdditionalProperties
this will eventually lead to the Global project being built many times. I am correct in say this is referred to batching?
No, this is distinct from batching, which is when a single target or task runs multiple times because of distinct metadata.
This is the MSBuild engine's deduplication of projects to build, which isn't super well documented but is mentioned here a bit.
The idea is that a project should only be built once--just because you have references from app1.csproj
and app2.csproj
to utilities.csproj
doesn't mean that utilities.dll
should be built twice. But that rule is a bit too strict; it doesn't take into account various types of project configuration, like Configuration
(Debug/Release), Platform
(x86/x64/anyCPU), or TargetFramework
(netstandard2.0/net472). On each of those, you might need to build the same project multiple times with different configurations. MSBuild's approach to this is to treat each (Project path, set of global properties) as distinct.
This approach works but causes the subtle problem you're seeing: if a project is built with distinct global properties but doesn't actually produce different output, it will be built more than once. In /m
builds, this is a race condition that can fail builds; in single-proc builds it usually just wastes time.
What I'm finding a little confusing is my Web projects always have a different set of global properties but it's only when the
_PackageTempDir
property is set that things start slowing down.
I think this is because you're setting this new metadatum to something unique per WebProject with _PackageTempDir=$(OutputRoot)\%(Filename)
. That means that instead of one additional build for each project (because of the AutoParameterizationWebConfigConnectionStrings
property), you now get N, starting in a tree from the N web projects.
Looking through the logs you shared, ShortBuild built a total of 827 projects, while LongBuild built 1658 projects. Both had 138 unique project paths.
What is UndefineProperties metadata? I can't find any documentation on it anywhere.
It is extremely underdocmented at the moment. I'm writing documentation but haven't finished. Current draft is at https://gist.github.com/rainersigwald/1fbd21a24a41b9b7c6204cb9cfcbb1cf.
Ah of course, I forgot about the %(Filename)
at the end. That’s a really good explanation, thank you.
This also explain why I’ve not been able to get parallel builds to work consistently. Is there any way in the logs to easily identify the duplicate building as a result of deduplication see projects as distinct?
Is there any way in the logs to easily identify the duplicate building as a result of deduplication see projects as distinct?
It's fairly difficult in the standard text logs. I have a logger that can help: https://github.com/rainersigwald/ParallelBuildDebuggingLogger
In its output, lines that say Reentering project
are deduplicated, while lines that say Project {} built by project
are new instances.
This logger is very useful to me, the person who wrote it. If you try it, I'd appreciate feedback; I know it's not intuitive to use and entirely undocumented.
Nice idea. I modified it a bit to produce a file I could drop into Graphziv. Any vertices with more than one edge are project being build twice. I think...
Very surprising how many projects are built a couple of times.
I’m starting to think that using a single msbuild task do build web/console/service/desktop projects just isn’t the way to go. There are too many properties that differ. I’m trying to think of a way to pass properties to the projects without making them global.
Ok so this seems a little odd. I am using the following in a Directory.Build.props which fails every time because of concurrency issues.
<ItemDefinitionGroup>
<ProjectReference>
<UndefineProperties>
_PackageTempDir;AutoParameterizationWebConfigConnectionStrings
</UndefineProperties>
</ProjectReference>
</ItemDefinitionGroup>
If I do the following it work:
<ItemDefinitionGroup>
<ProjectReference>
<UndefineProperties>_PackageTempDir;AutoParameterizationWebConfigConnectionStrings</UndefineProperties>
</ProjectReference>
</ItemDefinitionGroup>
Not the only difference is splitting UndefineProperties
on different lines.
Not the only difference is splitting
UndefineProperties
on different lines.
Nice find! I filed Microsoft/msbuild#4014 to track.
I’m trying to think of a way to pass properties to the projects without making them global.
Do you really need to pass them? The usual MSBuild approach to this sort of problem is either:
Of course if you opt for the latter you have to be super careful about global property inheritance again.
Thanks again for your help. Can you elaborate at all? I suspect there is a large gap in my understanding when it comes to building many projects at once.
Recompute them identically in each project
What do you mean by recompute? I am assuming you mean set the required properties in each project directly avoiding the need to use Properties/AdditionalProperties that results in global properties.
Compute them in a single project and ask for the results via MSBuild task in other projects.
Again not really sure what you mean here?
Feel free to close this issue. I'm still curious about the two questions above if you have a moment to provide more details?
Recompute them identically in each project
What do you mean by recompute? I am assuming you mean set the required properties in each project directly avoiding the need to use Properties/AdditionalProperties that results in global properties.
Yes, that's what I mean. If you set the properties in a common include file (maybe a Directory.Build.props
), you can use the same definition for every project without using any global properties.
Things that may help with this approach:
$(MSBuildThisPathDirectory)
and $([MSBuild]::GetPathOfFileAbove())
can help anchor relative paths to one place, regardless of Compute them in a single project and ask for the results via MSBuild task in other projects.
Again not really sure what you mean here?
This is much more complicated, but I'm thinking something along the lines of
<Target Name="GetCentralProperties" BeforeTargets="BeforeBuild">
<MSBuild Project="CentralComputationLocation.proj"
Targets="ExpensiveCalculation">
<Output TaskParameter="TargetOutputs"
PropertyName="CentrallyComputedProperty" />
</MSBuild>
</Target>
I'm not really sure where is the best place to ask this question. Feel free to close this issue if this is not the appropriate place.
I have a proj file with something like this:
To my surprise specifying _PackageTempDir or PackageLocation causes the build to run much slower. On my dev machine (which is very, very fast) it takes roughly 45 sec longer. On our build server, which is quite a bit slower, it takes almost 5 minutes longer. Surely specifying an alternative PackageLocation should not add such a delay? Any ideas?