dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild

Lack of tooling or documentation on how to optimize build of 3000 F# projects, where cpu is not saturated #5516

Open michalmalecki opened 4 years ago

michalmalecki commented 4 years ago

Creating this issue as I was not able to find an answer or get a response on StackOverflow. I'm building 3000 F# projects in our CI, from scratch, on TeamCity. The Perflog tab shows that during the 2 hours the build lasts, the CPU is saturated only half of the time. This happens with VS2017 on Windows Server. I'm not after optimizing CPU usage as such, but rather understanding what the MSBuild agents are waiting for and whether I could optimize the dependency chain or something else. I tried enabling binlog, but MSBuild then hangs and I can't access the build agents. I tried sorting projects by number of dependants, so that we build core projects first (I have not tried the -graph option yet). F# compiles files sequentially, so it may be more sensitive to this problem than C# or C++. Have I missed docs or tools that would help me optimize my build? Expected: a report and hints on what is causing delays in the build.

benvillalobos commented 4 years ago

This seems like an area where build logs would help the most. Here are some options off the top of my head.

If you have access to the machines running your CI, examining the node communication traces might help.

To use as much CPU as possible, pass the -m:<cpu_count> switch. If -m is passed without a value, MSBuild uses up to the number of processors on the machine. Perhaps this will allow you to complete a build and capture build logs?

benvillalobos commented 4 years ago

Another idea (which may be the most useful) is to take a look at the node utilization graph. It's part of the diagnostic output when building. Try passing -detailedsummary or -ds on the command line. https://docs.microsoft.com/en-us/visualstudio/msbuild/msbuild-command-line-reference?view=vs-2019#switches

Look for long sections where many nodes are idle (marked with x).

michalmalecki commented 4 years ago

Thank you very much for the answers. Just a little more context: we already run with /m. I have experimented with different loggers and settled on ConsoleLogger redirected to a file; there was no measurable benefit from the distributed file logger. For -ds I would need a little help reading it. I see 32 columns (one per available core, I guess) and an average utilization of 83%. What can I do if I see a particular time slot where most of the nodes are idle (have Xs)? How is it actionable? I will try to use MSBUILDDEBUGPATH, but it will take me some time. Looking at an example log after compiling 3 simple projects, does it provide info on nodes idling while they wait for their dependencies to build?

benvillalobos commented 4 years ago

After talking with the team, it was made clear that looking at the node communication traces wouldn't actually be super useful here. Apologies for the red herring!

I found an article that I had forgotten to link here: https://devblogs.microsoft.com/visualstudio/msbuild-4-detailed-build-summary/

Here's what the article suggests doing with these findings.

If you look at your graph and you see one node doing work while no others are and the total duration of that period is long, then that is an indication you have serialization in your build and it may be worth looking at whether that request really should be that long – can it be split up into smaller chunks and have other requests refer to it piecemeal? Can the project itself be made to build faster using better tools? Is the request doing something unexpected? Another thing you can experiment with when trying to tune your builds is changing the multi-proc node count limit

Let us know if this helps!
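As a rough illustration of the advice quoted above, the sketch below scans a simplified node utilization timeline for long stretches where at most one node is busy. The input shape (rows of `(seconds, cells)` with `"x"` marking an idle node) and the names `serialized_stretches`, `max_busy`, and `min_duration` are assumptions made for this example; the real -ds output would first have to be parsed into this shape.

```python
from typing import List, Tuple

# Hypothetical, simplified row: (seconds since build start, one cell per node).
# A cell of "x" means the node is idle; anything else means it is busy.
Row = Tuple[float, List[str]]


def serialized_stretches(
    rows: List[Row], max_busy: int = 1, min_duration: float = 60.0
) -> List[Tuple[float, float]]:
    """Return (start, end) time ranges where at most `max_busy` nodes are busy
    for at least `min_duration` seconds."""
    stretches = []
    start = None      # start time of the current low-utilization stretch
    prev_time = None  # timestamp of the last row seen
    for time, cells in rows:
        busy = sum(1 for cell in cells if cell != "x")
        if busy <= max_busy:
            if start is None:
                start = time
        else:
            # The stretch ends when utilization goes back up.
            if start is not None and time - start >= min_duration:
                stretches.append((start, time))
            start = None
        prev_time = time
    # Close a stretch that runs until the last row.
    if start is not None and prev_time - start >= min_duration:
        stretches.append((start, prev_time))
    return stretches


timeline = [
    (0, ["1", "2", "3"]),
    (10, ["4", "x", "x"]),
    (100, ["4", "x", "x"]),
    (200, ["5", "6", "x"]),
    (210, ["x", "x", "x"]),
]
print(serialized_stretches(timeline))
```

Each reported range is a candidate for the questions in the article: which request is running alone there, and can it be split up or made faster?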

michalmalecki commented 4 years ago

Thanks a lot, I started looking at the -ds output and found one project that is a bottleneck. I'm experimenting to see whether changing the order of the projects that I provide to MSBuild can hint it to prioritize the hot path over other projects. I would imagine a graph build could do something like this in the future.

michalmalecki commented 4 years ago

I didn't have as much time as I wanted, but I have already run one experiment, which failed. Could you or somebody from the product team validate whether it's worth further investigation? I found out that one project, say "S", is blocking a number of projects from building, and is itself built late in the process (around minute 36 out of 40). It has around 12 dependencies, but if it's built alone it completes in much less than 36 minutes. My theory is that a lot of other projects, which may be leaf projects, are built first, pushing "S" to the end of the queue. Can anything be done to improve such a situation? My experiment: I sorted the projects that are passed to MSBuild by the number of their dependants. This way leaf projects should be built last. However, I didn't see any impact on time or CPU utilization. I still need to check the actual order in which the projects were built. Do you think such an approach is worth pursuing?
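For readers who want to reproduce the experiment described above, here is a minimal sketch of sorting projects by their number of dependants. It assumes the project references have already been extracted into a dictionary and that the reference graph is acyclic; the function name `sort_by_dependants`, the choice to count transitive dependants, and the sample project names are hypothetical, not taken from the actual build.

```python
from collections import defaultdict
from typing import Dict, List, Set


def sort_by_dependants(references: Dict[str, List[str]]) -> List[str]:
    """Order projects so that those with the most (transitive) dependants come first.

    `references` maps each project to the projects it references.
    Assumes the reference graph has no cycles.
    """
    # Invert the graph: for each project, which projects reference it directly?
    dependants: Dict[str, Set[str]] = defaultdict(set)
    projects = set(references)
    for project, refs in references.items():
        for ref in refs:
            dependants[ref].add(project)
            projects.add(ref)

    memo: Dict[str, Set[str]] = {}

    def transitive(project: str) -> Set[str]:
        # All projects that directly or indirectly depend on `project`.
        if project not in memo:
            result: Set[str] = set()
            for dep in dependants[project]:
                result.add(dep)
                result |= transitive(dep)
            memo[project] = result
        return memo[project]

    # Most dependants first; ties broken by name for a stable order.
    return sorted(projects, key=lambda p: (-len(transitive(p)), p))


sample = {
    "App": ["S", "Core"],
    "Tests": ["S"],
    "S": ["Core"],
    "Core": [],
    "Leaf": [],
}
print(sort_by_dependants(sample))
```

In this sample "Core" and "S" sort ahead of the leaf projects. Whether MSBuild's scheduler honors such an ordering is exactly the open question in this thread, so treat the result as an input to experiment with rather than a guaranteed speedup.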

benvillalobos commented 4 years ago

I am curious, how did you sort these projects such that they would be built in that order?

You're certainly on the right track. A logical next step would be to make sure your projects are actually building in the order you want. I believe our docs on Project References and extending the build process are relevant here.

michalmalecki commented 4 years ago

Hi, I'm manipulating (sorting) the list of projects that I'm passing to the MSBuild task, i.e.

A simple test at home (but with VS 2019 instead of 2017) shows that MSBuild builds projects in the order in which I pass them in the Projects attribute.