dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Clean build takes too long #33510

Open ManickaP opened 4 years ago

ManickaP commented 4 years ago

On current master (https://github.com/dotnet/runtime/commit/7f52377bbcbb90e7b104e97505866320accae060) on Windows 10.0.19546 prerelease 200110-1443, SurfaceBook 2 13'' (Surface_Book_1832), it took 38 minutes to run a clean build (build -subsetCategory coreclr-libraries -runtimeConfiguration Release). The task itself reports only Time Elapsed 00:31:22.36; my guess is that the additional time is spent on restoring the build tools.

I ran the same on my personal Linux machine (PC not a notebook; Manjaro 19.0.2 Kyria kernel x86_64 Linux 5.5.7-1; Intel Core i7-4771 @ 8x 3.9GHz) and it took 19 minutes, reported 15:44:45.

Note that my PC is ~6yo, mid-level budget, custom build machine, nothing fancy.

I can share binlogs from both runs, but they're too big to attach here.

ViktorHofer commented 4 years ago

Some observations:

Here's a picture of our build graph:

image

cc @dotnet/runtime-infrastructure

janvorli commented 4 years ago

The build time is highly dependent on the number of CPU cores. I wonder if your SurfaceBook has fewer cores than your Linux machine. Alternatively, it might be that the various recent build script changes have broken the build parallelism somehow.

janvorli commented 4 years ago

It would also be interesting to try building coreclr using src/coreclr/build.sh while it still exists, and libraries using libraries.sh in the root folder, to see if there is any difference and which of these dominates the build time.

ManickaP commented 4 years ago

In the past, the times have been pretty bad as well, though they seem to get worse rather than better. I do concur that HW might play a role in this (SB: CPU i7-8650U CPU @ 1.90GHz 4/8 cores; PC: CPU i7-4771 @ 3.9GHz 4/8 cores with a proper fan in a big case).

I'll try the separate builds as well, that's a good idea.

ManickaP commented 4 years ago

Build times in minutes:

| Machine | CPU | Memory | Root build | 2nd Root build | CoreCLR | Libraries | Note |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Windows SB2 | i7-8650U @1.9GHz 4/8 cores | 16GB | 38 | 14 | 22 | 10 | per partes 6 minutes faster |
| Linux PC | i7-4771 @3.9GHz 4/8 cores | 16GB | 19 | 5 | 17 | 5 | per partes 3 minutes slower |

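The Note column can be reproduced from the other columns. This is a minimal sketch, assuming "per partes" means building CoreCLR and Libraries as two separate builds and adding up their times. The dictionary layout and the function name are illustrative only.

```python
# Build times in minutes, copied from the table above.
builds = {
    "Windows SB2": {"root": 38, "coreclr": 22, "libraries": 10},
    "Linux PC": {"root": 19, "coreclr": 17, "libraries": 5},
}


def per_partes_delta(times):
    """Root build time minus the sum of the separate CoreCLR and Libraries builds.

    Positive: building in parts was faster. Negative: it was slower.
    """
    return times["root"] - (times["coreclr"] + times["libraries"])


for machine, times in builds.items():
    delta = per_partes_delta(times)
    verdict = "faster" if delta > 0 else "slower"
    print(f"{machine}: separate builds are {abs(delta)} minutes {verdict}")
```

On these numbers the separate builds save 6 minutes on the Surface Book and cost 3 extra minutes on the Linux PC, which matches the Note column.
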
jaredpar commented 4 years ago

@ManickaP could you add two more columns to the table above: CPU cores and RAM? Think that will add some context to the issue.

jashook commented 4 years ago

Based on the cpu alone this seems like a very unfair comparison.

jashook commented 4 years ago

My suggestion is to spin up two Azure VMs and build on them, so that the only difference is the OS. I can do that if you're interested.

ManickaP commented 4 years ago

@jashook I'm aware of that. My main point is: please don't make it any slower and if possible make it faster. SB is my main work machine, I have to live with those 38 minutes.

ViktorHofer commented 4 years ago

@ManickaP how often do you do a clean build and for what reasons? That might be another pivot to optimize.

ManickaP commented 4 years ago

It depends on what I'm doing; there are a few major flows I do:

The middle scenario is probably my most often used mode of working.

Gnbrkm41 commented 4 years ago

Bit of an unrelated question, but I'm curious whether the build process utilises all the logical threads, not just the physical cores. I remember seeing the CoreCLR build script report "Number of cores 6" when I have a CPU with 6 physical cores and 12 threads (i7-8700). Is it just the log that shows 6 cores while the build process utilises all 12 threads, or does the build script intentionally use only the number of physical cores?

I am not really an expert on this, but it makes me think that utilising the other 6 logical threads would speed up the process. Is there a reason behind this, if this is the case?
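To make the distinction between logical threads and physical cores concrete, here is a small sketch. It only shows what Python's standard library reports on the current machine and says nothing about how the CoreCLR build script counts cores. The two-threads-per-core factor is an assumption taken from the i7-8700 example (6 cores, 12 threads), not something queried from the hardware.

```python
import os

# os.cpu_count() reports logical CPUs (hardware threads), not physical cores.
# It can return None when the count cannot be determined, hence the fallback.
logical = os.cpu_count() or 1

# Assumption for illustration only: 2 hardware threads per physical core,
# as on the i7-8700 mentioned above. Real machines may differ.
ASSUMED_THREADS_PER_CORE = 2
estimated_physical = max(1, logical // ASSUMED_THREADS_PER_CORE)

print(f"logical CPUs: {logical}")
print(f"estimated physical cores: {estimated_physical}")
```

A tool that sizes its job count from the physical estimate would start half as many jobs as one that uses the logical count, which is the difference the question is about.
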

jashook commented 4 years ago

@jkoritzinsky can explain

AndyAyersMS commented 4 years ago

This affects me too. I am also mainly using a surface book. A few thoughts:

ViktorHofer commented 4 years ago

> builds generally thrash my machine and make it nearly unusable, would be good if there was some way in the build script to have the build run with lower priority.

@jkoritzinsky that's what we talked about yesterday. You mentioned it's possible and we could make the fanning out conditional?

ViktorHofer commented 4 years ago

> incrementally rebuilding after a merge often fails; suspect we are missing dependencies somewhere, but haven't bothered trying to figure out why. So I often end up needing to do full rebuilds.

That's because of the way Arcade builds the passed in $(ProjectToBuild) items. We should set the StopOnFirstFailure property in the msbuild task in Arcade via an extension point: https://github.com/dotnet/arcade/blob/master/src/Microsoft.DotNet.Arcade.Sdk/tools/Build.proj#L249. I think @akoeplinger mentioned the same thing some weeks ago.

jkoritzinsky commented 4 years ago

I don't know exactly why we parallelize CL on number of cores instead of threads (that was initially done before my time) but from my experiments it got us the fastest builds doing a combination of that + max MSBuild parallelization instead of just maxing out CL parallelization across all threads.

Re making the fanning out conditional: Theoretically it's possible. We'd just have to add switches to do it and implement it in the build scripts. I don't see any technical reason we wouldn't be able to implement it.

jashook commented 4 years ago

Build command

./build.sh -subsetCategory coreclr-libraries -runtimeConfiguration Release
.\build.cmd -subsetCategory coreclr-libraries -runtimeConfiguration Release

Build times in minutes:

| Machine | CPU | Memory | Root build | Rebuild | CoreCLR | Libraries |
| --- | --- | --- | --- | --- | --- | --- |
| Azure F16_v2 (Windows Server 2019) | 2.4 GHz Intel Xeon® E5-2673 v3 16 core | 32 GB | 14.5 | 5 | 9.5 | 5.5 |
| Azure F16_v2 (ubuntu 18.04) | 2.4 GHz Intel Xeon® E5-2673 v3 16 core | 32 GB | 7.5 | 2 | 3.5 | 4.5 |

Observations

Coreclr build parallelizes significantly better on unix, and runs ~3x faster. My guess is that this is because we have to fan out and fan back for each subproject we build.
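The "~3x" figure can be checked against the table. A minimal sketch that computes the Windows-to-Ubuntu time ratio for each column; the variable and function names are illustrative only.

```python
# Build times in minutes, copied from the Azure F16_v2 table above.
windows = {"root": 14.5, "rebuild": 5, "coreclr": 9.5, "libraries": 5.5}
ubuntu = {"root": 7.5, "rebuild": 2, "coreclr": 3.5, "libraries": 4.5}


def speedup(step):
    """How many times faster the Ubuntu build was for the given step."""
    return windows[step] / ubuntu[step]


for step in windows:
    print(f"{step}: {speedup(step):.2f}x")
```

The CoreCLR ratio comes out at about 2.7x, while libraries differ by only about 1.2x, so most of the gap in the root build comes from the CoreCLR part.
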

ManickaP commented 4 years ago

Today's (https://github.com/dotnet/runtime/commit/214e2071b17905c42b76bfd66d0a9034205dd928) numbers are 45 minutes for a clean build (build -subset clr+libs -runtimeConfiguration Release). I guess it has something to do with the recent work on restore. 😢

Gnbrkm41 commented 4 years ago

I think it would be nice if the coreclr component can come pre-built or something (just like what we did in corefx pre-consolidation / live-live build change) for those who only work on the libraries component. Building the whole runtime just to change a couple lines on the library part feels like too heavy handed....

safern commented 4 years ago

You only need to build coreclr to run tests. To build and get yourself unblocked you can just build -subset coreclr.corelib.

> I think it would be nice if the coreclr component can come pre-built or something

One of the reasons why we merged the repos was to have a coherent, up-to-date version and avoid having to wait for an official build to publish a version of coreclr, manage breaks, etc. It is something we could consider as an opt-in option.

However I think we should strive towards building faster, we're doing investigations to improve that.

> Today's (214e207) numbers are 45 minutes for clean build build -subset clr+libs -runtimeConfiguration Release. I guess it has something to do with the recent work on restore.

@ManickaP I wonder if your nuget cache was not populated and you had to hit the network for all the packages. Also note that we removed some of the restore sources and are relying on Azure feeds, which have throttled us and have also been having some hiccups. It would be good to clean your repo and try again to see how it goes. Also, try building coreclr (or just corelib) first and then measuring the build time for libraries alone. In CI and on other machines, the restore change made the build faster because we changed the way we restore.

akoeplinger commented 4 years ago

Would be interesting if you could do a comparison with build -subset mono+libs -runtimeConfiguration Release, mono builds a lot faster than coreclr usually.

ManickaP commented 4 years ago

Yesterday, I did a clean build (after git clean -dfx).

I did some remeasuring today. It took 35 minutes today (clean build). The additional time can be traced back to package restore (as @safern suggested): the write timestamps of my nuget packages span about 10 minutes:

08.04.2020 17:41:01  .nuget\packages\microsoft.dotnet.arcade.sdk\5.0.0-beta.20201.2
...
08.04.2020 17:50:45  .nuget\packages\microsoft.windowsdesktop.app.runtime.win-x86\5.0.0-preview.4.20180.8
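The span between the first and last package write can be computed directly from those two timestamps. A minimal sketch, assuming the listing uses a day.month.year date format; since both entries fall on the same date, the result does not depend on that assumption.

```python
from datetime import datetime

# Assumed format of the listing above: day.month.year hours:minutes:seconds.
FORMAT = "%d.%m.%Y %H:%M:%S"

first = datetime.strptime("08.04.2020 17:41:01", FORMAT)
last = datetime.strptime("08.04.2020 17:50:45", FORMAT)

span = last - first
print(span)  # 0:09:44
```

That is 9 minutes 44 seconds, in line with the "about 10 minutes" estimate.
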

I also noticed that the initial download done by dotnet-install.ps1 (~180MB) takes about 8 minutes, while downloading the same file via a browser takes about 2 minutes. So I tried it with Invoke-WebRequest with $ProgressPreference = 'SilentlyContinue', which gave me around 1.5 minutes. So I think there might be room for improvement in the install script. I guess I could file an issue in https://github.com/dotnet/sdk.

Also note that the time spent in dotnet-install.ps1 is not included in the final report of the build (~25 minutes), even though it takes quite a large chunk of the whole process.

ManickaP commented 4 years ago

@akoeplinger the mono build is quite a bit faster! 👍 22 minutes including the initial download, 12 minutes reported by the build. Since the libraries themselves take around 10 minutes, it means that mono is built much, much faster than clr! I'd have to rerun it for just the runtimes, but my guesstimate is that the mono build is 5-7 times faster than clr.

safern commented 4 years ago

> So I tried it with Invoke-WebRequest with $ProgressPreference = 'SilentlyContinue' which gave me around 1.5 minutes. So I think there might be a room for improvement in the install script. I guess I could file an issue in https://github.com/dotnet/sdk.

I don't know what $ProgressPreference = 'SilentlyContinue' will do, but yeah, it might be worth opening an issue there to improve those times.

By the way, if you install the same version of dotnet SDK that we specify here on your machine and it is on the PATH, the build will skip the install step, so that is another workaround at the moment for you.

Does the workaround of just building corelib work for you whenever you don't want to run tests?

Whenever you want to run tests you can just run: build.cmd -subset clr.runtime+clr.nativecorelib+libs.pretest -runtimeConfiguration <Config> and that will set you up to run tests. libs.pretest will update the runtime in the testhost. That is also a way to work on coreclr or corelib without rebuilding all of libraries. You can change whatever, rebuild coreclr or corelib only and then run libs.pretest before running the tests.

Also, are you only building coreclr/mono and libraries, or are you building the whole repo without specifying any subsets?

ManickaP commented 4 years ago

$ProgressPreference = 'SilentlyContinue' means it will not report progress; reporting progress considerably slows down the download. It is already in use for the download of dotnet-install.ps1(105).

> if you install the same version of dotnet SDK that we specify here in your machine and it is on the PATH

I'll try that, that seems promising 👍

I do build only clr and libs, i.e.: build -subset clr+libs -runtimeConfiguration Release. I do run libraries tests all the time, but I'll try the subset you're suggesting (clr.runtime+clr.nativecorelib+libs.pretest).

Thanks a lot!