Open ManickaP opened 4 years ago
Some observations:
Here's a picture of our build graph:
cc @dotnet/runtime-infrastructure
The build time is highly dependent on the number of CPU cores. I wonder if your SurfaceBook has fewer cores than your Linux machine. Alternatively, it might be that the various recent build script changes have broken the build parallelism somehow.
It would also be interesting to try to build coreclr using the src/coreclr/build.sh while it still exists and libraries using the libraries.sh in the root folder to see if there is any difference and which of these dominate the build time.
In the past, the times have been pretty bad as well, though they seem to get worse rather than better. I do concur that HW might play a role in this (SB: CPU i7-8650U CPU @ 1.90GHz 4/8 cores; PC: CPU i7-4771 @ 3.9GHz 4/8 cores with a proper fan in a big case).
I'll try the separate builds as well, that's a good idea.
Machine | CPU | Memory | Root build (min) | 2nd root build (min) | CoreCLR (min) | Libraries (min) | Note |
---|---|---|---|---|---|---|---|
Windows SB2 | i7-8650U @1.9GHz 4/8 cores | 16GB | 38 | 14 | 22 | 10 | built separately: 6 minutes faster |
Linux PC | i7-4771 @3.9GHz 4/8 cores | 16GB | 19 | 5 | 17 | 5 | built separately: 3 minutes slower |
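The figures in the Note column follow from the other columns. A minimal sketch of that arithmetic, with the timings copied from the table and the dictionary layout and function name chosen only for illustration:

```python
# Build times in minutes, copied from the table above.
timings = {
    "Windows SB2": {"root": 38, "coreclr": 22, "libraries": 10},
    "Linux PC": {"root": 19, "coreclr": 17, "libraries": 5},
}


def split_build_saving(row):
    """Minutes saved by building CoreCLR and libraries separately.

    A negative value means the separate builds were slower than the root build.
    """
    return row["root"] - (row["coreclr"] + row["libraries"])


for machine, row in timings.items():
    print(machine, split_build_saving(row))
```

This gives 6 for the Windows machine and -3 for the Linux machine.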
@ManickaP could you add two more columns to the table above: CPU cores and RAM? I think that will add some context to the issue.
Based on the cpu alone this seems like a very unfair comparison.
My suggestion is to spin up two Azure VMs and build on them, so that the only difference is the OS. I can do that if you're interested.
@jashook I'm aware of that. My main point is: please don't make it any slower and if possible make it faster. SB is my main work machine, I have to live with those 38 minutes.
@ManickaP how often do you do a clean build and for what reasons? That might be another pivot to optimize.
It depends on what I'm doing; there are a few major flows I follow:
The middle scenario is probably my most often used mode of working.
Bit of an unrelated question, but I'm curious if the build process utilises all the logical threads, not just the physical cores. I remember seeing the CoreCLR build script reporting "Number of cores 6" when I have a CPU that has 6 physical cores and 12 threads (i7-8700). Is it just the log that shows 6 cores while the build process utilises all 12 threads, or does the build script intentionally use only the number of physical cores?
I am not really an expert on this but it makes me think that utilising those 6 logical threads would speed up the process. Is there a reason behind this, if this is the case?
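For anyone wanting to check what the operating system itself reports, here is a minimal sketch using only the Python standard library. It is not what the build scripts do; it only shows the logical CPU count, and the function name is illustrative. The standard library does not expose the physical core count.

```python
import os


def logical_cpu_count():
    """Number of logical CPUs the OS reports (hardware threads, not physical cores)."""
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


print("Logical CPUs:", logical_cpu_count())
```

On a 6-core CPU with hyper-threading enabled this would be expected to report 12, so a log line showing 6 suggests the script counts physical cores.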
@jkoritzinsky can explain
This affects me too. I am also mainly using a surface book. A few thoughts:
builds generally thrash my machine and make it nearly unusable, would be good if there was some way in the build script to have the build run with lower priority.
@jkoritzinsky that's what we talked about yesterday. You mentioned it's possible and we could make the fanning out conditional?
incrementally rebuilding after a merge often fails; suspect we are missing dependencies somewhere, but haven't bothered trying to figure out why. So I often end up needing to do full rebuilds.
That's because of the way Arcade builds the passed in `$(ProjectToBuild)` items. We should set the `StopOnFirstFailure` property in the msbuild task in Arcade via an extension point: https://github.com/dotnet/arcade/blob/master/src/Microsoft.DotNet.Arcade.Sdk/tools/Build.proj#L249. I think @akoeplinger mentioned the same thing some weeks ago.
I don't know exactly why we parallelize CL on number of cores instead of threads (that was initially done before my time) but from my experiments it got us the fastest builds doing a combination of that + max MSBuild parallelization instead of just maxing out CL parallelization across all threads.
Re making the fanning out conditional: Theoretically it's possible. We'd just have to add switches to do it and implement it in the build scripts. I don't see any technical reason we wouldn't be able to implement it.
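Until the build scripts grow such a switch, the lower-priority request raised earlier in the thread can be approximated from outside the build. The following is a rough sketch of a wrapper, not part of the repo; `run_low_priority` is a hypothetical helper name, and it relies only on standard `subprocess` and `os` facilities.

```python
import os
import subprocess
import sys


def run_low_priority(command):
    """Run a command at reduced CPU priority and return its exit code."""
    if sys.platform == "win32":
        # Windows: start the child in the below-normal priority class.
        completed = subprocess.run(
            command, creationflags=subprocess.BELOW_NORMAL_PRIORITY_CLASS
        )
    else:
        # POSIX: raise the niceness of the child just before it starts executing.
        completed = subprocess.run(command, preexec_fn=lambda: os.nice(10))
    return completed.returncode
```

A call such as `run_low_priority(["./build.sh", "-subset", "clr+libs", "-runtimeConfiguration", "Release"])` would start the build with a niceness of 10 on Linux. How much this helps machine responsiveness in practice has not been measured here.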
./build.sh -subsetCategory coreclr-libraries -runtimeConfiguration Release
.\build.cmd -subsetCategory coreclr-libraries -runtimeConfiguration Release
Machine | CPU | Memory | Root build (min) | Rebuild (min) | CoreCLR (min) | Libraries (min) |
---|---|---|---|---|---|---|
Azure F16_v2 (Windows Server 2019) | 2.4 GHz Intel Xeon® E5-2673 v3 16 core | 32 GB | 14.5 | 5 | 9.5 | 5.5 |
Azure F16_v2 (ubuntu 18.04) | 2.4 GHz Intel Xeon® E5-2673 v3 16 core | 32 GB | 7.5 | 2 | 3.5 | 4.5 |
Coreclr build parallelizes significantly better on unix, and runs ~3x faster. My guess is that this is because we have to fan out and fan back for each subproject we build.
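To put numbers on the comparison, here is a small sketch that computes the Windows-to-Ubuntu ratio for each column of the Azure table. The values are copied from the table; the dictionary layout and the `slowdown` helper are illustrative only.

```python
# Build times from the Azure F16_v2 table above.
azure = {
    "windows": {"root": 14.5, "rebuild": 5, "coreclr": 9.5, "libraries": 5.5},
    "ubuntu": {"root": 7.5, "rebuild": 2, "coreclr": 3.5, "libraries": 4.5},
}


def slowdown(column):
    """How many times longer the Windows build took than the Ubuntu build."""
    return round(azure["windows"][column] / azure["ubuntu"][column], 2)


for column in azure["windows"]:
    print(column, slowdown(column))
```

The CoreCLR column comes out at 2.71, which matches the "~3x" estimate, while libraries differ by only about 1.22x.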
Today's (https://github.com/dotnet/runtime/commit/214e2071b17905c42b76bfd66d0a9034205dd928) numbers are 45 minutes for a clean build with `build -subset clr+libs -runtimeConfiguration Release`. I guess it has something to do with the recent work on restore. 😢
I think it would be nice if the coreclr component could come pre-built or something (just like what we did in corefx pre-consolidation / live-live build change) for those who only work on the libraries component. Building the whole runtime just to change a couple of lines in the libraries part feels too heavy-handed.
You only need to build coreclr to run tests. To build and get yourself unblocked you can just `build -subset coreclr.corelib`.
> I think it would be nice if the coreclr component can come pre-built or something

One of the reasons why we merged the repos was to have a coherent, up-to-date version and avoid having to wait for an official build to publish a version of coreclr, manage breaks, etc. It is something we could consider as an opt-in option.
However, I think we should strive towards building faster; we're investigating how to improve that.
> Today's (214e207) numbers are 45 minutes for clean build `build -subset clr+libs -runtimeConfiguration Release`. I guess it has something to do with the recent work on restore.

@ManickaP I wonder if your NuGet cache was not populated and you had to hit the network for all the packages. Also note that we removed some of the restore sources and are relying on Azure feeds, which have throttled us and have also been having some hiccups. It would be good to clean your repo and try again to see how it goes. Also, try building coreclr (or just corelib) first and then measuring the build time for libraries alone. In CI and on other machines, the restore change made the build faster because we changed the way we restore.
Would be interesting if you could do a comparison with `build -subset mono+libs -runtimeConfiguration Release`; mono usually builds a lot faster than coreclr.
Yesterday, I did a clean build (after `git clean -dfx`).
I did some remeasuring today. It took 35 minutes today (clean build). The additional time can be traced via the write timestamps of my NuGet packages (as @safern suggested), which span about 10 minutes:
08.04.2020 17:41:01 .nuget\packages\microsoft.dotnet.arcade.sdk\5.0.0-beta.20201.2
...
08.04.2020 17:50:45 .nuget\packages\microsoft.windowsdesktop.app.runtime.win-x86\5.0.0-preview.4.20180.8
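The "about 10 minutes" figure can be checked directly from the first and last timestamps in the listing. A minimal sketch, assuming the timestamps use a day.month.year format (the format string is an assumption about the listing, not something stated in the thread):

```python
from datetime import datetime

# Assumed layout of the timestamps in the listing above: day.month.year hour:minute:second.
FORMAT = "%d.%m.%Y %H:%M:%S"

first = datetime.strptime("08.04.2020 17:41:01", FORMAT)
last = datetime.strptime("08.04.2020 17:50:45", FORMAT)

# Time between the first and the last package written by restore.
restore_span = last - first
print(restore_span)  # 0:09:44
```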
I also noticed that the initial download done by `dotnet-install.ps1` (~180MB) takes about 8 minutes, while downloading the same file via a browser takes about 2 minutes. So I tried it with `Invoke-WebRequest` with `$ProgressPreference = 'SilentlyContinue'`, which gave me around 1.5 minutes. So I think there might be room for improvement in the install script. I guess I could file an issue in https://github.com/dotnet/sdk.
Also note that the time spent in `dotnet-install.ps1` is not included in the final report of the build (~25 minutes), even though it takes quite a large chunk of the whole process.
@akoeplinger the mono build is much faster! 👍 22 minutes including the initial download, 12 minutes reported by the build. Since the libraries themselves take around 10 minutes, it means that mono is built much, much faster than clr! I'd have to rerun it for just the runtimes, but my guesstimate is that the mono build is 5-7 times faster than clr.
> So I tried it with `Invoke-WebRequest` with `$ProgressPreference = 'SilentlyContinue'` which gave me around 1.5 minutes. So I think there might be room for improvement in the install script. I guess I could file an issue in https://github.com/dotnet/sdk.

I don't know what `$ProgressPreference = 'SilentlyContinue'` will do, but yeah, it might be worth opening an issue there to improve those times.
By the way, if you install the same version of the dotnet SDK that we specify here on your machine and it is on the PATH, the build will skip the install step, so that is another workaround for you at the moment.
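A quick way to see whether this workaround applies is to compare the SDK found on the PATH with the version the repo expects. The sketch below is illustrative only: the required version has to be supplied by hand, both function names are hypothetical, and the exact-match comparison is an assumption about how the build decides to skip the install step.

```python
import shutil
import subprocess


def installed_sdk_version():
    """Return the version printed by `dotnet --version`, or None if dotnet is not on the PATH."""
    if shutil.which("dotnet") is None:
        return None
    result = subprocess.run(["dotnet", "--version"], capture_output=True, text=True)
    return result.stdout.strip() or None


def sdk_matches(required, installed):
    """True when an installed SDK was found and its version equals the required one.

    Assumption: an exact string match is what matters; the real check may differ.
    """
    return installed is not None and installed == required
```

For example, `sdk_matches(required_version, installed_sdk_version())` returns `False` when no `dotnet` executable is found, in which case the install step would still be expected to run.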
Does the workaround of just building corelib work for you whenever you don't want to run tests?
Whenever you want to run tests you can just run `build.cmd -subset clr.runtime+clr.nativecorelib+libs.pretest -runtimeConfiguration <Config>` and that will set you up to run tests. `libs.pretest` will update the runtime in the testhost. That is also a way to work on coreclr or corelib without rebuilding all of libraries. You can change whatever, rebuild coreclr or corelib only and then run `libs.pretest` before running the tests.
Also, are you only building coreclr/mono and libraries, or are you building the whole repo without specifying any subsets?
`$ProgressPreference = 'SilentlyContinue'` means it will not report progress, and reporting progress considerably slows down the download. It is already in use for the download dotnet-install.ps1(105).
> if you install the same version of dotnet SDK that we specify here in your machine and it is on the PATH

I'll try that, that seems promising :+1:
I do build only clr and libs, i.e. `build -subset clr+libs -runtimeConfiguration Release`. I do run libraries tests all the time, but I'll try the subset you're suggesting (`clr.runtime+clr.nativecorelib+libs.pretest`).
Thanks a lot!
On current master (https://github.com/dotnet/runtime/commit/7f52377bbcbb90e7b104e97505866320accae060) on Windows 10.0.19546 prerelease 200110-1443, SurfaceBook 2 13'' (Surface_Book_1832) it took 38 minutes to run a clean build (`build -subsetCategory coreclr-libraries -runtimeConfiguration Release`). The task itself reports only `Time Elapsed 00:31:22.36`; my guess is that the additional time is spent on restoring the build tools.
I ran the same on my personal Linux machine (PC not a notebook; Manjaro 19.0.2 Kyria kernel x86_64 Linux 5.5.7-1; Intel Core i7-4771 @ 8x 3.9GHz) and it took 19 minutes, reported 15:44:45.
Note that my PC is about 6 years old, a mid-range budget custom-built machine, nothing fancy.
I can share binlogs from both runs, but they're too big to attach here.