llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.25k stars 12.08k forks source link

Regression(r361953): thinlto links take 3-4x as long as before #41555

Open nico opened 5 years ago

nico commented 5 years ago
Bugzilla Link 42210
Version trunk
OS Linux
CC @huntergr-arm,@zmodem,@karka228,@rengolin,@sdesmalen-arm

Extended Description

In Chromium, our builds using thinlto went from ~90 min to timing out after 4h after r972373. This is because verifying all IR types is very expensive apparently, see https://crbug.com/972373 comment 15.

See comment 16 for a repro case, but I'm guessing you can repro by linking any larger executable with thinlto.

zmodem commented 5 years ago

I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @​Hans, do you agree?

Yes, no need to worry about the branch. If it's bad we can always take it out, but so far I don't think there's any reason to worry.

I'm on vacation at the moment so can't look at the patches, but the numbers in #​5 look great to me.

huntergr-arm commented 5 years ago

New patch up at https://reviews.llvm.org/D64079

sdesmalen-arm commented 5 years ago

Moving the checks closer to the object creation should help expose issues earlier when building with ASSERTS enabled. Given that your new patch eliminates the overhead more or less entirely and most checks are now guarded under asserts, I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @​Hans, do you agree? @​Graham, can you please share the updated patch on Phabricator when ready?

huntergr-arm commented 5 years ago

I have a new version, which removes the verifyTypes method entirely and pushes responsibility into the isValidElementType methods for ArrayType and StructType. The global variable checker is still there, but it didn't have to walk through hashes.

Measurements:

** Head (144fab9)

| Real | User | Sys | |---------+-----------+---------| | 709.160 | 4306.324 | 107.060 | | 705.186 | 4294.984 | 109.396 | | 696.405 | 4240.440 | 88.180 | | 696.662 | 4244.976 | 59.532 | | 698.887 | 4229.592 | 64.648 | |---------+-----------+---------| | 701.26 | 4263.2632 | 85.7632 | |---------+-----------+---------|

** With second scalable vector IR Type patch

| Real | User | Sys | |-----------+----------+----------| | 1320.040 | 4925.284 | 98.020 | | 1324.306 | 4919.544 | 104.372 | | 1321.246 | 4927.380 | 113.016 | | 1321.696 | 4907.808 | 104.688 | | 1318.806 | 4922.684 | 109.168 | |-----------+----------+----------| | 1321.2188 | 4920.54 | 105.8528 | |-----------+----------+----------|

** With global only verifier + isValidElementType checks

| Real | User | Sys | |----------+----------+----------| | 699.537 | 4261.944 | 105.528 | | 702.009 | 4269.108 | 113.48 | | 708.436 | 4299.484 | 107.1 | | 703.904 | 4274.224 | 105.572 | | 703.96 | 4281.44 | 107.128 | |----------+----------+----------| | 703.5692 | 4277.24 | 107.7616 | |----------+----------+----------|

I think that's taken care of the problem, but given that the 9.0 branch point is approaching it may be best to avoid recommitting before then.

zmodem commented 5 years ago

We still saw a 70% thinlto link time increase after r363658, so I've reverted again in r364543.

See https://bugs.chromium.org/p/chromium/issues/detail?id=978817#c14 for repro.

huntergr-arm commented 5 years ago

https://reviews.llvm.org/D63321 should resolve the problem.

Using the tarball from the chromium bugtracker, I get about 2m40s for the link without the scalable type patch, 6m30s with, and back down to around 2m45s with the updated patch.

There seems to be a slight additional overhead on the order of a few seconds (assuming it wasn't just noise on the build machine), but nowhere near the extra minutes incurred with the first patch. I wonder if the verifier is being run for each module loaded into LTO rather than just once at the end, and is therefore re-verifying the same types pulled in from headers.

rengolin commented 5 years ago

Hand picking one important comment from that thread:

perf report shows that most samples go here in the slow case:

66.70% 66.45% ld.lld lld [.] containsScalableVectorValueRecursive

It looks like the IR verifier runs even with asserts disabled, e.g. from loadModuleFromInput() in ThinLTOCodeGenerator.cpp.

nico commented 5 years ago

I reverted r361953 in r362913 for now.

nico commented 5 years ago

assigned to @huntergr-arm