Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Regression(r361953): thinlto links take 3-4x as long as before #41180

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR42210
Status NEW
Importance P enhancement
Reported by Nico Weber (nicolasweber@gmx.de)
Reported on 2019-06-09 12:24:48 -0700
Last modified on 2019-07-02 07:17:47 -0700
Version trunk
Hardware PC Linux
CC graham.hunter@arm.com, hans@chromium.org, karl-johan.karlsson@ericsson.com, llvm-bugs@lists.llvm.org, rengolin@gmail.com, sander.desmalen@arm.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

In Chromium, our builds using thinlto went from ~90 min to timing out after 4h after r972373. This is because verifying all IR types is very expensive apparently, see https://crbug.com/972373 comment 15.

See comment 16 for a repro case, but I'm guessing you can repro by linking any larger executable with thinlto.

Quuxplusone commented 5 years ago

I reverted r361953 in r362913 for now.

Quuxplusone commented 5 years ago

Hand picking one important comment from that thread:

perf report shows that most samples go here in the slow case:

66.70% 66.45% ld.lld lld [.] containsScalableVectorValueRecursive

It looks like the IR verifier runs even with asserts disabled, e.g. from loadModuleFromInput() in ThinLTOCodeGenerator.cpp.

Quuxplusone commented 5 years ago

https://reviews.llvm.org/D63321 should resolve the problem.

Using the tarball from the chromium bugtracker, I get about 2m40s for the link without the scalable type patch, 6m30s with, and back down to around 2m45s with the updated patch.

There seems to be a slight additional overhead on the order of a few seconds (assuming it wasn't just noise on the build machine), but nowhere near the extra minutes incurred with the first patch. I wonder if the verifier is being run for each module loaded into LTO rather than just once at the end, and is therefore re-verifying the same types pulled in from headers.

Quuxplusone commented 5 years ago

We still saw a 70% thinlto link time increase after r363658, so I've reverted again in r364543.

See https://bugs.chromium.org/p/chromium/issues/detail?id=978817#c14 for repro.

Quuxplusone commented 5 years ago
I have a new version, which removes the verifyTypes method entirely and pushes
responsibility into the isValidElementType methods for ArrayType and
StructType. The global variable checker is still there, but it didn't have to
walk through hashes.

Measurements:

** Head (144fab9)

|    Real |      User |     Sys |
|---------+-----------+---------|
| 709.160 |  4306.324 | 107.060 |
| 705.186 |  4294.984 | 109.396 |
| 696.405 |  4240.440 |  88.180 |
| 696.662 |  4244.976 |  59.532 |
| 698.887 |  4229.592 |  64.648 |
|---------+-----------+---------|
|  701.26 | 4263.2632 | 85.7632 |
|---------+-----------+---------|

** With second scalable vector IR Type patch

|      Real |     User |      Sys |
|-----------+----------+----------|
|  1320.040 | 4925.284 |   98.020 |
|  1324.306 | 4919.544 |  104.372 |
|  1321.246 | 4927.380 |  113.016 |
|  1321.696 | 4907.808 |  104.688 |
|  1318.806 | 4922.684 |  109.168 |
|-----------+----------+----------|
| 1321.2188 |  4920.54 | 105.8528 |
|-----------+----------+----------|

** With global only verifier + isValidElementType checks

|     Real |     User |      Sys |
|----------+----------+----------|
|  699.537 | 4261.944 |  105.528 |
|  702.009 | 4269.108 |   113.48 |
|  708.436 | 4299.484 |    107.1 |
|  703.904 | 4274.224 |  105.572 |
|   703.96 |  4281.44 |  107.128 |
|----------+----------+----------|
| 703.5692 |  4277.24 | 107.7616 |
|----------+----------+----------|

I think that's taken care of the problem, but given that the 9.0 branch point
is approaching it may be best to avoid recommitting before then.
Quuxplusone commented 5 years ago

Moving the checks closer to the object creation should help expose issues earlier when building with ASSERTS enabled. Given that your new patch eliminates the overhead more or less entirely and most checks are now guarded under asserts, I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @Hans, do you agree? @Graham, can you please share the updated patch on Phabricator when ready?

Quuxplusone commented 5 years ago

New patch up at https://reviews.llvm.org/D64079

Quuxplusone commented 5 years ago
(In reply to Sander de Smalen from comment #6)
> I don't see any special reasons to wait until after
> the branch point (also since the patch hasn't brought up any functional
> issues). @Hans, do you agree?

Yes, no need to worry about the branch. If it's bad we can always take it out,
but so far I don't think there's any reason to worry.

I'm on vacation at the moment so can't look at the patches, but the numbers in
#5 look great to me.