Open Quuxplusone opened 5 years ago
Hand picking one important comment from that thread:
perf report
shows that most samples go here in the slow case:
66.70% 66.45% ld.lld lld [.] containsScalableVectorValueRecursive
It looks like the IR verifier runs even with asserts disabled, e.g. from loadModuleFromInput() in ThinLTOCodeGenerator.cpp.
https://reviews.llvm.org/D63321 should resolve the problem.
Using the tarball from the chromium bugtracker, I get about 2m40s for the link without the scalable type patch, 6m30s with, and back down to around 2m45s with the updated patch.
There seems to be a slight additional overhead on the order of a few seconds (assuming it wasn't just noise on the build machine), but nowhere near the extra minutes incurred with the first patch. I wonder if the verifier is being run for each module loaded into LTO rather than just once at the end, and is therefore re-verifying the same types pulled in from headers.
We still saw a 70% thinlto link time increase after r363658, so I've reverted again in r364543.
See https://bugs.chromium.org/p/chromium/issues/detail?id=978817#c14 for repro.
I have a new version, which removes the verifyTypes method entirely and pushes
responsibility into the isValidElementType methods for ArrayType and
StructType. The global variable checker is still there, but it didn't have to
walk through hashes.
Measurements:
** Head (144fab9)
| Real | User | Sys |
|---------+-----------+---------|
| 709.160 | 4306.324 | 107.060 |
| 705.186 | 4294.984 | 109.396 |
| 696.405 | 4240.440 | 88.180 |
| 696.662 | 4244.976 | 59.532 |
| 698.887 | 4229.592 | 64.648 |
|---------+-----------+---------|
| 701.26 | 4263.2632 | 85.7632 |
|---------+-----------+---------|
** With second scalable vector IR Type patch
| Real | User | Sys |
|-----------+----------+----------|
| 1320.040 | 4925.284 | 98.020 |
| 1324.306 | 4919.544 | 104.372 |
| 1321.246 | 4927.380 | 113.016 |
| 1321.696 | 4907.808 | 104.688 |
| 1318.806 | 4922.684 | 109.168 |
|-----------+----------+----------|
| 1321.2188 | 4920.54 | 105.8528 |
|-----------+----------+----------|
** With global only verifier + isValidElementType checks
| Real | User | Sys |
|----------+----------+----------|
| 699.537 | 4261.944 | 105.528 |
| 702.009 | 4269.108 | 113.48 |
| 708.436 | 4299.484 | 107.1 |
| 703.904 | 4274.224 | 105.572 |
| 703.96 | 4281.44 | 107.128 |
|----------+----------+----------|
| 703.5692 | 4277.24 | 107.7616 |
|----------+----------+----------|
I think that's taken care of the problem, but given that the 9.0 branch point
is approaching it may be best to avoid recommitting before then.
Moving the checks closer to the object creation should help expose issues earlier when building with ASSERTS enabled. Given that your new patch eliminates the overhead more or less entirely and most checks are now guarded under asserts, I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @Hans, do you agree? @Graham, can you please share the updated patch on Phabricator when ready?
New patch up at https://reviews.llvm.org/D64079
(In reply to Sander de Smalen from comment #6)
> I don't see any special reasons to wait until after
> the branch point (also since the patch hasn't brought up any functional
> issues). @Hans, do you agree?
Yes, no need to worry about the branch. If it's bad we can always take it out,
but so far I don't think there's any reason to worry.
I'm on vacation at the moment so can't look at the patches, but the numbers in
#5 look great to me.
In Chromium, our builds using thinlto went from ~90 min to timing out after 4h after r972373. This is because verifying all IR types is very expensive apparently, see https://crbug.com/972373 comment 15.
See comment 16 for a repro case, but I'm guessing you can repro by linking any larger executable with thinlto.