Open nico opened 5 years ago
I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @Hans, do you agree?
Yes, no need to worry about the branch. If it's bad we can always take it out, but so far I don't think there's any reason to worry.
I'm on vacation at the moment so can't look at the patches, but the numbers in #5 look great to me.
New patch up at https://reviews.llvm.org/D64079
Moving the checks closer to the object creation should help expose issues earlier when building with ASSERTS enabled. Given that your new patch eliminates the overhead more or less entirely and most checks are now guarded under asserts, I don't see any special reasons to wait until after the branch point (also since the patch hasn't brought up any functional issues). @Hans, do you agree? @Graham, can you please share the updated patch on Phabricator when ready?
I have a new version, which removes the verifyTypes method entirely and pushes responsibility into the isValidElementType methods for ArrayType and StructType. The global variable checker is still there, but it didn't have to walk through hashes.
Measurements:
** Head (144fab9)
| Real | User | Sys | |---------+-----------+---------| | 709.160 | 4306.324 | 107.060 | | 705.186 | 4294.984 | 109.396 | | 696.405 | 4240.440 | 88.180 | | 696.662 | 4244.976 | 59.532 | | 698.887 | 4229.592 | 64.648 | |---------+-----------+---------| | 701.26 | 4263.2632 | 85.7632 | |---------+-----------+---------|
** With second scalable vector IR Type patch
| Real | User | Sys | |-----------+----------+----------| | 1320.040 | 4925.284 | 98.020 | | 1324.306 | 4919.544 | 104.372 | | 1321.246 | 4927.380 | 113.016 | | 1321.696 | 4907.808 | 104.688 | | 1318.806 | 4922.684 | 109.168 | |-----------+----------+----------| | 1321.2188 | 4920.54 | 105.8528 | |-----------+----------+----------|
** With global only verifier + isValidElementType checks
| Real | User | Sys | |----------+----------+----------| | 699.537 | 4261.944 | 105.528 | | 702.009 | 4269.108 | 113.48 | | 708.436 | 4299.484 | 107.1 | | 703.904 | 4274.224 | 105.572 | | 703.96 | 4281.44 | 107.128 | |----------+----------+----------| | 703.5692 | 4277.24 | 107.7616 | |----------+----------+----------|
I think that's taken care of the problem, but given that the 9.0 branch point is approaching it may be best to avoid recommitting before then.
We still saw a 70% thinlto link time increase after r363658, so I've reverted again in r364543.
See https://bugs.chromium.org/p/chromium/issues/detail?id=978817#c14 for repro.
https://reviews.llvm.org/D63321 should resolve the problem.
Using the tarball from the chromium bugtracker, I get about 2m40s for the link without the scalable type patch, 6m30s with, and back down to around 2m45s with the updated patch.
There seems to be a slight additional overhead on the order of a few seconds (assuming it wasn't just noise on the build machine), but nowhere near the extra minutes incurred with the first patch. I wonder if the verifier is being run for each module loaded into LTO rather than just once at the end, and is therefore re-verifying the same types pulled in from headers.
Hand picking one important comment from that thread:
perf report
shows that most samples go here in the slow case:
66.70% 66.45% ld.lld lld [.] containsScalableVectorValueRecursive
It looks like the IR verifier runs even with asserts disabled, e.g. from loadModuleFromInput() in ThinLTOCodeGenerator.cpp.
I reverted r361953 in r362913 for now.
assigned to @huntergr-arm
Extended Description
In Chromium, our builds using thinlto went from ~90 min to timing out after 4h after r972373. This is because verifying all IR types is very expensive apparently, see https://crbug.com/972373 comment 15.
See comment 16 for a repro case, but I'm guessing you can repro by linking any larger executable with thinlto.