Closed recursion-ninja closed 1 month ago
There is a build failure for GHC versions before 9.6 which seems related to UnboxedSums
(it's available since 8.2.1, but maybe something changed?).
Also, there is a docspec-related error, seems to happen on all Ubuntu builds.
I have isolated and corrected the allocation regression in rawPageLookup
that the PR contained. The excess allocation came from the "linear scanning" portion of the bisectPageToKey
function.
Previously, when the bench-marking showed identical allocations and comparable run-times, there was a defect in the bisectPageToKey
function. To correct the defect, an additional check was required to determine of the binary search ended at the last entry in the page. If so, the binary search result needed to be changed from "found closest key at index i
" to "key not found in page." This additional comparison work, which existed in the terminal "linear scan" portion of the binary search routine, was inexorably linked to the additional allocation.
I attempted to alter the code while retaining the linear scan portion of the binary search routine. After repeated an persistent failure to remove the additional allocation, I removed the linear scan segment of the binary search routine. To my pleasant surprise, the allocations decreased to the same or less than the allocations reported on main
!
Below is a benchmark table I collated describing the change in bytes allocated (B) and run-time measured (ns) between main
and this PR across recent versions of GHC. In the Δ column, a negative number represents an improvement when measuring the performance of this PR (relative to main
).
rawPageLookup |
GHC | main (B, ns) |
PR-#346 (B, ns) | Diif-Δ (B, ns) |
---|---|---|---|---|
missing |
9.10.1 | (544 , 101.50 ) |
(544 , 104.90 ) |
(± 0 , + 3.40 ) |
existing-head |
9.10.1 | (560 , 98.72 ) |
(560 , 101.70 ) |
(± 0 , + 2.98 ) |
existing-last |
9.10.1 | (640 , 96.91 ) |
(576 , 88.27 ) |
(-64 , - 8.64 ) |
missing |
9.8.2 | (544 , 102.90 ) |
(544 , 104.90 ) |
(± 0 , + 2.00 ) |
existing-head |
9.8.2 | (560 , 100.40 ) |
(560 , 103.00 ) |
(± 0 , + 2.60 ) |
existing-last |
9.8.2 | (640 , 98.92 ) |
(576 , 91.17 ) |
(-64 , - 7.75 ) |
missing |
9.6.6 | (544 , 104.20 ) |
(544 , 103.50 ) |
(± 0 , - 0.70 ) |
existing-head |
9.6.6 | (560 , 99.82 ) |
(560 , 90.30 ) |
(± 0 , - 9.52 ) |
existing-last |
9.6.6 | (640 , 97.21 ) |
(576 , 88.48 ) |
(-64 , - 9.73 ) |
missing |
9.4.8 | (544 , 102.90 ) |
(544 , 105.40 ) |
(± 0 , + 2.50 ) |
existing-head |
9.4.8 | (560 , 100.00 ) |
(560 , 103.10 ) |
(± 0 , + 3.10 ) |
existing-last |
9.4.8 | (640 , 99.06 ) |
(576 , 88.84 ) |
(-64 , -10.22 ) |
My reading of the tabulated measurements is that the PR, with the recent regression fix, now constitutes:
I believe that this resolves the last critical technical issue of the PR. I'm going to give the PR a final review to ensure that all comments have been addressed.
Description
First part of Work Product 12, adding the functionality to start
RunReaders
at an offset.The new
rawPageFindKey
function has the same performance characteristics asrawPageLookup
but facilitates starting theRunReaders
at an give page offset from the returned index.RunReader.new
now takes an optional offset as an argument.