Closed VladiKrapp-Arm closed 1 month ago
Very minor nit: the first patch should start at 0001, i.e. 0001-LTOpasses-add-loop-unroll.patch, to match the other folder and the likely output of git format-patch.
The placeholder patch was numbered 0 since it's not a real patch file.
@stuij, @dcandler, I have made the suggested changes. Any other issues?
Some workloads only simplify fully when a specific sequence of optimizations happens. This patch adds an extra full unrolling pass to help these cases on cores with branch predictors. It helps produce simplified loops, which can then be SROA'd, allowing further simplification that can be important for performance. The feature trades extra compile time for extra performance and is enabled by the opt flag 'extra-LTO-loop-unroll' (off by default).
Original patch by David Green (david.green@arm.com)
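For readers unfamiliar with the interaction described above, here is a minimal illustrative sketch in C. It is not taken from the patch, and the function name sum_scaled is hypothetical. While the loops are still loops, the local array tmp is indexed by a variable, so SROA generally cannot split it into scalars. Once both loops are fully unrolled, every index is a constant, which is the form in which SROA can typically replace the array with individual values and let later passes fold the arithmetic.

```c
/* Hypothetical example, not part of the patch. */
int sum_scaled(int x) {
    int tmp[4];                      /* small local aggregate */

    /* Variable index: the array must stay in memory as written. */
    for (int i = 0; i < 4; ++i)
        tmp[i] = x * (i + 1);

    int total = 0;
    for (int i = 0; i < 4; ++i)
        total += tmp[i];

    /* After full unrolling, the accesses become tmp[0]..tmp[3] with
       constant indices, so the array can be promoted to four scalars
       and the whole body can reduce to a single expression in x. */
    return total;
}
```

In a simple case like this the regular pipeline would likely already unroll the loops. The extra pass is presumably aimed at cases where the loop only becomes fully unrollable late, for example after cross-module inlining.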