Open PiJoules opened 1 month ago
cc @petrhosek
https://github.com/llvm/llvm-project/commit/2209e3fad62ac6f44232c854f71ae834d5da7434 is an experiment with outlining at the IR level (since that just seemed like the easiest option to implement). It reclaimed roughly half of the size increase.
https://reviews.llvm.org/D100704 is also a patch we can try resurrecting to get the machine outliner working for Thumb1.
On platforms that don't have target-specific lowering for fixed point intrinsics, the default lowering is to emit the appropriate shifts and arithmetic operations at each callsite (see https://github.com/llvm/llvm-project/blob/ddc3f2dd26c10b830d7137fc5f89049feec29033/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp#L10881 as an example). For baremetal devices that use fixed point as an alternative to soft floats (but also don't have native fixed point operations), this can lead to a significant text size increase since fixed point operations are expanded at each callsite. When using soft floats, operations like multiplication or division would just be a libcall.
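To make the per-callsite cost concrete, here is a minimal C sketch of roughly what the expanded form amounts to at the source level. It is an illustration only, not the actual SelectionDAG output: the `fx_t` type, the 15-bit scale, and the function names `scale_sample` and `blend` are all hypothetical. The point is that every multiplication repeats the same widen, multiply, and shift sequence.

```c
#include <stdint.h>

/* Hypothetical signed fixed-point type with 15 fractional bits,
 * stored in a 32-bit integer. Chosen only for illustration. */
typedef int32_t fx_t;
enum { FX_SCALE = 15 };

/* One multiplication: widen to 64 bits, multiply, shift back down.
 * Note: >> on a negative signed value is implementation-defined in C
 * (arithmetic shift on mainstream compilers). */
fx_t scale_sample(fx_t sample, fx_t gain) {
    return (fx_t)(((int64_t)sample * (int64_t)gain) >> FX_SCALE);
}

/* Two more multiplications, each repeating the same sequence inline.
 * Code size grows with the number of multiplication sites. */
fx_t blend(fx_t a, fx_t b, fx_t t) {
    fx_t one = (fx_t)1 << FX_SCALE;
    int64_t lhs = ((int64_t)a * (int64_t)(one - t)) >> FX_SCALE;
    int64_t rhs = ((int64_t)b * (int64_t)t) >> FX_SCALE;
    return (fx_t)(lhs + rhs);
}
```

On a 32-bit target without a widening multiply, each of these 64-bit multiplications can itself expand to several instructions, which is where the text growth comes from.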
One way we can prevent this per-callsite scaling is by deduping/outlining the actual fixed point operations. For example, a multiplication for a given fixed point type could instead be a call to some function in the same TU which does that multiplication (i.e. something akin to `__aeabi_fmul` but for each of the fixed point types). The compiler could emit this in the backend around target lowering and place it in a comdat section so we ensure only one copy of it exists after linking. An alternative is to add these symbols to llvm-libc and make these truly libcalls, but for fixed point types. If this route is taken, we'll need to flesh out the details further. This would be something enabled for `minsize`/`-Oz` builds.

A more difficult approach could be leveraging the outliner, but I'm not sure what stage in the pipeline the outliner runs.
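As a rough source-level sketch of the outlined form proposed above: the multiply sequence lives in one helper, and each callsite becomes a plain call. The names `fx_t`, `fx_mul_helper`, `apply_gain`, and `lerp` are hypothetical, as is the 15-bit scale. `__attribute__((noinline))` is a GCC/Clang extension used here only to keep the helper out of line. The real proposal would have the compiler emit the helper itself during lowering, which this sketch does not do.

```c
#include <stdint.h>

/* Hypothetical signed fixed-point type with 15 fractional bits. */
typedef int32_t fx_t;
enum { FX_SCALE = 15 };

/* Single out-of-line copy of the multiply sequence. This plays the role
 * that __aeabi_fmul plays for soft floats; the name is made up. */
__attribute__((noinline))
fx_t fx_mul_helper(fx_t a, fx_t b) {
    /* Widen, multiply, shift back down by the scale. */
    return (fx_t)(((int64_t)a * (int64_t)b) >> FX_SCALE);
}

/* Each callsite is now just a call, so its size no longer depends on
 * how large the expanded multiply is. */
fx_t apply_gain(fx_t sample, fx_t gain) {
    return fx_mul_helper(sample, gain);
}

fx_t lerp(fx_t a, fx_t b, fx_t t) {
    fx_t one = (fx_t)1 << FX_SCALE;
    return fx_mul_helper(a, one - t) + fx_mul_helper(b, t);
}
```

The tradeoff is the usual one for outlining: call overhead at each site in exchange for a single copy of the expanded sequence, which is why it would only make sense for size-optimized builds.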