Open jroelofs opened 1 month ago
cc @sdesmalen-arm, @rsandifo-arm, @aemerson for SME bits
cc @labrinea, as this is kind of FMV-adjacent
breadcrumb for myself: rdar://116990507
Currently, you can write something like the following. This dispatches as appropriate. It currently doesn't optimize well, but that's something we could fix. Given that, what benefit does a dedicated "overloading" feature provide?
void f_streaming() __arm_streaming;
void f_nonstreaming();
inline void f() __arm_streaming_compatible {
  if (__arm_in_streaming_mode())
    f_streaming();
  else
    f_nonstreaming();
}
(Also, I'm skeptical you'd want to use this for memcpy(); small memcpys only use streaming-compatible instructions anyway, so it's cheaper to check the size before checking whether streaming mode is enabled.)
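To make the ordering concrete, here is a rough sketch of a size-first memcpy. Everything in it is an assumption for illustration: `sketch_memcpy`, `SMALL_COPY_MAX`, the three `copy_*` helpers, and the 64-byte threshold are all made up, and `query_streaming_mode()` is a plain-C stand-in for `__arm_in_streaming_mode()` so the snippet builds without SME support. The helper bodies are placeholders, not tuned implementations.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for __arm_in_streaming_mode(); a real build would
   use the ACLE query. The counter only exists to show when it gets called. */
static int fake_streaming_mode = 0;
static int mode_queries = 0;
static int query_streaming_mode(void) {
    ++mode_queries;
    return fake_streaming_mode;
}

/* Hypothetical cutoff below which only streaming-compatible code is used. */
enum { SMALL_COPY_MAX = 64 };

/* Placeholder: a byte loop that is valid in either mode. */
static void *copy_small(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i];
    return dst;
}

/* Placeholders for the mode-specific large-copy paths. */
static void *copy_large_streaming(void *dst, const void *src, size_t n) {
    return memcpy(dst, src, n);
}
static void *copy_large_nonstreaming(void *dst, const void *src, size_t n) {
    return memcpy(dst, src, n);
}

void *sketch_memcpy(void *dst, const void *src, size_t n) {
    /* Cheap size check first: small copies never query the mode. */
    if (n <= SMALL_COPY_MAX)
        return copy_small(dst, src, n);
    /* Only large copies pay for the streaming-mode query. */
    if (query_streaming_mode())
        return copy_large_streaming(dst, src, n);
    return copy_large_nonstreaming(dst, src, n);
}
```

With this shape, a small copy returns before the mode is ever inspected, which is the point being made above.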
We should have an overloading mechanism that dispatches to normal vs streaming implementations of a function at runtime based on the current streaming mode. As an optimization, we could set up dispatch to always call the matching implementation in a normal/streaming caller. In streaming-compatible callers, we'd have to inspect the streaming mode via the compiler-rt builtin.
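As a rough model of that call-site resolution (not how the compiler would actually implement it), the choice could look like the sketch below. The enums, `select_impl`, and `query_mode_at_runtime` are hypothetical names, and the runtime query is a plain-C stand-in for whatever the compiler-rt builtin turns out to be.

```c
#include <stdbool.h>

/* Which implementation of the overloaded function gets called. */
typedef enum { IMPL_NONSTREAMING, IMPL_STREAMING } impl_t;

/* Streaming interface of the calling function, known at compile time. */
typedef enum {
    CALLER_NONSTREAMING,
    CALLER_STREAMING,
    CALLER_STREAMING_COMPATIBLE
} caller_kind_t;

/* Hypothetical stand-in for the runtime streaming-mode query; the counter
   only shows which callers need it. */
static bool fake_mode_is_streaming = false;
static int runtime_queries = 0;
static bool query_mode_at_runtime(void) {
    ++runtime_queries;
    return fake_mode_is_streaming;
}

static impl_t select_impl(caller_kind_t caller) {
    switch (caller) {
    case CALLER_NONSTREAMING:
        return IMPL_NONSTREAMING;   /* resolved statically, no query */
    case CALLER_STREAMING:
        return IMPL_STREAMING;      /* resolved statically, no query */
    case CALLER_STREAMING_COMPATIBLE:
        /* Mode is unknown at compile time, so check at runtime. */
        return query_mode_at_runtime() ? IMPL_STREAMING : IMPL_NONSTREAMING;
    }
    return IMPL_NONSTREAMING;       /* unreachable for valid inputs */
}
```

Only the streaming-compatible case ends up paying for a runtime check; the other two callers bind directly to the matching implementation.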
One use case for this would be to provide a memcpy implementation that keeps the streaming mode active or inactive, and uses the corresponding instructions without a streaming mode change.