Open alexrp opened 3 weeks ago
Is this phenomenon intrinsically linked to callsites and/or the function boundary?
I imagine the compiler might replace any code section with an equivalent builtin.
For the use case of implementing compiler_rt, where the functions map 1:1, a call modifier would be a natural fit, but maybe a nobuiltin/nointrinsify block (similar to nosuspend blocks) would be more flexible / useful in general?
What also looks strange to me is that the linked no_builtin attribute is declared for the callee function, while a CallModifier is currently supplied at the caller's @call.
EDIT: Ah, so LLVM puts it on the caller but clang uses a callee attribute? That's weird, but maybe normal for LLVM's inconsistencies. It would still be worth picking the more sensible choice for Zig though.
For builtins only called via the compiler it might not matter, but it seems strange to me to generate a single function twice: once for never_intrinsify callers and once for callers that allow builtins.
For the use case of implementing compiler_rt, where the functions map 1:1, a call modifier would be a natural fit, but maybe a nobuiltin/nointrinsify block (similar to nosuspend blocks) would be more flexible / useful in general?
Not in any order, but here would be my counter points:
never_intrinsify is necessary at specific callsites, so I don't think it makes much sense for it to be inside of the callee.
never_intrinsify would be quite rarely used, so making it a "language semantic" seems superfluous and unnecessary to me.
never_intrinsify is necessary at specific callsites, so I don't think it makes much sense for it to be inside of the callee.
In my understanding (which may be wrong), e.g. in the use case of our own memcpy implementation, that is a function (callee) we provide in compiler_rt, which we never want to be translated into a call to a builtin (no matter the caller/callsite).
The fact that the compiler is the one emitting the calls (generating callsites) makes it feasible to always specify this as a callsite attribute, but I don't see the point of ever allowing a call to a compiler_rt function to be replaced by a compiler intrinsic. In that case I'd expect callers to use builtins like @memcpy etc.
never_intrinsify would be quite rarely used, so making it a "language semantic" seems superfluous and unnecessary to me.
Probably true, afaiu hand-written assembly already isn't affected by these builtin-replacements? It might still be useful for some micro-optimization use cases, but those probably shouldn't be prioritized.
Is this phenomenon intrinsically linked to callsites and/or the function boundary? I imagine the compiler might replace any code section with an equivalent builtin. For the use case of implementing compiler_rt, where the functions map 1:1, a call modifier would be a natural fit, but maybe a nobuiltin/nointrinsify block (similar to nosuspend blocks) would be more flexible / useful in general?
That may be true, but:
1. I don't currently have a motivating use case for such a broad feature.
2. I think a call modifier strikes a good balance with regard to complexity and usefulness.
What also looks strange to me is that the linked no_builtin attribute is declared for the callee function, while a CallModifier is currently supplied at the caller's @call.
I don't quite follow here. The Clang attribute is put on a function, and any code or calls within that function won't be transformed to builtins. It's like if you'd manually written every call in the function as @call(.never_intrinsify, ...) in Zig.
Updated description to note that this modifier would still permit inlining at the compiler's discretion, like auto.
Is this phenomenon intrinsically linked to callsites and/or the function boundary? I imagine the compiler might replace any code section with an equivalent builtin. For the use case of implementing compiler_rt, where the functions map 1:1, a call modifier would be a natural fit, but maybe a nobuiltin/nointrinsify block (similar to nosuspend blocks) would be more flexible / useful in general?
That may be true, but:
1. I don't currently have a motivating use case for such a broad feature.
I don't quite understand how a call modifier for use with @call would help in the linked issue, though I may just be misunderstanding that issue. I assumed the linked issue is something like what I encountered (detailed below), but I don't know what the Zig source for that issue is, so I'm not sure.
One place where being able to mark a function/block with nobuiltin would have been nice for me is in implementing things like memcpy in compiler-rt. At the moment, you need to be careful not to have LLVM codegen recursive calls to memcpy: basically, you have to make sure LLVM doesn't recognise code paths (not just function calls) as something it thinks it can replace with a call to memcpy. Sometimes utility functions needed to be made noinline to beat LLVM's recognition, and in those cases a nointrinsify call modifier might have been preferable, but I also needed to rewrite some simple copying loops, as they could be turned into memcpy calls in some cases. Without being able to mark a loop (or the function containing it) with a nobuiltin annotation, the code might break whenever LLVM changes how its memcpy detection works, and some things can end up looking more complicated than needed for no apparent reason.
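To make the loop case above concrete, here is a minimal C sketch of the kind of byte-copy loop that an optimizer may recognise as a memcpy idiom, guarded with the Clang no_builtin attribute mentioned in this thread. The function name copy_bytes and the macro NO_BUILTIN_MEMCPY are hypothetical, and the sketch assumes that only compilers reporting support via __has_attribute get the attribute; whether a given compiler version actually leaves the loop alone is something to verify in the generated code, not something this sketch guarantees.

```c
#include <stddef.h>

/* Assumption: apply the attribute only where the compiler reports support
 * for it; everywhere else the (hypothetical) macro expands to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(no_builtin)
#    define NO_BUILTIN_MEMCPY __attribute__((no_builtin("memcpy")))
#  endif
#endif
#ifndef NO_BUILTIN_MEMCPY
#  define NO_BUILTIN_MEMCPY
#endif

/* Hypothetical stand-in for a runtime library's own memcpy implementation. */
NO_BUILTIN_MEMCPY
void *copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Without the attribute, an optimizer is allowed to treat this loop as
     * a memcpy idiom and emit a call to memcpy, which becomes recursion if
     * this function is the memcpy implementation itself. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dest;
}
```

The attribute here sits on the function containing the loop, which is the function-level granularity asked for above; a call-site modifier alone would not cover a loop like this, since there is no call to annotate.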
It would help because this is the implementation of the problematic function(s):
The issue is that LLVM is recognizing the call to memcpy by name and turning it into an intrinsic call instead, which then gets turned into a (recursive) call to __aeabi_memcpy in the target backend.
Putting nobuiltin on the call site will prevent LLVM from doing this.
With regards to the compiler-rt problem you're having, I'm afraid you're a victim of this code:
The attribute needed to get the desired effect here is no-builtins, not nobuiltin. I mean, obviously!
Does the no-builtins attribute exist? I can't find anything relevant about it in the LLVM repo.
https://github.com/llvm/llvm-project/blob/11df0ce1405ec3e3721b43764dc53250aa9e08a1/llvm/include/llvm/IR/Attributes.h#L86
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/Attributes.td
LLVM doesn't have a fixed set of predefined attributes.
This modifier prevents the compiler from turning a function call into an intrinsic/builtin. In other words, when you call a function using this modifier, you're guaranteed to actually get a call to that function in the generated code (note: inlining is still permitted). Concretely, it would map to the nobuiltin LLVM attribute at the call site, and whatever equivalent exists for other backends.
Motivated by #21831 (and I suspect numerous other cases in compiler-rt if I went digging).
FWIW, Clang has this in the form of the no_builtin attribute, so I think it's important that Zig also be able to express this.