Open tersec opened 7 months ago
I assume the disregarded idempotence of the memset-to-0 operation, to which nimZeroMem effectively boils down in C/C++, would eventually come at the cost of a redundant mov asm instruction for optimization switches >= /O1 (BTW, wouldn't that be mostly irrespective of tyObject_HSlice__1F9c6PBLtnXQNAmUXyCBSBw's size as soon as it surpasses register capacity, which does yield some additional latency?).
Haven't checked Compiler Explorer, but do compilers in all their matured cleverness actually fail to optimize that seemingly obvious redundancy away for >= /O1?
Still, even from a codegen (build time) and debug mode perspective, I gather there is room for improvement here. If this issue is still up for grabs, I'd be willing to tackle it for my hovering around that area of codegen adventures anyway at the moment. 😉
zeroMem elision works for simple / small cases where the compiler can prove locality, but not for larger types, where it matters more. There are several problems; for example, passing things by pointer makes the compiler conservative about what is overwritten and what is not, since pointers may alias.
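To illustrate the aliasing point, here is a minimal C sketch. The type `Obj` and the function `clear_then_read` are hypothetical names made up for this example, not anything the Nim compiler emits. It only shows that a memset through one pointer is observable through another pointer when the two alias, which is why the C compiler has to keep such stores unless it can prove otherwise.

```c
#include <string.h>

/* Hypothetical object type, for illustration only. */
typedef struct {
    long a;
    long b;
} Obj;

/* Clears *dst, then reads through src. If dst and src point to the same
   object, the memset changes what the read sees, so the compiler cannot
   drop or move the memset unless it can prove the two never alias. */
static long clear_then_read(Obj *dst, const Obj *src) {
    memset(dst, 0, sizeof *dst);
    return src->a;
}
```

Called with two distinct objects, the read returns the untouched field of `src`; called with the same object for both arguments, it returns 0. With purely local values the compiler can usually see that no such overlap exists, which matches the observation that elision tends to work only in the simple cases.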
As the issue notes, this is a pervasive problem across all kinds of codegen, where zeroMem is sprinkled liberally and redundantly.
there are probably more issues reported already - these apply to ORC/ARC as well btw.
Description
is one example, but browsing generated C with --mm:refc shows many of them.
C compilers aren't necessarily reliable, nor should they be expected to be, at detecting that this is a verifiably idempotent operation whose later calls can be removed, so it creates real runtime (profiled, even) overhead.
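As a rough sketch of the pattern being described (this is not the actual generated code; `Slice`, `zero_mem`, `make_slice`, and `sum_slices` are hypothetical stand-ins, and `zero_mem` is only assumed to behave like a thin memset wrapper), the caller clears a temporary and the callee, which receives it by pointer, clears it again. A counter is added purely as instrumentation so the duplicate work is visible at run time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a generated object type. */
typedef struct {
    long a;
    long b;
} Slice;

/* Instrumentation only: counts how often memory is cleared. */
static int zero_calls = 0;

/* Assumed stand-in for nimZeroMem: a thin wrapper around memset. */
static void zero_mem(void *p, size_t n) {
    zero_calls++;
    memset(p, 0, n);
}

/* The callee cannot tell whether *result was already cleared, so it
   clears it again before assigning every field. */
static void make_slice(long a, long b, Slice *result) {
    zero_mem(result, sizeof *result);
    result->a = a;
    result->b = b;
}

/* The caller clears its temporary, then passes it by pointer. */
static long sum_slices(int n) {
    long total = 0;
    for (int i = 0; i < n; i++) {
        Slice tmp;
        zero_mem(&tmp, sizeof tmp);  /* first clear */
        make_slice(i, i + 1, &tmp);  /* second, redundant clear inside */
        total += tmp.b - tmp.a;
    }
    return total;
}
```

In this sketch every loop iteration clears the same bytes twice, so the number of clears is twice the iteration count. Whether a given C compiler removes the second clear depends on inlining and alias analysis, which is the unreliability the paragraph above refers to.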
Example of generated duplicate nimZeroMem for that code in @mNim@slib@spure@sstrformat.nim.c:
In this case, a tyObject_HSlice__1F9c6PBLtnXQNAmUXyCBSBw isn't huge, but for larger objects and/or in loops, this becomes more significant.
Nim Version
Current Output
Expected Output
Possible Solution
No response
Additional Information
No response