Open andrewrk opened 11 months ago
Excuse me if I'm missing something, but could this cause an exception if the well-defined bytes are at the end of a memory page and the "undefined" bytes accessed by the SIMD instruction lie outside that page? If those are the scenarios Valgrind detects, is that why the detection exists? The question then is how to cover the cases where we are sure the access is safe even though the content is undefined.
With regard to ghost's point that the undefined bytes might be in a separate memory page: That page may not exist, and thus the SIMD instruction will cause a SIGSEGV or whatever. For instance, in NeXTSTEP, if the defined bytes are from the default allocator, the subsequent bytes may well be at an address that isn't mapped in at all. (Learned this the hard way, with a read-immediately-after-free bug that SEGV'd when NeXTSTEP unmapped the page that had just become completely free.)
Note that the std.mem code in question does not cross a page boundary.
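One way such a guarantee can be obtained is to check the address before issuing the wide load. The following is a minimal sketch of that check, not the actual std.mem code: the 4 KiB page size and the name `loadStaysInPage` are assumptions made for illustration.

```zig
const std = @import("std");

// Assumption for illustration: 4 KiB pages. Real code would use the
// page size of the target rather than a hard-coded constant.
const assumed_page_size: usize = 4096;

/// Hypothetical helper: returns true if reading `len` bytes starting at
/// `addr` stays inside the page that contains `addr`.
fn loadStaysInPage(addr: usize, len: usize) bool {
    const offset_in_page = addr % assumed_page_size;
    return offset_in_page + len <= assumed_page_size;
}

test "a 16-byte load is only allowed when it cannot cross the page end" {
    // Starts at the beginning of a page.
    try std.testing.expect(loadStaysInPage(0x1000, 16));
    // Ends exactly on the page boundary, so it still stays inside.
    try std.testing.expect(loadStaysInPage(0x1ff0, 16));
    // One byte further and the load would touch the next page.
    try std.testing.expect(!loadStaysInPage(0x1ff1, 16));
}
```

When the check fails, a caller could fall back to a scalar loop for the remaining bytes, so no read ever touches a page that might be unmapped.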
Extracted from #17209.
I don't really know how to solve this problem, but at least here is a tracking issue for it.
The problem is that, while the first 6 bytes of the buffer are well-defined, including the null byte, the bytes after it are undefined.
For a naive loop, no loads of undefined bytes will occur:
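A minimal sketch of such a loop, assuming a hypothetical 16-byte buffer whose first 6 bytes hold `hello` plus the terminator (the function name and the buffer contents are illustrative, not taken from the issue):

```zig
const std = @import("std");

/// Scalar search for the first zero byte. It reads one byte at a time
/// and stops at the terminator, so it never loads anything past it.
fn naiveIndexOfSentinel(ptr: [*]const u8) usize {
    var i: usize = 0;
    while (ptr[i] != 0) : (i += 1) {}
    return i;
}

test "naive scan never reads past the terminator" {
    // Hypothetical buffer: only the first 6 bytes are defined.
    var buf: [16]u8 = undefined;
    buf[0..6].* = "hello\x00".*;
    try std.testing.expectEqual(@as(usize, 5), naiveIndexOfSentinel(&buf));
}
```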
However, the Zig standard library is now taking advantage of SIMD:
https://github.com/ziglang/zig/blob/cc394431ae6eb69e7abd677c268a8ab7299f8aeb/lib/std/mem.zig#L1144-L1145
Technically this is indeed doing a memory load of some undefined bytes. However, since it's returning the index of the first null byte, it has no dependency on the undefined memory.
Maybe we can work with the Valgrind project to resolve this? Maybe there is already some client request mechanism so that we could communicate this pattern to Valgrind?
This also impacts the Zig language specification. It needs to be well-defined to do a SIMD operation on an undefined value as long as a conditional branch has no dependency on the undefined value.
For example, `@reduce(.Or, v)` should produce a well-defined value if any of the elements of the vector are true.

However, making `firstTrue` sound is slightly more tricky:
https://github.com/ziglang/zig/blob/cc394431ae6eb69e7abd677c268a8ab7299f8aeb/lib/std/simd.zig#L291-L301
In theory, it should be fine, since the result does not actually depend on any elements past the first true element. Whether subsequent undefined elements are treated as true or false does not matter, since a well-defined true value will be the smallest index.
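The independence argument can be checked by substituting each possible concrete value for the lane that would be undefined. This is only a sketch: `firstTrueIndex` is a hypothetical stand-in written with `@select` and `@reduce`, not the std.simd implementation, and the trailing lane is given defined values because a test cannot meaningfully observe a truly undefined one.

```zig
const std = @import("std");

const Vec4b = @Vector(4, bool);
const Vec4u8 = @Vector(4, u8);

/// True if any lane is true.
fn anyTrue(v: Vec4b) bool {
    return @reduce(.Or, v);
}

/// Hypothetical stand-in for a first-true search: select each lane's
/// index where the predicate holds, a large sentinel elsewhere, then
/// take the minimum. Returns 255 when no lane is true.
fn firstTrueIndex(v: Vec4b) u8 {
    const indices = Vec4u8{ 0, 1, 2, 3 };
    const no_match = Vec4u8{ 255, 255, 255, 255 };
    return @reduce(.Min, @select(u8, v, indices, no_match));
}

test "a lane after the first true element cannot change the result" {
    // Lane 3 stands in for the undefined element; try both values.
    const tail_values = [_]bool{ false, true };
    for (tail_values) |tail| {
        const v = Vec4b{ false, true, false, tail };
        try std.testing.expect(anyTrue(v));
        try std.testing.expectEqual(@as(u8, 1), firstTrueIndex(v));
    }
}
```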
However, that only works if `@select(T, v, a, b)` produces a well-defined value for every element of its result. For scalar operations in Zig, this is not the case: an operation with undefined operands propagates undefined to the result. In this case, that could result in the following scenario:

- `vec` contains `.{false, true, undefined}`.
- `@select()` produces `.{maxint, 1, undefined}`.
- `@reduce()` receives `0` as the value for the undefined element, causing it to result in `0` instead of the correct result, which is `1`.

Perhaps `@select` could be defined to treat any undefined elements in the predicate as resulting in either the corresponding `a` element or the corresponding `b` element, rather than undefined.

This maps to the concept of freeze in LLVM IR.
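Both outcomes can be modeled with concrete values. This is a sketch under stated assumptions, not a statement about what any compiler actually does: the first test hard-codes `0` as one value the undefined lane might be observed as, and the second models a freeze-like `@select` by trying both values the predicate lane could settle on.

```zig
const std = @import("std");

const Vec3b = @Vector(3, bool);
const Vec3u8 = @Vector(3, u8);

const indices = Vec3u8{ 0, 1, 2 };
const no_match = Vec3u8{ 255, 255, 255 }; // plays the role of maxint

test "a propagated undefined lane can corrupt the minimum" {
    // Stand-in for .{maxint, 1, undefined} with the undefined lane
    // observed as 0 (an assumed, arbitrary value).
    const selected = Vec3u8{ 255, 1, 0 };
    try std.testing.expectEqual(@as(u8, 0), @reduce(.Min, selected));
}

test "a freeze-like select keeps the result correct" {
    // If the undefined predicate lane must settle on true or false,
    // the result lane is either its index (2) or the sentinel (255).
    const settled_values = [_]bool{ false, true };
    for (settled_values) |settled| {
        const pred = Vec3b{ false, true, settled };
        const selected = @select(u8, pred, indices, no_match);
        try std.testing.expectEqual(@as(u8, 1), @reduce(.Min, selected));
    }
}
```

Either settled value leaves lane 2 no smaller than 2, so the minimum stays at the index of the first well-defined true element.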