Open jdoerfert opened 4 years ago
This problem might not be limited to multi-threading contexts. For example,
llvm.read_register
is marked as ReadMem, but it seems like we treat it like only considering other memory operations in the program, not the case that the memory could be modified externally. For an example, see https://reviews.llvm.org/D80911
The way we currently deal with volatile/atomic/synchronizing operations in alias analysis is to treat them as a "write" to some memory. So it isn't legal to mark any operation that does one of those things readonly.
Actually, rereading this ticket, it occurs to me that maybe we should just say the current behavior of optimizations is correct, and document that inaccessiblememonly implies nosync, or something like that.
This problem might not be limited to multi-threading contexts. For example, llvm.read_register
is marked as ReadMem, but it seems like we treat it like only considering other memory operations in the program, not the case that the memory could be modified externally. For an example, see https://reviews.llvm.org/D80911
Currently the description of IntrReadMem/IntrWriteMem seems to imply that it only considers memory operations visible to LLVM. Might be worth to clarifying across all memory related attributes.
I'll look into teaching this BasicAA. Eli, Hal, thanks for the input.
There might be a few callers that could usefully distinguish between "this call may modify the given object" and "this call can't modify the given object, but it can synchronize with other code which might modify it". But I assume most can't: GVN can't do anything useful with the distinction. And AA is in the best position to figure out whether code in other threads can modify the object: it's already doing object identification/escape analysis/etc. anyway.
I agree. It's not obvious to me how a pass would make use of the distinction. Synchronizing with something that modifies a variable looks just like modifying a variable in every case of which I can currently think. It means that, before the call, it would not be valid to observe a different value from that memory location (because that would be a race), and after the call, it would be valid to observe a different value.
There might be a few callers that could usefully distinguish between "this call may modify the given object" and "this call can't modify the given object, but it can synchronize with other code which might modify it". But I assume most can't: GVN can't do anything useful with the distinction. And AA is in the best position to figure out whether code in other threads can modify the object: it's already doing object identification/escape analysis/etc. anyway.
I guess aa is one way, I'm not sure though because this problem should also appear for readnone
gpu shuffles that doe synchronize. Maybe we want synchronization to be a first-class citizen and not something we tag onto aa?
I guess this is an alias analysis bug? If it's possible for a inaccessiblememonly to "access" memory, aa should be aware of that, and if aa is returning the right thing, gvn should automatically just work.
@jdoerfert are you still looking at it?
Actually, rereading this ticket, it occurs to me that maybe we should just say the current behavior of optimizations is correct, and document that inaccessiblememonly implies nosync, or something like that.
It seems the doc doesn't reflect this yet? https://github.com/llvm/llvm-project/blob/main/llvm/docs/LangRef.rst#L1853
from the attached test case, there are instances of inaccessiblemem
both with and without nosync
@efriedma-quic do you think we should update the documentation of inaccessiblemem?
Yes, it probably makes sense to clarify the documentation.
sg, started a patch: https://reviews.llvm.org/D157765. please review.
Extended Description
The attached test mimics a "critical region" (see llvm/llvm-project#45199 ) with an update inside. In both cases the calls surrounding the load+store pair are annotated with inaccessiblememonly. However, only in the case the call is marked
nosync
it is sound for GVN to hoist the load out of the loop. Basically, the call might be a barrier to prevent races, wait for a signal indicating the global was initialized, ...