Open shivaramaarao opened 2 years ago
This can occur if the vector type isn't aligned. Does it work if you specify 16-byte alignment for the pointers?
As far as I can tell, LLVM optimizations are doing the right thing here. The accesses to the local variables ArrayPt and Pointer are marked non-temporal, so they get emitted that way at -O0. At higher optimization levels, SROA eliminates those variables.
I think clang is misinterpreting the OpenMP standard here. The directive nontemporal(ArrayPt,Pointer) is supposed to mean that array accesses through the pointers ArrayPt and Pointer should be non-temporal. But clang is interpreting it to mean that accesses to the variables themselves are supposed to be non-temporal. Granted, the standard is badly worded; it doesn't explicitly say that operands to nontemporal directives have to be pointers, so maybe clang's current interpretation could be justified.
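To make the two readings concrete, here is a minimal sketch of the kind of loop under discussion. The function name scale_copy, its parameters, and the loop body are assumptions for illustration; only the variable names ArrayPt and Pointer and the nontemporal clause come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: the names ArrayPt and Pointer mirror the thread,
 * everything else is an assumption made for illustration. */
void scale_copy(float *dst, float *src, int n, float k)
{
    assert(dst != NULL && src != NULL);

    /* Local pointer variables that are listed in the clause. */
    float *ArrayPt = dst;
    float *Pointer = src;

    /* Reading argued for in this thread: the accesses ArrayPt[i] and
     * Pointer[i] should carry the non-temporal hint.
     * Behavior reported in this thread: the hint lands on loads and stores
     * of the variables ArrayPt and Pointer themselves, and those disappear
     * once SROA removes the variables. */
    #pragma omp simd nontemporal(ArrayPt, Pointer)
    for (int i = 0; i < n; ++i) {
        ArrayPt[i] = Pointer[i] * k;
    }
}
```

Without -fopenmp (or -fopenmp-simd) the pragma is ignored and the loop is an ordinary copy, so the result is the same either way; only the generated instructions differ.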
I think we interpret the standard wrong and should mark the accesses via those pointers as nontemporal.
Do we agree that Clang is misinterpreting the OpenMP standard and that we should mark all the accesses via those pointers as nontemporal?
I am planning to add support for the simd nontemporal clause to OMPIRBuilder, and I would like to clarify the proper behavior of the simd nontemporal pragma.
@DominikAdamski Is the issue still around?
For the following program, the compiler is expected to generate non-temporal instructions.
Yes, the non-temporal instructions are generated at -O0 (on an x86 platform).
I used the command:
clang -fopenmp <test.c>
At -O1 and above, the non-temporal hint seems to be ignored and the non-temporal instructions are not generated.
Is this expected behavior? I think the nontemporal metadata is not being propagated when optimizations are enabled. The metadata seems to get dropped in the SROA pass.
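For comparison, the sketch below attaches the hint to the store through the pointer instead of to a pointer variable, using clang's __builtin_nontemporal_store. This is not the OpenMP code path discussed here; the function name fill_streaming and the HAVE_NT_STORE macro are assumptions for illustration, and compilers without the builtin fall back to a plain store. Because the hint sits on the memory access itself, removing the local variables should not be what drops it, though the instructions actually emitted still depend on the target and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Detect clang's non-temporal store builtin; HAVE_NT_STORE is a
 * hypothetical macro name used only in this sketch. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_nontemporal_store)
#    define HAVE_NT_STORE 1
#  endif
#endif

/* Hypothetical helper: writes value to dst[0..n-1]. */
void fill_streaming(float *dst, size_t n, float value)
{
    assert(dst != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
#ifdef HAVE_NT_STORE
        /* The hint is attached to this store, not to the variable dst. */
        __builtin_nontemporal_store(value, &dst[i]);
#else
        /* Fallback for compilers without the builtin: ordinary store. */
        dst[i] = value;
#endif
    }
}
```

Comparing the assembly for this function at -O0 and -O2 with the output for the OpenMP version is one way to check whether the hint is lost on the variable or on the access.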