Closed RalfJung closed 7 months ago
r? @Amanieu
rustbot has assigned @Amanieu. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.
Use r? to explicitly pick a reviewer
CI failures seem unrelated to this PR.
This is my proposed alternative to https://github.com/rust-lang/stdarch/pull/1457. It builds on the model I sketched here.
Changing the docs alone is not sufficient IMO; we should also stop using LLVM's non-temporal hint, since LLVM passes do not properly take into account the non-standard nature of non-temporal stores. Originally I thought all of these would have to become inline assembly, but I have now noticed that some of these operations use different LLVM intrinsics:
Do these intrinsics exist for the other operations as well, e.g. for movntps? I found
llvm.x86.avx.movnt.ps.256
but of course for the SSE intrinsic we'd want the 128-bit version of that. If that exists, might using it be better than using inline assembly? Though of course, if LLVM passes treat these intrinsics like regular stores, that would still be wrong.
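For reference, a minimal sketch of what the inline-assembly route could look like for movntps. This is an illustration only, not the actual stdarch change: `stream_ps_asm`, `Aligned`, and `demo_roundtrip` are hypothetical names, and the trailing `sfence` reflects the assumption that a non-temporal store needs to be fenced before other code relies on it.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::{asm, x86_64::{__m128, _mm_setr_ps}};

/// 16-byte aligned buffer: `movntps` faults on unaligned addresses.
#[repr(align(16))]
pub struct Aligned(pub [f32; 4]);

/// Hypothetical stand-in for `_mm_stream_ps`, written with inline assembly
/// so that LLVM sees an opaque asm block instead of a store it might
/// optimize like a regular one.
///
/// # Safety
/// `mem_addr` must be valid for a 16-byte write and 16-byte aligned.
/// Assumption: the caller issues `sfence` before relying on the store
/// being ordered with respect to other memory operations.
#[cfg(target_arch = "x86_64")]
pub unsafe fn stream_ps_asm(mem_addr: *mut f32, a: __m128) {
    unsafe {
        asm!(
            "movntps [{p}], {a}",
            p = in(reg) mem_addr,
            a = in(xmm_reg) a,
            // No `nomem` or `readonly`: the instruction writes memory.
            options(nostack, preserves_flags),
        );
    }
}

/// Stores four floats with the non-temporal helper, fences, and reads them back.
/// On non-x86_64 targets this falls back to a plain store so the sketch still builds.
pub fn demo_roundtrip() -> [f32; 4] {
    let mut buf = Aligned([0.0; 4]);
    #[cfg(target_arch = "x86_64")]
    unsafe {
        stream_ps_asm(buf.0.as_mut_ptr(), _mm_setr_ps(1.0, 2.0, 3.0, 4.0));
        // Fence the non-temporal store before the buffer is read again.
        asm!("sfence", options(nostack, preserves_flags));
    }
    #[cfg(not(target_arch = "x86_64"))]
    {
        buf.0 = [1.0, 2.0, 3.0, 4.0];
    }
    buf.0
}
```

The asm block deliberately omits `nomem`, so the compiler has to assume it may write memory. Whether a dedicated 128-bit LLVM intrinsic would be preferable is still the open question above.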