Open gfoidl opened 5 years ago
/cc: @tannergooding please have a look here
I remember that ToScalar on integer types is not an intrinsic right now. You can use Sse2.X64.ConvertToInt64 for Vector128<long>.
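A minimal sketch of the suggestion above, assuming a runtime that exposes System.Runtime.Intrinsics (.NET Core 3.0 or later). The class name VectorScalar and the method name FirstElement are hypothetical, used only for illustration; Sse2.X64.ConvertToInt64 and ToScalar are the APIs named in this thread.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class VectorScalar
{
    // Hypothetical helper: returns element 0 of the vector.
    public static long FirstElement(Vector128<long> value)
    {
        // Sse2.X64 is only available on 64-bit x86 processes with SSE2,
        // so the call has to be guarded by IsSupported.
        if (Sse2.X64.IsSupported)
        {
            return Sse2.X64.ConvertToInt64(value);
        }

        // Portable fallback; this is the call whose codegen the issue is about.
        return value.ToScalar();
    }
}
```

Both branches return the same value; only the generated code may differ, which is what the rest of the thread examines.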
Thanks.
Sse2.X64.ConvertToInt64 is better, but still not ideal:
vmovupd xmm0, xmmword ptr [rsp+08H]
vmovd rax, xmm0
It is documented as __int64 _mm_cvtsi128_si64 (__m128i a) MOVQ reg/m64, xmm, but the JIT didn't emit the movq.
Is that output from the JIT's own disassembler? It's probably movq but displayed as movd.
Looked at the JIT dump and at the VS disassembly view (both in release with optimizations on, tiering disabled).
SharpLab with CoreCLR shows the same.
Right, vmovd rax, xmm0 is actually vmovq rax, xmm0. movd is an alias of movq on r64.
SharpLab with CoreCLR shows the same.
BTW, in this link, vzeroupper is generated for Vector128 code. That should not be there; I will take a look.
movd is an alias of movq on r64.
👍
So codegen could be
vmovd rax, xmmword ptr [rsp+08H]
or
vmovq rax, xmmword ptr [rsp+08H]
There is the extra vmovupd (see https://github.com/dotnet/coreclr/issues/24710#issuecomment-494720476).
I will take a look.
Thanks.
vzeroupper is generated for Vector128 code. That should not be there,
Isn't this needed for VEX? No matter if Vector128 or Vector256.
It is a bit complex, please see https://github.com/dotnet/coreclr/issues/21062. But I am sure that something is wrong with the codegen.
Marking as future; if there's something surgical we can fix, or there's a bug, we can move to 3.0.
It might be related. xmm0 is spilled to the stack.
private static unsafe long AsLong(double dbl)
{
    // Reinterpret the bits of the double as a long by taking the address of the argument.
    return *(long*)&dbl;
}
@omariom for reference: this is tracked by https://github.com/dotnet/runtime/issues/11413 (thx @EgorBo for the reminder).
It might be related. xmm0 is spilled to the stack.
@omariom What about this?
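using System.Runtime.CompilerServices;

unsafe class C
{
    private static long AsLong(in double dbl)
    {
        // Turn the readonly 'in' reference into a pointer, then read the same bits as a long.
        return *(long*)Unsafe.AsPointer(ref Unsafe.AsRef(in dbl));
    }
}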
Asm output:
C.AsLong(Double ByRef)
L0000: mov rax, [rcx]
L0003: ret
@hypeartist this also uses the stack ([rcx]) and doesn't operate solely on registers.
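For comparison with the two AsLong variants above, here is a sketch that performs the same bit reinterpretation without unsafe code, using the framework's BitConverter.DoubleToInt64Bits. The class name DoubleBits is hypothetical. Whether the JIT keeps the value in registers for this call depends on the runtime version and was not checked in this thread.

```csharp
using System;

public static class DoubleBits
{
    // Hypothetical wrapper: returns the IEEE 754 bit pattern of the double as a long.
    public static long AsLong(double dbl)
    {
        return BitConverter.DoubleToInt64Bits(dbl);
    }
}
```

This gives the same result as the pointer-based versions; it is shown only as a safe-code baseline for inspecting the generated assembly.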
Vector128<long>.ToScalar() stores the xmm to the stack, then reads r64 from there via a mov.
Ideally this would use movq (C++ intrinsic: _mm_cvtsi128_si64), so asm becomes:
Vector128<double>.ToScalar() produces expected code (vmovsd) -- no issue there. Same CQ issue for int, and for Vector256<T>. Didn't check types other than those noted here.
category:cq theme:vector-codegen skill-level:intermediate cost:medium