jdelauney / SIMD-VectorMath-UnitTest

For testing asm SIMD (SSE/SSE 2/SSE 3/SSE 4.x / AVX /AVX 2) vector math library (2f, 4f, matrix, quaternion...) with Lazarus and FreePascal Compiler
Mozilla Public License 2.0

SSE Calling conventions for Packed int. #9

Open dicepd opened 6 years ago

dicepd commented 6 years ago

Starting a list of calling-convention solutions for packed integer types, which seem to differ from the calling convention for packed single types.

For Unix64 we can avoid a stack copy operation by declaring the routine with the following Pascal modifiers: assembler; register; nostackframe;

With the result in xmm0, split it across RAX and RDX for the return:

movhlps xmm1, xmm0
movq RAX, xmm0
movq RDX, xmm1

So for the 4i + operator, this code works:

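class operator TGLZVector4i.+(constref A, B: TGLZVector4i): TGLZVector4i; assembler; register; nostackframe;
asm
  movdqa   xmm0, [A]
  {$ifdef TEST}
  paddd    xmm0, [B]
  {$else}
  movdqa   xmm1, [B]
  paddd    xmm0, xmm1
  {$endif}
  movhlps  xmm1, xmm0   // copy the high 64 bits of xmm0 into the low half of xmm1
  movq     RAX,  xmm0   // low 64 bits (lanes 0 and 1) go to RAX
  movq     RDX,  xmm1   // high 64 bits (lanes 2 and 3) go to RDX
end;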

movaps [RESULT], xmm0 only works if we do not use nostackframe. Even then, the result is a 128-bit value on the stack, which is then copied off the stack into RAX and RDX to be stored in the caller's stack frame on return.
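To make the register layout concrete, here is a rough model of what the movhlps/movq sequence leaves in RAX and RDX. It is written in plain C rather than Pascal purely for illustration and uses no SIMD. The names vec4i, reg_pair, vec4i_add and vec4i_to_regs are hypothetical stand-ins and are not part of the library.

```c
#include <stdint.h>

/* Hypothetical C stand-in for TGLZVector4i: four 32-bit integer lanes. */
typedef struct { int32_t v[4]; } vec4i;

/* Models the two general-purpose registers that carry the result. */
typedef struct { uint64_t rax, rdx; } reg_pair;

/* Lane-wise 32-bit add with wraparound, which is what paddd computes. */
vec4i vec4i_add(vec4i a, vec4i b) {
    vec4i r;
    for (int i = 0; i < 4; i++)
        r.v[i] = (int32_t)((uint32_t)a.v[i] + (uint32_t)b.v[i]);
    return r;
}

/* Packs the lanes the way the asm does:
   movq RAX, xmm0 takes lanes 0 and 1 (the low 64 bits);
   movhlps followed by movq RDX, xmm1 takes lanes 2 and 3 (the high 64 bits). */
reg_pair vec4i_to_regs(vec4i x) {
    reg_pair p;
    p.rax = (uint64_t)(uint32_t)x.v[0] | ((uint64_t)(uint32_t)x.v[1] << 32);
    p.rdx = (uint64_t)(uint32_t)x.v[2] | ((uint64_t)(uint32_t)x.v[3] << 32);
    return p;
}
```

For example, adding (1, 2, 3, 4) and (10, 20, 30, -5) gives lanes (11, 22, 33, -1), so the model yields RAX = 0x000000160000000B and RDX = 0xFFFFFFFF00000021.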