jdelauney / SIMD-VectorMath-UnitTest

For testing an asm SIMD (SSE/SSE2/SSE3/SSE4.x/AVX/AVX2) vector math library (2f, 4f, matrix, quaternion...) with Lazarus and the Free Pascal Compiler
Mozilla Public License 2.0

Stack Alignment #6

Closed dicepd closed 6 years ago

dicepd commented 6 years ago

I think we may have hit a roadblock with SSE in fpc, at least when using movaps.

While testing the VectorHelpers I came across a SIGSEGV in MoveAround.

This routine makes some calls on local vars which are returned on the stack. However, looking at the stack address, it is not aligned to 16 bytes, and movaps faults on a memory operand that is not 16-byte aligned.

Forum post on something similar: https://forum.lazarus.freepascal.org/index.php?topic=29097.0

Bug report in fpc: https://bugs.freepascal.org/view.php?id=32710#c104254

dicepd commented 6 years ago

OK, so {$CODEALIGN LOCALMIN=16} cures it in GLZVectorMath, but I cannot apply it at a more granular level; it is all or nothing.
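
A minimal sketch of how the alignment of a local can be checked with and without the directive. `TVec4f`, `IsAligned16` and `Probe` are hypothetical names, not taken from GLZVectorMath, and the result may differ between targets and compiler versions.

```pascal
program LocalAlignProbe;
{$mode objfpc}
{$CODEALIGN LOCALMIN=16}  // comment this line out to compare the two cases

type
  // Hypothetical stand-in for the library's 4-float vector record.
  TVec4f = record
    X, Y, Z, W: Single;
  end;

// True when P is a multiple of 16, which movaps requires of a memory operand.
function IsAligned16(P: Pointer): Boolean;
begin
  Result := (PtrUInt(P) and 15) = 0;
end;

procedure Probe;
var
  Pad: Byte;   // small local placed next to the vector
  V: TVec4f;
begin
  Pad := 0;
  FillChar(V, SizeOf(V), 0);
  WriteLn('Pad at ', HexStr(@Pad));
  WriteLn('V   at ', HexStr(@V), '  16-byte aligned: ', IsAligned16(@V));
end;

begin
  Probe;
end.
```

The directive applies to everything compiled after it in the unit, which matches the "all or nothing" behaviour described above.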

dicepd commented 6 years ago

I will check the tests in so you can try them on win64 and see if the behaviour is different.

jdelauney commented 6 years ago

I checked MoveAround with and without {$CODEALIGN LOCALMIN=16}. Under win64 it is correct in both cases when compared with an EPSILON of 1e-5.

dicepd commented 6 years ago

It also works for me when I only run the comparator test, but when I run the timing test for it I get the stack alignment problem.

jdelauney commented 6 years ago

I didn't see any problems with the timing test. I'll redo the test this afternoon.

jdelauney commented 6 years ago

This afternoon, while I was fixing the asm for vector4i, I ran the timing test and it raised a SIGSEGV on multiply. Regarding your last issue: in our test the A parameter is Self, so I think fpc unrolled the for..to loop in the multiply case. It must push Self onto the stack as an optimization, in which case we need to use the unaligned load instructions. I haven't checked whether this is right.
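
A rough sketch of what switching to unaligned loads could look like, assuming an x86_64 target and fpc's default calling convention (RCX, RDX, R8 on win64; RDI, RSI, RDX elsewhere). `TVec4f` and `AddUnaligned` are hypothetical names and this is not the library's own routine. movups accepts any address, whereas movaps faults unless the address is a multiple of 16.

```pascal
program UnalignedLoadSketch;
{$mode objfpc}
{$asmmode intel}
{$ifndef CPUX86_64}
  {$fatal This sketch assumes an x86_64 target}
{$endif}

type
  // Hypothetical stand-in for the library's 4-float vector record.
  TVec4f = record
    X, Y, Z, W: Single;
  end;
  PVec4f = ^TVec4f;

// R^ := A^ + B^, using movups so the operands may sit at any address.
procedure AddUnaligned(A, B, R: PVec4f); assembler; nostackframe;
asm
{$ifdef WIN64}
  movups xmm0, [rcx]
  movups xmm1, [rdx]
  addps  xmm0, xmm1
  movups [r8], xmm0
{$else}
  movups xmm0, [rdi]
  movups xmm1, [rsi]
  addps  xmm0, xmm1
  movups [rdx], xmm0
{$endif}
end;

var
  A, B, R: TVec4f;
begin
  A.X := 1; A.Y := 2; A.Z := 3; A.W := 4;
  B.X := 5; B.Y := 6; B.Z := 7; B.W := 8;
  AddUnaligned(@A, @B, @R);
  WriteLn(R.X:0:1, ' ', R.Y:0:1, ' ', R.Z:0:1, ' ', R.W:0:1);
end.
```

The trade-off is that unaligned loads sidestep the crash without relying on where the compiler places a copied argument, possibly at some cost in speed on older CPUs.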

jdelauney commented 6 years ago

Now the question is: how can we tell when Arg is re-pushed onto the stack?

dicepd commented 6 years ago

I just checked the timing test for 4i. I have not included

{$CODEALIGN RECORDMIN=16} 

in the class vars in the checked in code. That may cure your 4i issue.
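
A small sketch of the idea, assuming the directive affects the layout of class fields as suggested here. `TVec4i`, `TTimingFixture` and the field names are hypothetical, not the ones in the checked-in tests. Note that the final address also depends on how the heap aligns the instance, so the printed values are something to inspect rather than a guarantee.

```pascal
program FieldAlignProbe;
{$mode objfpc}

type
  // Hypothetical stand-in for the library's 4-integer vector record.
  TVec4i = record
    X, Y, Z, W: LongInt;
  end;

  {$CODEALIGN RECORDMIN=16}
  // Hypothetical timing-test fixture holding vectors as class vars.
  TTimingFixture = class
    Tag: Byte;          // small field placed before the vectors
    Vt1, Vt2: TVec4i;
  end;

var
  F: TTimingFixture;
begin
  F := TTimingFixture.Create;
  try
    WriteLn('Vt1 offset in instance: ', PtrUInt(@F.Vt1) - PtrUInt(Pointer(F)));
    WriteLn('Vt1 address mod 16:     ', PtrUInt(@F.Vt1) and 15);
    WriteLn('Vt2 address mod 16:     ', PtrUInt(@F.Vt2) and 15);
  finally
    F.Free;
  end;
end.
```

A value of 0 for "address mod 16" means the field is safe to use with movaps in that run.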

jdelauney commented 6 years ago

It's corrected and works now.