Open jdelauney opened 6 years ago
Thinking about this one: using a word pointer is fine for the first two cases. For parameters fetched from global memory I am not so sure, as there is no guarantee that the parameter sits in memory that is addressable with just a word. I have 20 GB of RAM and I do not think a word pointer can address all of that.
XMMWORD is a 128-bit pointer, no? I follow you, it would be wiser not to use it on the parameters. But our parameters are normally aligned and always exist, so... this needs more tests in the end. Still, using this trick for the first two cases does seem to improve performance a bit.
After our discussion in #27, I ran some tests. To improve performance we can use
XMMWORD PTR
Examples:
In an instruction like
andps xmm0, [RIP+cSSE_MASK_ONLY_W]
we can do
andps xmm0, XMMWORD PTR [RIP+cSSE_MASK_ONLY_W]
In AngleBetween, instead of:
we can do
In operators, or in functions like min, max, clamp, negate, abs, divideby2, MulAdd, MulDiv, etc. For example, for operator +, instead of
we can do
In my tests, using XMMWORD PTR improves performance a little or a lot, depending on the case.
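
For anyone who wants to try the andps example outside the library, here is a minimal sketch, assuming Free Pascal on x86_64 with Intel asm mode. The names TVector4f, PVector4f and KeepOnlyW, and the layout of the mask constant, are assumptions made for this illustration and are not copied from the library. It only shows where the XMMWORD PTR size hint goes; it measures nothing.

```pascal
program XmmwordPtrSketch;
{$mode objfpc}
{$asmmode intel}
{$codealign constmin=16}  // andps with a memory operand needs 16-byte alignment

type
  // Hypothetical 4 x Single vector, 16 bytes in total.
  TVector4f = packed record
    X, Y, Z, W: Single;
  end;
  PVector4f = ^TVector4f;

const
  // Assumed mask layout: all bits set in the W lane, zero in X, Y and Z.
  cSSE_MASK_ONLY_W: array[0..3] of LongWord = (0, 0, 0, $FFFFFFFF);

// Clears X, Y and Z of the vector behind v and leaves W untouched.
// v is passed as a plain pointer, so the compiler substitutes the
// register that holds it on the current platform.
procedure KeepOnlyW(v: PVector4f); assembler; nostackframe;
asm
  movups xmm0, [v]                                 // unaligned load, v may live anywhere
  andps  xmm0, XMMWORD PTR [RIP+cSSE_MASK_ONLY_W]  // explicit 128-bit operand size
  movups [v], xmm0                                 // write the masked vector back
end;

var
  vec: TVector4f;
begin
  vec.X := 1.0; vec.Y := 2.0; vec.Z := 3.0; vec.W := 4.0;
  KeepOnlyW(@vec);
  WriteLn(vec.X:0:1, ' ', vec.Y:0:1, ' ', vec.Z:0:1, ' ', vec.W:0:1);
end.
```

Removing XMMWORD PTR from the andps line gives the other variant, so the two can be timed against each other in a loop.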