Closed FakelsHub closed 3 years ago
The same code with SSE (without the old FPU)
01542DD0 sub_1542DD0 proc near
01542DD0 movsd xmm1, qword_15E3E80
01542DD8 movaps xmm0, xmm1
01542DDB mov byte_15ADAC1, 0
01542DE2 mulsd xmm0, qword_15A1F78
01542DEA mulsd xmm1, qword_15A1F88
01542DF2 cvttsd2si eax, xmm0
01542DF6 mov ds:_GNW95_repeat_rate, eax
01542DFB cvttsd2si eax, xmm1
01542DFF mov ds:_GNW95_repeat_delay, eax
01542E04 retn
01542E04 sub_1542DD0 endp
It looks like you need to use float instead of double for the compiler to use CVTTSS2SI.
CVTTSS2SI | Converts a scalar float to a signed DWord by truncation (SSE)
CVTTSD2SI | Converts a scalar double to a signed DWord by truncation (SSE2)
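A minimal sketch of that difference, with made-up function names that are not taken from sfall. Both conversions truncate toward zero. When building for x86 with SSE/SSE2 code generation enabled, the compiler would typically emit cvttss2si for the float version and cvttsd2si for the double one, though the exact output depends on the compiler and its options.

```cpp
// Hypothetical helpers for illustration only, not sfall code.

// float -> int: truncates toward zero
// (typically compiled to cvttss2si when SSE code generation is enabled).
int TruncFloat(float value) {
    return static_cast<int>(value);
}

// double -> int: truncates toward zero
// (typically compiled to cvttsd2si when SSE2 code generation is enabled).
int TruncDouble(double value) {
    return static_cast<int>(value);
}
```

Calling TruncFloat(2.75f) and TruncDouble(2.75) both give 2, and the negative inputs give -2, so only the instruction and operand width change, not the rounding behaviour.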
What's the function/feature of that section of code using SSE2 instruction?
I didn't understand you, but I suspect you wanted to know what part of the code relates to these functions? Any double->int conversion
Yes, that's what I mean. I haven't tested SpeedPatch on my old servers yet. I'll check them later.
There are about 35 cases of double being used for variables in the code. I don't know if it'd be safer to change them to float, as the extra decimal precision isn't really necessary in most cases IMO.
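A rough way to check that claim for a given variable is to compute the same value both ways and compare the truncated results. The sketch below uses hypothetical names and a percentage-style multiplier as an assumption; it is not the actual sfall code. For multipliers that are exactly representable (such as 3.0 or 0.5) the two versions agree, while inexact multipliers could in principle differ by one after truncation.

```cpp
// Hypothetical example: scale a tick count by a percentage.

// Version using a double multiplier.
int ScaleWithDouble(int ticks, int percent) {
    double multi = percent / 100.0;
    return static_cast<int>(ticks * multi);  // truncates toward zero
}

// Same calculation using a float multiplier.
int ScaleWithFloat(int ticks, int percent) {
    float multi = percent / 100.0f;
    return static_cast<int>(ticks * multi);  // truncates toward zero
}
```

With ticks = 50 and percent = 300, both functions return 150.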
OK, I tried the current 4.3.1 build on my PII potato with Win2000 (I use some system DLL hacks to make sfall 4.x work). Enabling the speed patch doesn't crash the game or cause anything unusual, but because the game itself runs slowly in HRP 4.1.8 windowed mode (640x480), setting 300% speed is barely noticeable in game (NPCs play their idle animation more frequently).
I think the cmp dword_14CF360, 0 is about checking CPU model/features or something, and jumps to the corresponding double->int conversion code. There are some other checks on dword_14CF360 in the code of the phobos build.
This is unknown. I have this set to 1.
Against this assumption is the use of fstp: why use the FPU if there is SSE?
Just curious, does changing the datatype of variables in SpeedPatch.cpp from double to float help the case? I still see the same ASM code when using dumpbin /disasm to disassemble the binary.
There are still some modules that use 'double' datatype variables and might have the same double->int conversion: InputFuncs.cpp, Combat.cpp, Skills.cpp, Stats.cpp, WindowRender.cpp, Worldmap.cpp.
For me, the instructions for SpeedPatch have changed from double to float: CVTTSD2SI -> CVTTSS2SI
Oh, OK. I only checked the code of the supposed double->int conversion.
I think the code from your 2nd comment is now this, using cvttss2si:
Not sure whether, if you don't use any double, the first code would still be in the binary, or whether all of the lines that were calling the double->int conversion would be replaced with cvttss2si.
Found on the internet :-)
013B1030 call _ftol2_sse (13B19A0h)
013B19A0 cmp dword ptr [___sse2_available (13B3378h)],0
013B19A7 je _ftol2 (13B19D6h)
013B19A9 push ebp
013B19AA mov ebp,esp
013B19AC sub esp,8
013B19AF and esp,0FFFFFFF8h
013B19B2 fstp qword ptr [esp]
013B19B5 cvttsd2si eax,mmword ptr [esp]
013B19BA leave
013B19BB ret
Oh, so the code is part of the generic ftol2 function, and the cmp dword is indeed a CPU feature (SSE2) check.
At least it looks like I don't have to worry about existing features suddenly crashing on older systems.
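The shape of that check can be sketched roughly as below. This is only an illustration of the dispatch pattern seen in the listing, not the CRT's actual implementation: the flag variable and function names are hypothetical stand-ins, and both paths are written in plain C++ rather than with x87 or SSE2 instructions.

```cpp
#include <cmath>

// Hypothetical stand-in for the CRT's ___sse2_available flag.
// Here it is a plain variable set by hand for demonstration.
static int g_sse2Available = 1;

// Stand-in for the x87 fallback path (_ftol2): truncate toward zero.
static int TruncViaFallback(double value) {
    return static_cast<int>(std::trunc(value));
}

// Stand-in for the SSE2 path (cvttsd2si eax, [esp]): truncate toward zero.
static int TruncViaSse2(double value) {
    return static_cast<int>(value);
}

// Mirrors the structure of _ftol2_sse: test the flag and
// take the fallback when SSE2 is not available.
int DoubleToInt(double value) {
    if (g_sse2Available == 0) {
        return TruncViaFallback(value);
    }
    return TruncViaSse2(value);
}
```

Either path returns the same truncated result, which is why a CPU without SSE2 would simply take the slower route instead of crashing.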
I'll toy with the idea of replacing other 'double' type variables with float later.
it doesn't make sense.
Maybe, but what's the difference between changing double type variables in SpeedPatch and in InputFuncs or Worldmap? What's special about SpeedPatch? Is it because it's the only one that uses cvttsd2si for double->int conversion?
I changed this to float because the function is called very often, and it is possible that SSE float instructions work faster (though this is not certain). For the FPU, I do not know how this will affect things, because without SSE a large amount of code is needed for the conversion. Compare your code with double and float.
OK, did some silly comparisons in your build:
WindowRender.cpp - fadeMulti: mmword -> dword, sd instructions to ss (Scalar Double-Precision -> Scalar Single-Precision)
Worldmap.cpp - Passed, because original FO1 also uses double type for tick calculation on the world map.
InputFuncs.cpp - mouse speed: cvttsd2si -> cvttss2si, sd instructions to ss
Skills.cpp - multipliers
Stats.cpp - StatFormula.multi[]
Combat.cpp - KnockbackModifier.value
Verdict: yep, changing double to float doesn't matter much, as the majority of the rest don't get called as frequently as SpeedPatch, except maybe InputFuncs (mouse movement obviously happens a lot in game).
https://github.com/phobos2077/sfall/blob/6c3dd72ea757d047c7f84002246de039b80f31a8/sfall/ddraw.vcxproj#L39-L44 It seems that it is useless for you to set this parameter <CharacterSet>NotSet</CharacterSet>; the compiler still uses SSE instructions. Here is an example of recent code from your sfall using SSE2 cvttsd2si