PSI-Rockin / DobieStation

A dog-themed PS2 emulator
GNU General Public License v3.0

The floating-point problem #51

Open PSI-Rockin opened 6 years ago

PSI-Rockin commented 6 years ago

One of the biggest limiting factors in PS2 emulation is floating-point imprecision. Both the FPU and VUs have no support for NaNs or infinities, so a PS2 emulator must clamp those kinds of numbers. But the worst factor is that the PS2 has a wider range of floating-point values than an x86 FPU.

For example: the value 0x7FFFFFFF in a floating-point representation is a valid number on the PS2, but a NaN on x86. The closest possible representation is 0x7F7FFFFF on x86. It may seem silly to believe that this makes a big difference, but time and time again, gamedevs have proven how evil they can be. The majority of non-GS hacks in PCSX2, to this day, involve patching games that rely on specific floating-point configurations.
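The clamping described above can be sketched as a bit-level operation. This is a minimal illustration, not DobieStation's or PCSX2's actual code: the helper name `clamp_to_host` and the constants are hypothetical, and it assumes the PS2 treats an all-ones exponent field as an ordinary exponent.

```cpp
#include <cstdint>

// Assumption: on the PS2 an exponent field of 0xFF is an ordinary exponent,
// so these bit patterns are large finite numbers. On an IEEE 754 host the
// same patterns mean Inf or NaN.
constexpr std::uint32_t kExpMask   = 0x7F800000u;
constexpr std::uint32_t kSignMask  = 0x80000000u;
constexpr std::uint32_t kMaxFinite = 0x7F7FFFFFu;  // bit pattern of FLT_MAX

// Hypothetical helper: map a PS2 bit pattern to the closest pattern that the
// host FPU treats as a finite number, preserving the sign bit.
constexpr std::uint32_t clamp_to_host(std::uint32_t bits) {
    if ((bits & kExpMask) == kExpMask)
        return (bits & kSignMask) | kMaxFinite;
    return bits;  // already representable on the host
}
```

With this sketch, `clamp_to_host(0x7FFFFFFF)` yields `0x7F7FFFFF`, which is the lossy substitution the paragraph above warns about: the game sees a value half as large as the one it computed.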

How can this be solved? I know of two solutions:

ijacquez commented 6 years ago

What would be the highest floating point value represented on the PS2?

andriii25 commented 6 years ago

More specifically, how many distinct values can a PS2 floating point number take? I was thinking about some kind of look-up table method to translate between values if it's possible. That could be reasonably fast I think.

PSI-Rockin commented 6 years ago

By my count, there are 16 million distinct values that are NaNs on x86 but valid numbers on the PS2.

I think the highest possible number is just under 2^129, or about 6.8e+38? That would be 0x7FFFFFFF.
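One way to check the largest value is to decode the bit pattern into a host double, which has enough range to hold it. This is a sketch under two assumptions that are not stated in the thread: exponent 0xFF is decoded like any other exponent, and patterns with a zero exponent field read as zero. The function name `ps2_to_double` is hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Decode a PS2 single-precision bit pattern into a host double.
// Assumptions: exponent 0xFF is an ordinary exponent (no Inf/NaN), and a
// zero exponent field reads as zero (no denormals).
double ps2_to_double(std::uint32_t bits) {
    const bool negative = (bits >> 31) != 0;
    const int exponent = static_cast<int>((bits >> 23) & 0xFFu);
    const std::uint32_t mantissa = bits & 0x007FFFFFu;

    if (exponent == 0)
        return negative ? -0.0 : 0.0;

    // 1.mantissa, with the 23-bit fraction scaled by 2^23 = 8388608.
    const double significand = 1.0 + static_cast<double>(mantissa) / 8388608.0;
    const double magnitude = std::ldexp(significand, exponent - 127);
    return negative ? -magnitude : magnitude;
}
```

Under these assumptions, `ps2_to_double(0x7FFFFFFF)` is exactly twice `ps2_to_double(0x7F7FFFFF)`, that is, twice the largest finite IEEE 754 single, or roughly 6.8e+38.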

tokumeiwokiboushimasu commented 6 years ago

> By my count, there are 16 million distinct values that are NaNs on x86 but valid numbers on the PS2.

This is too big for a lookup table. Is it possible to mask some bits and then use a lookup table?

PSI-Rockin commented 6 years ago

You could mask the sign bit (bit 31), but that still leaves you with 8 million values. Masking any of the remaining bits would result in a loss of accuracy, which brings us back to the same problem.
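The arithmetic behind those figures can be written out as follows. The constant names are hypothetical, and the 4-byte entry size is an assumption about what a table would store. Note that the counts cover every pattern with an all-ones exponent field, so they include the two infinities as well as the NaNs.

```cpp
#include <cstdint>

// Patterns whose exponent field is 0xFF: every combination of the 23
// mantissa bits, for each sign.
constexpr std::uint64_t kMantissaBits      = 23;
constexpr std::uint64_t kPatternsPerSign   = std::uint64_t{1} << kMantissaBits;  // 8,388,608
constexpr std::uint64_t kPatternsBothSigns = 2 * kPatternsPerSign;               // 16,777,216

// Assumed table layout: one 32-bit replacement value per pattern, after
// masking off the sign bit.
constexpr std::uint64_t kTableBytes = kPatternsPerSign * sizeof(std::uint32_t);  // 32 MiB
```

Even with the sign bit masked, a table of this size is far larger than typical CPU caches, so lookups would frequently miss.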

Nobbs66 commented 6 years ago

I would just do soft floats, or add a toggle to switch between your two options

ijacquez commented 6 years ago

@Nobbs66 The downside to this is having to juggle two implementations. Or maybe I misunderstood what you mean.

Would it make sense to just go down the path of emulating PS2's floats in software for compatibility, and down the road maybe a way to improve performance could arise?

Ravenslofty commented 6 years ago

I think it's worth chipping in with the approach I have in mind for ocps2, which is very much in its infancy, but I have pages of plans for how it'll work.

The code emitted for floating point uses hardware instructions but includes checks for infinities and NaNs. When a check fires, it bails out to edge-case code for that situation, and then inlines the edge-case code for next time.

This is slower than hard float, but faster than soft float.
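A rough sketch of the check-then-bail idea, written as plain C++ rather than emitted machine code. It is not ocps2's implementation: all names are hypothetical, the slow path is only a clamping placeholder standing in for real edge-case handling, and rounding-mode and denormal differences are ignored.

```cpp
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kExpMask = 0x7F800000u;

// True when the host FPU would see this pattern as Inf or NaN.
inline bool is_host_special(std::uint32_t bits) { return (bits & kExpMask) == kExpMask; }

inline float to_float(std::uint32_t bits) { float f; std::memcpy(&f, &bits, sizeof f); return f; }
inline std::uint32_t to_bits(float f) { std::uint32_t b; std::memcpy(&b, &f, sizeof b); return b; }

inline std::uint32_t clamp(std::uint32_t bits) {
    return is_host_special(bits) ? ((bits & 0x80000000u) | 0x7F7FFFFFu) : bits;
}

// Placeholder slow path (assumption): clamp inputs and output to +/-FLT_MAX.
// A real edge-case handler would reproduce PS2 behaviour instead.
std::uint32_t add_slow(std::uint32_t a, std::uint32_t b) {
    return clamp(to_bits(to_float(clamp(a)) + to_float(clamp(b))));
}

// Fast path: one hardware add guarded by cheap exponent checks.
std::uint32_t add_checked(std::uint32_t a, std::uint32_t b, bool& took_slow_path) {
    took_slow_path = false;
    if (!is_host_special(a) && !is_host_special(b)) {
        const std::uint32_t result = to_bits(to_float(a) + to_float(b));
        if (!is_host_special(result))
            return result;  // common case: nothing unusual happened
    }
    took_slow_path = true;  // a JIT could use this signal to emit careful code next time
    return add_slow(a, b);
}
```

In a JIT, the `took_slow_path` signal would correspond to the point where the block is recompiled with the edge-case code inlined.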

hch12907 commented 6 years ago

@ZirconiumX Wouldn't the checks slow down the emulator unless the CPU predicts them correctly?

Ravenslofty commented 6 years ago

They'll basically always be predicted correctly, because if they are ever true then the JIT will just emit more careful code.

unknownbrackets commented 5 years ago

In case it helps, I recall this was a concrete example of a game and algorithm that needed accuracy: https://github.com/unknownbrackets/ps2autotests/pull/4

(not sure if this project is referencing ps2autotests at all, which tbh I've been neglecting...)

Even if prediction is always the same, there's always a cost to having more instructions of course. I wonder what the performance tradeoff between doubles and checks would be. Also, if there's a performance-critical decryption/hashing algo, constantly falling back could be worse than doubles.

A more complex approach could be to emit simple single-precision code with checks and then, if a check fires, backpatch the emitted code to use doubles and conversions instead. This is similar to the backpatching Dolphin does for address loads.

Also, I don't know if some PS2 games use "safe" floats consistently. Another approach (which we use in PPSSPP) is to assume the simpler option, and then if something is detected that invalidates this assumption, invalidate all funcs in jit and rebuild with the new assumption. PPSSPP uses this for rounding mode, for example (many games never set a different one than the default.)

-[Unknown]
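The "assume the simpler option, then invalidate and rebuild" approach from the comment above might look roughly like this. It is a sketch, not PPSSPP's code: `JitCache`, `FloatMode` and `Block` are hypothetical names, and a real block would hold emitted machine code rather than a single flag.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

enum class FloatMode { Fast, Careful };

// Stand-in for a compiled block; only records the assumption it was built with.
struct Block {
    FloatMode compiled_with;
};

class JitCache {
public:
    // Return the block for this guest address, compiling it under the
    // current assumption if it is not cached yet.
    const Block& lookup_or_compile(std::uint32_t pc) {
        auto it = blocks_.find(pc);
        if (it == blocks_.end())
            it = blocks_.emplace(pc, Block{mode_}).first;
        return it->second;
    }

    // Called when emitted code detects a value the fast assumption cannot
    // handle: switch assumption and throw away everything built under it.
    void on_check_failed() {
        if (mode_ == FloatMode::Careful)
            return;  // already rebuilt once; nothing more to invalidate
        mode_ = FloatMode::Careful;
        blocks_.clear();
    }

    std::size_t size() const { return blocks_.size(); }
    FloatMode mode() const { return mode_; }

private:
    FloatMode mode_ = FloatMode::Fast;
    std::unordered_map<std::uint32_t, Block> blocks_;
};
```

The design trades a one-time recompilation cost for check-free code in games that never leave the "safe" range.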

weirdbeardgame commented 5 years ago

I think the biggest issue is how much clamping we can avoid across platforms. In Android's case, for example, are you willing to maintain a separate floating-point implementation that can cope with the lack of power? A PC, by contrast, could use soft floats or the switch to doubles you mentioned.

unknownbrackets commented 5 years ago

AArch64 generally supports double: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch04s06s01.html

But indeed, armv7 (older phones) does not support double-precision. For reference, the Raspberry Pi 3 has a 64-bit processor but a 32-bit OS, so that's probably the newest device that wouldn't support doubles, I suppose.

-[Unknown]

weirdbeardgame commented 5 years ago

It's not an issue with the device supporting it per se; it's whether or not there's enough power in said device to handle that much accuracy from the PS2.

As I see it, we have two routes. We make one floating-point implementation that makes a sacrifice somewhere, either in accuracy or in device support.

Or we make the obviously correct choice and write separate per-platform implementations that those devices can handle.

unknownbrackets commented 5 years ago

Well, since the proper range is used for logic (i.e., Radiata Stories simply won't run without correct float handling), not sure that's a great option. I wouldn't recommend going down a path that intentionally means significant differences in compatibility between platforms...

-[Unknown]

PSI-Rockin commented 5 years ago

So, I think I figured out the best solution: instruction-level toggling between "hard" and "soft" floats.

Clamping/rounding bugs are caused by subtle differences between the expected and actual output. I believe, because of this, it is safe to assume that only a few instructions in any given game are responsible for the bugs.

A pure softfloat approach is going to be incredibly slow, too slow even. But if only a couple of instructions use softfloats in a game, the speed impact might be negligible. This also allows us to support games that have bugs on all clamping modes (Gran Turismo 4, Destruction Derby Arena) and even games that need exact floating-point accuracy (Tri-Ace games).

There are a couple of problems with this idea. One, it will take a lot of time to produce softfloat equivalents for every instruction, as there are dozens of instructions. Two, figuring out which instructions need softfloat for every game is also time-consuming. I don't have a good solution for the second problem, but perhaps it could be possible to determine the most commonly bugged instructions and work from there.
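Instruction-level toggling could be represented as a small table that the interpreter or JIT consults per opcode. This is only a sketch of the idea: the names are hypothetical, and the four opcodes listed are a placeholder subset of the FPU instruction set.

```cpp
#include <bitset>
#include <cstddef>

// Placeholder subset of FPU operations; Count must stay last.
enum class FpuOp : std::size_t { Add, Sub, Mul, Div, Count };

enum class Path { Hard, Soft };

// One bit per instruction: set means "use the softfloat implementation".
class FloatModeTable {
public:
    void use_soft(FpuOp op) { soft_.set(static_cast<std::size_t>(op)); }
    void use_hard(FpuOp op) { soft_.reset(static_cast<std::size_t>(op)); }
    bool is_soft(FpuOp op) const { return soft_.test(static_cast<std::size_t>(op)); }

private:
    std::bitset<static_cast<std::size_t>(FpuOp::Count)> soft_;  // all hard by default
};

// Decide which implementation handles this instruction.
inline Path select_path(const FloatModeTable& table, FpuOp op) {
    return table.is_soft(op) ? Path::Soft : Path::Hard;
}
```

Because the table defaults to the hard path, only the instructions explicitly flagged for a game pay the softfloat cost.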

Degerz commented 5 years ago

Is it possible to profile by game ID and load game-specific settings, while letting the community figure out which instructions need soft floats to create those settings? I think this would be the most flexible approach, as it would give us both speed and accuracy where necessary. Or is this approach too infeasible for the community to handle?

How many floating-point instructions are there for the FPU and the VUs separately?

PSI-Rockin commented 5 years ago

It's entirely possible for the community to create profiles for each game. DobieStation uses the equivalent of full EE and VU clamping on PCSX2, and we can use PCSX2's GameDB as a reference for games that need different clamping modes.
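A community-maintained profile could be as simple as a map from game ID to the set of instructions that need softfloat. The sketch below is illustrative only: `ProfileDb` is a hypothetical name, the IDs and mnemonics passed in would come from whatever profile format is chosen, and nothing here reflects PCSX2's GameDB format.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Instruction mnemonics that should use the softfloat path for one game.
using SoftOps = std::unordered_set<std::string>;

class ProfileDb {
public:
    // Register or replace the profile for a game ID.
    void add(const std::string& game_id, SoftOps ops) {
        profiles_[game_id] = std::move(ops);
    }

    // Games without a profile fall back to hard floats for everything.
    bool needs_soft(const std::string& game_id, const std::string& mnemonic) const {
        const auto it = profiles_.find(game_id);
        return it != profiles_.end() && it->second.count(mnemonic) != 0;
    }

private:
    std::unordered_map<std::string, SoftOps> profiles_;
};
```

A lookup like `needs_soft(id, "MUL.S")` would then be done once at compile time for each block, not on every executed instruction.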

The FPU has around 20 or so instructions, and the VUs have a much higher number - close to a hundred, possibly more. It would be easy to just have a menu for all FPU instructions, but the VUs will need a better solution, as I don't expect the average person to test dozens of options for their game.

Degerz commented 5 years ago

How many of the VU instructions do you wager would need a soft float implementation, so that we can reduce that count?

Or is it possible to create groups of VU instructions by the frequency of their usage?
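Grouping by frequency would start from a usage count per instruction. A minimal sketch, assuming a trace of executed mnemonics is available from the emulator (the trace source and the function name `rank_by_frequency` are hypothetical):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Ranked = std::vector<std::pair<std::string, std::size_t>>;

// Count how often each mnemonic appears in the trace and return the
// mnemonics ordered from most to least frequent (ties broken by name).
Ranked rank_by_frequency(const std::vector<std::string>& trace) {
    std::unordered_map<std::string, std::size_t> counts;
    for (const auto& mnemonic : trace)
        ++counts[mnemonic];

    Ranked ranked(counts.begin(), counts.end());
    std::sort(ranked.begin(), ranked.end(), [](const auto& a, const auto& b) {
        if (a.second != b.second)
            return a.second > b.second;
        return a.first < b.first;
    });
    return ranked;
}
```

The ranked list could then be cut into groups, for example the most frequent instructions first, so that testers toggle a handful of groups instead of dozens of individual options.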

ghost commented 4 years ago

This is more or less the Sony Netemu method. In the config file you can choose which instructions need a more accurate float approach. It of course makes the game slower when said instructions are used, but on PC it should not be a problem.

seta-san commented 1 year ago

> By my count, there are 16 million distinct values that are NaNs on x86 but valid numbers on the PS2.

> This is too big for a lookup table. Is it possible to mask some bits and then use a lookup table?

Yes. But I really wonder how many of them games actually run up against. Could the list be whittled down to a set of essentials needed to make certain games work? Maybe even keeping these lookup tables on a per-game basis?