joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0

SSE2 and beyond emulation #4185

Open Torinde opened 1 year ago

Torinde commented 1 year ago

Is your feature request related to a problem? Please describe.

Running Win9x/XP software that uses SSE2/3/SSSE3 instructions.

What you want

Add support for SSE2 and beyond instructions

Describe alternatives you've considered

No response

Additional information

Additionally, now that Win2K/XP are becoming more and more feasible to run, maybe SSE2/3 can be added as there are many programs that use those, for example:

Initial issue before EDIT: It seems the Experimental CPU type (for FISTTP) is built on top of Pentium II (MMX) instead of Pentium III (SSE). If so, please add SSE (and the other Pentium III CPU type features) to the Experimental CPU type.


Torinde commented 1 year ago

For the SSE2/3:

Alternatively, on the 3DNow! side:

If you want to avoid NX-bit, then:

I think most of those also have PAE/PSE, which allow more than 4GB of RAM on Win2000/2003.

joncampbell123 commented 1 year ago

The design inherited from DOSBox SVN prevents DOSBox-X from ever emulating more than 4GB of RAM, and DOSBox-X does not intend to emulate more than 4GB of RAM anyway. I'm not sure the design inherited from SVN would allow the longer page table length and size required for PSE to work, and therefore I don't think NX emulation is going to happen. PAE might be possible though.

I do think 3DNow! would be nice to emulate since that is well within the time frame of the DOS to WinXP era.

SSE2 is likely as far as DOSBox-X is going to go in that instruction set family, since SSE3 and up are associated with much later systems that lack the ISA bus and date from after the mid-2000s.

Torinde commented 1 year ago

I sidetracked my own topic by bringing in SSE2/3. :)

The main purpose of this issue was to point out that, because SSE/P3 was added to DOSBox-X after FISTTP/Experimental, the Experimental CPU type lacks SSE, so I think that should be corrected.

Torinde commented 1 year ago

Agree with doing 3DNow! first, then SSE2 (both are used in DOS/Win9x software).

Arguments FOR going beyond to: SSE3

SSSE3

joncampbell123 commented 1 year ago

The experimental CPU type should absolutely support SSE. There is a reason the CPU_ARCHTYPE_EXPERIMENTAL constant is 0xFF, the highest possible value: code in the normal core is written to consider SSE if CPU_ArchitectureType >= CPU_ARCHTYPE_PENTIUMIII. CPUID emulation also reports a Pentium III for the experimental type.
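A minimal sketch of why the highest value matters, assuming feature gates are written as ordered comparisons. The names below are hypothetical stand-ins, not the actual DOSBox-X definitions; only the 0xFF value comes from the comment above, and the Pentium II/III numbers are invented so that the ordering holds.

```cpp
#include <cstdint>

// Hypothetical architecture-type constants. Only 0xFF for the experimental
// type is stated in the thread; the other values are assumptions chosen to
// preserve the ordering Pentium II < Pentium III < experimental.
constexpr std::uint32_t ARCH_PENTIUM_II   = 0x65; // assumed value
constexpr std::uint32_t ARCH_PENTIUM_III  = 0x68; // assumed value
constexpr std::uint32_t ARCH_EXPERIMENTAL = 0xFF; // value stated in the thread

// A feature gate written as a ">=" comparison: any type ordered at or above
// Pentium III passes, including the experimental type.
constexpr bool sse_enabled(std::uint32_t arch_type) {
    return arch_type >= ARCH_PENTIUM_III;
}

static_assert(!sse_enabled(ARCH_PENTIUM_II),  "Pentium II is ordered below the SSE gate");
static_assert(sse_enabled(ARCH_PENTIUM_III),  "Pentium III is the gate itself");
static_assert(sse_enabled(ARCH_EXPERIMENTAL), "0xFF passes every >= gate");
```

With this layout, any later gate of the same form (for example a hypothetical SSE2 level above Pentium III) would be picked up by the experimental type without further changes.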

Can you show me what combination of software and cputype=experimental is failing to use SSE?

Torinde commented 1 year ago

Hmm... now that I tested again - it seems Experimental has SSE, so it's my mistake. Sorry!

I'm testing with a small program "cpu.exe" and here is its output: [image]

and with another one, "sse.com" from "simdtest.zip" (which also has MMX and SSE3 tests): [image]
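For reference, detection tools of this kind presumably read the feature flags from CPUID leaf 1 (that is an assumption about cpu.exe, not something verified here). A hedged sketch of decoding those flags is below; the bit positions are the ones documented by Intel and AMD, while the function name and the sample register values are made up for illustration. Register values are passed in rather than executing CPUID, so the snippet compiles on any host.

```cpp
#include <cstdint>

// Feature bits returned by CPUID with EAX=1.
constexpr std::uint32_t EDX_MMX   = 1u << 23;
constexpr std::uint32_t EDX_SSE   = 1u << 25;
constexpr std::uint32_t EDX_SSE2  = 1u << 26;
constexpr std::uint32_t ECX_SSE3  = 1u << 0;
constexpr std::uint32_t ECX_SSSE3 = 1u << 9;

// True if every bit in `mask` is set in the given register value.
constexpr bool has_flag(std::uint32_t reg, std::uint32_t mask) {
    return (reg & mask) == mask;
}

// Made-up register contents for a Pentium III-like CPU: MMX and SSE only.
constexpr std::uint32_t sample_edx = EDX_MMX | EDX_SSE;
constexpr std::uint32_t sample_ecx = 0;

static_assert(has_flag(sample_edx, EDX_MMX),    "MMX reported");
static_assert(has_flag(sample_edx, EDX_SSE),    "SSE reported");
static_assert(!has_flag(sample_edx, EDX_SSE2),  "SSE2 not reported");
static_assert(!has_flag(sample_ecx, ECX_SSE3),  "SSE3 not reported");
static_assert(!has_flag(sample_ecx, ECX_SSSE3), "SSSE3 not reported");
```

Software that requires SSE2 or later would check the corresponding bit the same way, and would typically refuse to run or fall back when it is clear.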

I'll change the issue label to SSE2+

Torinde commented 1 year ago

JHRobotics/mesa9x, part of their Win9x 3D-accelerated driver set, can use SSE3, SSE4.2, AVX, or AVX2 to get better performance. Also, depending on how you interpret the description, maybe SSE3 is required for Win98 and only Win95 can work on a Pentium II. Newer Mesa versions may also require later instruction sets? JHRobotics/simd95 is a "Simple hack for enabling SSE/AVX instructions on DOS and Windows 95/98".

Torinde commented 1 year ago

> I'm not sure the design inherited from SVN would allow the longer page table length and size required for PSE to work, and therefore I don't think NX emulation is going to happen. PAE might be possible though.

Per Wikipedia, NX depends on PAE, not PSE. From the NX bit article: "It is only available with the long mode (64-bit mode) or legacy Physical Address Extension (PAE) page-table formats, but not x86's original 32-bit page table format because page table entries in that format lack the 64th bit used to disable and enable execution."
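A small sketch of that point, assuming nothing about how DOSBox-X represents page table entries: the type aliases and function names below are hypothetical. In the PAE and long mode formats an entry is 64 bits wide and bit 63 is the no-execute flag, whereas a classic 32-bit entry has no such position at all.

```cpp
#include <cstdint>

using LegacyPte = std::uint32_t; // original 32-bit paging entry
using PaePte    = std::uint64_t; // PAE / long mode paging entry

constexpr PaePte PTE_PRESENT = 1ull << 0;  // bit 0: page present
constexpr PaePte PTE_NX      = 1ull << 63; // bit 63: no-execute

// True if the entry marks its page as non-executable. Real hardware only
// honors this bit when EFER.NXE is set; otherwise bit 63 is reserved.
constexpr bool is_no_execute(PaePte entry) {
    return (entry & PTE_NX) != 0;
}

// The legacy format simply has no 64th bit to hold the flag.
static_assert(sizeof(LegacyPte) * 8 == 32, "legacy entries are 32 bits wide");
static_assert(sizeof(PaePte) * 8 == 64,    "PAE entries are 64 bits wide");
static_assert(is_no_execute(PTE_PRESENT | PTE_NX), "NX set");
static_assert(!is_no_execute(PTE_PRESENT),         "NX clear");
```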

Torinde commented 5 months ago

It seems a modern desktop CPU has sufficient performance for DOSBox to emulate a 1.5GHz P4.