joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0

3DNow!+Pro emulation #3217

Open Torinde opened 2 years ago

Torinde commented 2 years ago

Is your feature request related to a problem? Please describe.

Running many games and software in 3DNow! mode

What you want

Adding emulation of 3DNow!+ instructions to the Pentium III type (#3179), resulting in Athlon XP type with:

Adding emulation of the last 3DNow! instructions as Experimental type (in addition to Athlon XP, FISTTP, Pentium II SYSENTER/SYSEXIT):

Adding Athlon XP 1GHz 1200+ (or the less catchily named Mobile Athlon 4 850 MHz) cycles to the "Emulate CPU speed" list.

Additionally, for a gradual rise in the cycle requirements, add:

Hackipedia.org 3DNow! https://www.ardent-tool.com/CPU/Docs_AMD.html

Attached is a table with CPU types, instruction sets, launch dates, and cycles from the DOSBox-X "Emulate CPU speed" list menu and the DOSBox wiki. The DOSBox wiki goes up to Pentium III 500MHz, so it would be nice for DOSBox-X to give users some guidance on what host CPU is needed to emulate an Athlon XP (example of benchmark procedure). A hypervisor-type core (#1089) and other performance optimizations (#1184) will help.

image CPU instructions Win9x era.xlsx

Describe alternatives you've considered

PCem, Bochs, QEMU

Additional context

Of course, this raises the question of where emulation aspirations should stop (#3196). The DOSBox-X websites mention as aims Win9x/Me, pre-2000, pre-WinXP, ISA slot support and maybe others. Windows Millennium was launched 2000-09-14 and had official support until 2006-07-11. Windows XP was launched 2001-10-25. Windows Vista/2008 server are the latest to support ISA slots.1

3DNow!, Cyrix EMMI 1,2,3,4 (any software that uses EMMI?),5 and NEC Vxx 1,2,3 (those CPUs were used in NEC PC-98) are clearly included in the preceding era.

Athlon XP is a very good next target, as it adds 3DNow! on top of the current SSE implementation, which is relevant for a lot of games in the DOS/Win9x era.

Pentium 4 (SSE2) was also launched a year before WinXP. SSE2 is required for later versions of various internet browsers 1,2.

Pentium 4 Prescott adds SSE3: relatively few instructions, one of which is already supported in Experimental (FISTTP, #2526); it also helps with internet browsers 1.

Athlon 64 adds SSE2/x86-64. There is work on 64-bit DOS extenders 1,2,3 #3269 (is there any possibility of using x86-64 under Win9x?).

The latest models of both (Pentium 4 662, Athlon 64 AM2, etc.) support SSE3 and x86-64 simultaneously.

Finally, Core 2 adds SSSE3 (also available in Atom without x86-64: N270/N280/some Z and E models). It was launched in the final month of WinMe official support and adds relatively few instructions, notably the last MMX instructions. There are industrial PCs supporting Core 2 and ISA slots (B65, Q35, 945GC). There is also Vortex86EX2 (32-bit CPU with SSSE3 and ISA slots, launched in 2018 and available at least until 2028).

An eventual update of the Experimental type with SSSE3 + 3DNow! (a non-existing combination) would be interesting because it purportedly alleviates register pressure, although the pairing with lesser SSE implementations can also be seen in Athlon XP/64.

Software requiring SSSE3 (last MMX, last instructions added in 32-bit only CPUs): 32-bit OBS, OEL7.1, Skype for Linux, EDIUS (Win7 x64), other?

The overall aim for emulated instructions (beyond the scope of this enhancement suggestion) can be:

Obviously from the second bullet onwards the use-case is for new DOS/Win9x development and completeness.

https://en.wikipedia.org/wiki/X86_instruction_listings - also describes various undocumented and single-model-specific instructions (e.g. in some 80387 variants).

Torinde commented 2 years ago

I searched the PCem repository and there are a lot of mentions of the individual instructions and "3DNow". K6-2+ is listed, so at least 3DNow!+ should be there.

I wanted to point more specifically to which code needs to be adopted. Will that help?

3DNOW

SYSCALL and SYSENTER SYSRET - nothing, although there is SYSEXIT

FEMMS PAVGUSB PF2ID PFACC PFADD PFCMPEQ PFCMPGE PFCMPGT PFMAX PFMIN PFMUL PFRCP PFRCPIT1 PFRCPIT2 PFRSQIT1 PFRSQRT PFSUB PFSUBR PI2FD PMULHRW PREFETCH PREFETCHW
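For anyone looking at how these would slot into a decoder: most of the instructions listed above share one encoding, `0F 0F` followed by ModRM (plus any SIB/displacement) and then a one-byte suffix that selects the operation. Below is a minimal sketch of the suffix lookup, not code from PCem or DOSBox-X. The `Op3DNow` enum and `decode_3dnow_suffix` are hypothetical names, and only a few suffixes are shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operation tags; only a handful of base 3DNow! ops are listed.
enum class Op3DNow { Unknown, PI2FD, PF2ID, PFSUB, PFADD, PFMUL, PAVGUSB };

// Map the trailing suffix byte of a "0F 0F modrm ... suffix" encoding
// to an operation. Anything not in this partial table is Unknown.
constexpr Op3DNow decode_3dnow_suffix(std::uint8_t suffix) {
    switch (suffix) {
        case 0x0D: return Op3DNow::PI2FD;
        case 0x1D: return Op3DNow::PF2ID;
        case 0x9A: return Op3DNow::PFSUB;
        case 0x9E: return Op3DNow::PFADD;
        case 0xB4: return Op3DNow::PFMUL;
        case 0xBF: return Op3DNow::PAVGUSB;
        default:   return Op3DNow::Unknown;
    }
}
```

Because the suffix comes after the ModRM operand bytes, a real decoder has to finish fetching the operand before it knows which operation to run. FEMMS and PREFETCH/PREFETCHW do not use this form and would need separate handling.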

K6-2+ (K6-2P) is listed in cpu.h and readme, but I don't find the 3DNow!+ instructions: PF2IW, PFNACC, PFPNACC, PI2FW, PSWAPD - nothing. Geode GX/LX aren't listed: PFRSQRTV, PFRCPV - nothing.

Cyrix 6x86MX is listed in cpu.h and readme, but the EMMI and Cyrix-specific x87 instructions don't appear: PAVEB, PADDSIW, PMAGW, PDISTIB, PSUBSIW, PMVZB, PMVNZB, PMVLZB, PMVGEZB, PMULHRIW, PMACHRIW - nothing. PMULHRW appears, but that's probably code for the 3DNow! instruction with the same mnemonic. Reading the 3DNow! manual and Cyrix application note 108, it seems the calculation is quite similar, but the opcodes are different?

FTSTP, FRINT2, FRICHOP, FRINEAR - nothing.
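To give an idea of how small three of the missing 3DNow!+ operations are, here is a rough sketch of PSWAPD, PFNACC and PFPNACC on a pair of floats. This is an illustration under assumptions, not PCem or DOSBox-X code: `Vec2f` is a hypothetical stand-in for a 64-bit MMX register with lane 0 as the low dword, and host `float` arithmetic is assumed to be close enough to the real hardware behaviour.

```cpp
#include <array>
#include <cassert>

// Hypothetical view of a 64-bit MMX register as two floats (lane 0 = low dword).
using Vec2f = std::array<float, 2>;

// PSWAPD: swap the two dwords of the source.
Vec2f pswapd(const Vec2f& src) { return {src[1], src[0]}; }

// PFNACC: negative accumulate. Low result comes from the destination pair,
// high result from the source pair.
Vec2f pfnacc(const Vec2f& dst, const Vec2f& src) {
    return {dst[0] - dst[1], src[0] - src[1]};
}

// PFPNACC: mixed accumulate. Subtract within the destination pair,
// add within the source pair.
Vec2f pfpnacc(const Vec2f& dst, const Vec2f& src) {
    return {dst[0] - dst[1], src[0] + src[1]};
}
```

PF2IW and PI2FW are the 16-bit variants of the existing float/integer conversions, so they would mostly reuse whatever PF2ID and PI2FD already do.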

Torinde commented 2 years ago

From QEMU patches

Torinde commented 2 years ago

From Bochs: 3dnow.cc Notes:

SSE2, SSE3, SSSE3 (and more modern ones) are also supported

joncampbell123 commented 2 years ago

It's not happening right away, but I updated the MMX support code so that the 64-bit register can hold two 32-bit floats as preparation for future 3DNow! emulation.
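As a rough illustration of the idea (not the actual DOSBox-X register type), a 64-bit MMX register can be read and written as two 32-bit float lanes, after which an instruction like PFADD is a per-lane add. `MmxReg`, `lane_get`, `lane_set` and `pfadd` are hypothetical names, and the sketch assumes the host `float` is IEEE 754 single precision.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one 64-bit MMX register.
struct MmxReg { std::uint64_t q; };

// Read lane 0 (low dword) or lane 1 (high dword) as a float.
// memcpy is used to reinterpret the bits without aliasing problems.
float lane_get(const MmxReg& r, int lane) {
    const std::uint32_t bits = static_cast<std::uint32_t>(r.q >> (32 * lane));
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Overwrite one lane with the bit pattern of a float, keeping the other lane.
void lane_set(MmxReg& r, int lane, float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    const int shift = 32 * lane;
    r.q = (r.q & ~(0xFFFFFFFFull << shift)) |
          (static_cast<std::uint64_t>(bits) << shift);
}

// PFADD sketch: add the two float lanes of src into dst.
void pfadd(MmxReg& dst, const MmxReg& src) {
    for (int lane = 0; lane < 2; ++lane)
        lane_set(dst, lane, lane_get(dst, lane) + lane_get(src, lane));
}
```

Real 3DNow! hardware differs from plain host float arithmetic in details such as rounding and the handling of denormals and NaNs, so an accurate implementation would need more than this.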

Ref: https://softpixel.com/~cwright/programming/simd/3dn.php

Torinde commented 2 years ago

Great to hear! From 86box: SYSCALL, SYSRET

fuel-pcbox commented 2 years ago

From what I can tell, it's simply impossible to use AVX unless you're in 64-bit mode, as it reuses a single-byte opcode as a prefix for VEX-encoded instructions.

Torinde commented 2 years ago

From what I can tell, it's simply impossible to use AVX unless you're in 64-bit mode, as it reuses a single-byte opcode as a prefix for VEX-encoded instructions.

No experience myself, but what I found is in my first comment above:

fuel-pcbox commented 2 years ago

That first thread is literally only showing that it's possible to save and restore the state of AVX registers under DOS. That doesn't mean it's possible to use AVX under DOS. lol

Torinde commented 2 years ago

Intel AVX instructions will be available on both 32bit and 64bit flavors of Processor and OS:

AVX will require OS to have 256-bit YMM state support. Once the OS adds the necessary system level support, it can decide to create 64-bit distribution and 32-bit distributions. A 32-bit OS distribution with the required state support will allow software to use AVX.

So, combining this statement with the above thread on state save/restore - it seems you can use AVX in DOS without 64-bit mode?
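The "OS support" part of that statement is something software can check at run time, whatever the bitness: CPUID leaf 1 reports AVX (ECX bit 28) and OSXSAVE (ECX bit 27), and XCR0 bits 1 and 2 show whether XMM and YMM state saving has been enabled. A small sketch of that check follows. `avx_usable` is a hypothetical helper that takes the raw register values as parameters, so it does not execute CPUID or XGETBV itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decide whether AVX may be used, given the ECX output
// of CPUID leaf 1 and the value of XCR0 (as read with XGETBV, ECX = 0).
constexpr bool avx_usable(std::uint32_t cpuid1_ecx, std::uint64_t xcr0) {
    constexpr std::uint32_t kOsxsave = 1u << 27;  // XSAVE/XGETBV enabled by the OS
    constexpr std::uint32_t kAvx     = 1u << 28;  // CPU implements AVX
    constexpr std::uint64_t kXmmYmm  = 0x6;       // XCR0 bits 1 (XMM) and 2 (YMM)
    return (cpuid1_ecx & kOsxsave) != 0 &&
           (cpuid1_ecx & kAvx) != 0 &&
           (xcr0 & kXmmYmm) == kXmmYmm;
}
```

A caller should only execute XGETBV after seeing OSXSAVE set, since the instruction faults otherwise. Under plain DOS nothing sets those bits by default, so some ring 0 code would have to enable them first.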

fuel-pcbox commented 2 years ago

Weird, because AVX specifically requires the VEX encoding, and that's impossible to use in 32-bit mode.

Torinde commented 2 years ago

Per Wikipedia you can use VEX in 32-bit mode with the following restrictions:

In 32-bit mode VEX encoded instructions can only access the first 8 YMM/XMM registers; the encodings for the other registers would be interpreted as the legacy LDS and LES instructions that are not supported in 64-bit mode. The VEX prefix's initial-byte values, 0xC4 and 0xC5, are the same as the opcodes of the LDS and LES instructions. Not supported in 64-bit mode, the ambiguity is resolved in 32-bit mode by exploiting the fact that a legal LDS or LES's ModRM byte can not specify a register operand; i.e., be of the form 11xxxxxx. Various bit-fields in the VEX prefix's second byte are inverted to ensure that the byte is always of this form. Similarly, the REX prefix's one-byte form has the four high-order bits set to four, which replaces sixteen opcodes numbered 0x40–0x4F. Previously, those opcodes were individual INC and DEC instructions for the eight standard processor registers; x86-64 code must use ModR/M INC and DEC instructions. [2]
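The disambiguation rule in that quote can be written down in a few lines. This is only a sketch of the rule as described above, with hypothetical names (`C4C5`, `classify_c4_c5`), not decoder code from any emulator.

```cpp
#include <cassert>
#include <cstdint>

// How a leading 0xC4/0xC5 byte should be interpreted.
enum class C4C5 { NotApplicable, LegacyLdsLes, VexPrefix };

// first:     the byte at the opcode position
// second:    the byte that follows it (ModRM for LDS/LES, VEX payload otherwise)
// long_mode: true when decoding 64-bit code, where LDS/LES do not exist
constexpr C4C5 classify_c4_c5(std::uint8_t first, std::uint8_t second,
                              bool long_mode) {
    if (first != 0xC4 && first != 0xC5) return C4C5::NotApplicable;
    if (long_mode) return C4C5::VexPrefix;
    // Outside 64-bit mode, LDS/LES need a memory operand, so a ModRM byte of
    // the form 11xxxxxx can never belong to them; that pattern marks VEX.
    return ((second & 0xC0) == 0xC0) ? C4C5::VexPrefix : C4C5::LegacyLdsLes;
}
```

For example, `C5 F8 77` (VZEROUPPER) has a second byte of `F8`, so it classifies as VEX in 32-bit code, while `C4 06` has `mod = 00` and stays a legacy LES with a memory operand.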

Can you test that? What about AVX-512/EVEX? Here is shown that AVX-512 can be used in 32-bit code.

Torinde commented 2 years ago

Some 32-bit only CPUs with relatively low performance (thus realistic to emulate), but supporting advanced instruction sets:

From AMD the slowest I found:

Along with PAE/PSE, those will also bring closer the possibility of running Win8/10 and Win7 May 2018 KB4103718 as guests (i.e. all 32-bit Windows).

also, if that helps: most of the above are derivatives of the already supported P6: Pentium M, A100, Core Solo - just with:

PCBox/PCBox/issues/41

Torinde commented 1 year ago

For the NEC Vxx emulation: it seems 86box 3.11 adds support for its 8080 mode. I'm not sure if that's already part of DOSBox-X PC-98 mode or not.

Torinde commented 1 year ago

@finalpatch, as I see you're familiar with FPU emulation, I'm just wondering if you'd be willing to make a PR for 3DNow!+? If that helps: 86box recently added the missing 3DNow!+ instructions (and now supports all except the two "Pro" instructions from Geode), and there is also a QEMU 3Dnow.diff file link above.

Torinde commented 1 year ago

@qeeg, JHRobotics/simd95 is a "Simple hack for enabling SSE/AVX instructions on DOS and Windows 95/98"

Torinde commented 7 months ago

FEX-Emu (x86 emulator that can be used by Wine) supports all 3DNow!, including Extended and the Geode specific instructions.