joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0

Games glitches and emulated system performance in Windows 95 #2198

Open dodleh opened 3 years ago

dodleh commented 3 years ago

I have successfully set up a Windows 95 OSR2 installation with 3dfx support, using the Voodoo 1 driver kit (May 1999 Voodoo Graphics Driver Kit, version 3.01.00) with Voodoo emulation active, on the MinGW build of DOSBox-X.

First, some simple issues. I found that the Windows 95 guide is slightly wrong for some settings pertaining to performance: https://dosbox-x.com/wiki/Guide%3AInstalling-Windows-95

When using core=normal and cputype=pentium_mmx, the resulting system is very, very slow (roughly a 70 MHz Intel Pentium, emulated on a 2.40 GHz Core 2 Duo host). When using core=auto and cputype=auto, the resulting system is very fast: the guest reports a virtual CPU of about 1000 MHz (ahem! :), which translates to around a 166 MHz Intel Pentium. The speed difference is noticeable, and compatibility does not seem to suffer much. As a side note, using aspect=true and direct3d as the renderer, with scaler=none, improves the maximum emulated speed and the responsiveness (less lag) by around 100% in my case. I suggest incorporating these settings into the wiki guide.

Also, I think that 60000 cycles/ms is quite a low target value for an emulated CPU (about a 50 MHz Intel Pentium equivalent). A good guide should include values for at least some popular CPU types (Intel Pentium MMX at 166/133/100/75 MHz). Of course, the sky is the limit, but most users will find out what they can emulate, and a 166 MHz CPU covers a decent share of mid to late 90s games.
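To make the clock-to-cycles comparison above concrete, here is a rough Python sketch. It assumes that one DOSBox-X cycle corresponds to roughly one emulated instruction and that a Pentium retires about 1.2 instructions per clock. That ratio is only back-derived from the "60000 cycles/ms is about a 50 MHz Pentium" figure above; it is not a measured or documented value, and the function name and constant are hypothetical.

```python
# Hypothetical helper: estimate a DOSBox-X "cycles" value (cycles per
# millisecond) for a target Pentium clock speed.
# ASSUMPTION: ~1.2 emulated instructions per clock, back-derived from the
# "60000 cycles/ms is about a 50 MHz Pentium" figure in the post above.
ASSUMED_INSTRUCTIONS_PER_CLOCK = 1.2


def estimate_cycles_per_ms(target_mhz, ipc=ASSUMED_INSTRUCTIONS_PER_CLOCK):
    """Return an approximate cycles/ms value for a target clock in MHz."""
    clocks_per_ms = target_mhz * 1000  # 1 MHz = 1000 clock ticks per ms
    return round(clocks_per_ms * ipc)


if __name__ == "__main__":
    # Clock speeds suggested in the post as useful guide entries.
    for mhz in (75, 100, 133, 166):
        print(f"Pentium {mhz} MHz -> cycles={estimate_cycles_per_ms(mhz)}")
```

The numbers this prints are only starting points for a fixed cycles= setting; whether the host can actually sustain them is a separate question.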

Now the glitches in Windows games. I noticed that Ignition works quite reasonably, while Need for Speed 2 SE crashes a short while after entering a race and returns to the Windows desktop. I am not sure what happens with NFS2. In Glide mode Ignition works, albeit slowly, while NFS2 runs slowly, shows heavy artifacts in geometry/texture handling, and then freezes suddenly, taking the whole system with it.

Now, some remarks. It seems that other PC emulators (PCem, 86Box) can use threading to help with 3dfx emulation. As far as I can tell, it does not work well, since using more than one thread increases the risk of an erratic framerate due to thread synchronization issues and variable scene rendering time. The DOSBox Voodoo emulation is potentially more accurate (texture- and geometry-wise) but, as far as I can tell, very slow. I do not know if something could be done in DOSBox to speed things up.

rderooy commented 3 years ago

cputype=auto is effectively a 486, which is fine for early Win9x games, but later games require P5 or P5MMX.

In any case, there is work ongoing to improve the dynamic core and mmx emulation. Hopefully this can be merged before the next release at the end of the month. In which case there will be a big improvement in usability of the Win9x emulation.

koolkdev commented 3 years ago

The reason that "core=auto" is much faster is that it switches to the dynamic core in protected mode. While it is faster, it isn't very stable with Windows 95/98; there are a lot of crashes and blue screens. I am working on a patch to improve its stability (and along the way this patch improves MMX performance). You can try the experimental build here: https://github.com/joncampbell123/dosbox-x/pull/2182#issuecomment-761551745

But this branch doesn't fix any compatibility issue that was already present in the normal core (for example, it doesn't fix the crash in NFS2).
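The core=auto behaviour described above can be pictured with a small Python sketch. This is a simplified illustration and not DOSBox-X source code; the function name is hypothetical, and it only assumes what the comment states, namely that "auto" uses the dynamic core once the guest is in protected mode.

```python
# Simplified illustration only; this is not DOSBox-X source code.
# ASSUMPTION: "auto" means the normal core in real mode and the dynamic
# core once the guest enters protected mode, as described above.

def effective_core(core_setting, protected_mode):
    """Return which CPU core would be in use for a setting and CPU mode."""
    if core_setting == "auto":
        return "dynamic" if protected_mode else "normal"
    # Any explicit setting (for example "normal") is used as-is.
    return core_setting
```

Since Windows 95 runs in protected mode, a guest started with core=auto spends practically all of its time on the dynamic core, which is where both the speed gain and the stability problems come from.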

dodleh commented 3 years ago

Thank you both, I appreciate the details. Yes, I know that MMX instructions may be required for some games; that is entirely true. It is a difficult balance to choose a CPU architecture to emulate. The only problem is that pentium_mmx emulation seems to be much slower than on other emulators (PCem, for instance). In retrospect, 486 emulation was solid in DOSBox, hence my choice of a somewhat well-rounded early to mid nineties machine.

A very good remark, koolkdev. It is true that auto switches to the dynamic core. However, even without the dynamic core (I assume it is dynamic recompilation), the emulation seems slow. I know that these problems are not limited to this version of DOSBox, but the normal or simple core does not bring as much performance as it used to some years ago (roughly at the time of the first DOSBox SVN derivative, if I remember correctly). I will try your patch right away, and I sincerely appreciate your effort.

> cputype=auto is effectively a 486, which is fine for early Win9x games, but later games require P5 or P5MMX.
>
> In any case, there is work ongoing to improve the dynamic core and mmx emulation. Hopefully this can be merged before the next release at the end of the month. In which case there will be a big improvement in usability of the Win9x emulation.

Yes, it would be great if the fix could be merged into the main branch. I am sure that a lot of work has gone into it already. I may be picky but, hopefully, I can offer some insight. I made some comparisons with a real 486 system (DX4/75) and there were some interesting findings. Subjectively, the 486 system (with a Chips&Tech GPU) feels a lot smoother and more consistent in performance, even if its benchmark numbers are lower. Emulating a DX4/75, especially in Windows, shows some input lag. I am not entirely sure whether this comes from interrupts and how they are handled, or whether mouse and keyboard input simply cannot be processed at a polling rate high enough to make the emulated machine feel smoother.

> You can try the experimental build here:

I tried the experimental build. Overall, the performance seems the same with core=auto and cputype=pentium_mmx. I cannot think of edge-case scenarios that would crash the current CPU emulation. I do not consider the Rain app test to be relevant, even though it still crashes Windows with a recoverable BSOD. Opening a command prompt in Windows 95 still results in an unrecoverable system freeze.

I have also noticed something else for a long time, pertaining not only to the experimental build but to DOSBox SVN derivatives in general (ECE and DOSBox-X). The MIPS performance reported by TestCPU is way, way over what would be expected for a typical 70-75 MHz Intel Pentium CPU. It is not the only metric skewed in a way that never occurs on a real system (the same goes for the Whetstone score, albeit less severely). I wonder what might be wrong; I mean, resources may be over-allocated to CPU emulation that could be diverted towards better responsiveness (mouse polling, if I may say so). As far as I can tell, the GPU test (TACH in Windows 3.1) also hints at faster graphics than a typical S3 864 would deliver with such a run-of-the-mill 70-75 MHz Pentium CPU.

rderooy commented 3 years ago

I did some WinBench96 runs on Win98SE with cputype=auto (486 basically), cputype=pentium and cputype=pentium_mmx.

This is with the latest dosbox-x code, with core=dynamic_x86 and cycles=max on an old 4th Gen Intel Core CPU.

There are some peaks for the P5MMX in some of the Graphics WinMark96 runs, but other than that the results are very close between CPU types.

[Six screenshots attached: guest os_024 to guest os_029]

dodleh commented 3 years ago

Your benchmarks are very useful, however, they may not show details that are very specific.

Back in the day, I used Sandra 99 and TestCPU for very good, subtle tests because they could point straight away at configuration problems. In our case, FPU and ALU performance, latencies, are all important.

For latencies and other tests we can also use Lavalys Everest (it runs on a Windows 98 machine). Its latency test is the most accurate I have ever seen, being very good at showing even minute DRAM timing changes.

dodleh commented 3 years ago

core=auto, cputype=pentium_mmx, cycles=max

As you can see below, the results are all over the place... and I know that this app is very accurate (it has never failed with such large inconsistencies). On a side note, no emulator up to now is able to match a specific CPU's performance consistently (I mean, for instance, be roughly equal to a Pentium MMX 166). There are also some visible graphical glitches in this app when switching back and forth to Paint and pasting screenshots while in the emulator.

[Screenshots: TestCPU Calculations Main, TestCPU MOV, TestCPU MOVSD, TestCPU Calculations DHRY, TestCPU Calculations WHET, TestCPU Calculations MIPS, TestCPU Calculations MFLOPS]
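One way to put a number on "all over the place" is the coefficient of variation across repeated runs of the same test. The following is only a minimal Python sketch under that assumption; the function name is hypothetical and nothing here is part of TestCPU.

```python
import statistics


def coefficient_of_variation(scores):
    """Return sample standard deviation divided by mean for repeated runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to measure spread")
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("mean score is zero, ratio is undefined")
    return statistics.stdev(scores) / mean
```

Feeding it the scores from several runs of one test (for example the MIPS figure) on the emulator and on real hardware would give two directly comparable spread values.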

dodleh commented 3 years ago

I am going to ramble a bit, so please ignore this if you do not fancy the attitude... ;)

I do not clearly remember WinBench96, but now I understand why I missed it or did not take it seriously (back then or more recently).

It has a doubtful benchmarking approach, not far from the typical review style of the late 80s and early 90s. I mean: run some simple automated stuff, pretend to show valuable information, dazzle with some values that do not say much, all to be used in a review where you brag about what the PC/laptop costs and how it looks and feels, with nothing of technical substance.

The application... Easy to run, shows values, seems complex, hides well all sorts of technicalities (that matter much more to the trained eye or serious tester). In reality, Winbench is more of a stress-test with a side-not(e) of benchmark.

The Graphics test is very repetitive, with minimal differences. I would hazard a guess that it was developed as a quick way to reveal crappy VGA drivers (as was the style back in the day). Trident/Cirrus, anyone? Note: I am still surprised that no one had the poor idea of including graphics/text scrolling performance. Who would need that? We all just use our computers to layer text and shapes on the screen... I know, certain Trident/Cirrus cards might be offended...

The CPU bench takes a huge amount of time (20 minutes, are you serious?). A quicker, iterative integer test plus some FPU calculations would say much more about your system, but if you want to hide the crappy FPU of a Cyrix CPU, so be it!

The only saving grace of the 16/32bit CPU test would be that it might (I will not hold any high hopes!) nail the poor Pentium Pro 16-bit performance, if you are very, very patient.

It is funny how the HDD benchmark does not even point to metrics somewhat known back in the early nineties such as access time or read performance. A wild guess would be that it was hiding pretty well the mediocre to poor performance of some laptop/desktop drives (a certain Seagate <> 1.7GB model comes to mind).