bochs-emu / Bochs

Bochs - Cross Platform x86 Emulator Project
https://bochs.sourceforge.io/
GNU Lesser General Public License v2.1
887 stars 104 forks source link

Hang with SFFT 1.9 drivers #178

Open Vort opened 10 months ago

Vort commented 10 months ago

When I install Sfft1.9.zip drivers for Voodoo 3 on Windows XP and reboot machine, hang happens - only mouse cursor on black background is visible.

Looks like hang happens because of incorrect handling of disabled count_holes feature.

I don't know how to fix it properly, but such hacky change allowed me to boot OS:

diff --git a/bochs/iodev/display/banshee.cc b/bochs/iodev/display/banshee.cc
index a6c4f2d6c..cd84cac23 100644
--- a/bochs/iodev/display/banshee.cc
+++ b/bochs/iodev/display/banshee.cc
@@ -1088,7 +1088,9 @@ void bx_banshee_c::agp_reg_write(Bit8u reg, Bit32u value)
     case cmdBump0:
     case cmdBump1:
       if (value > 0) {
-        BX_ERROR(("cmdBump%d not implemented (value = 0x%04x)", fifo_idx, (Bit16u)value));
+        BX_LOCK(cmdfifo_mutex);
+        v->fbi.cmdfifo[fifo_idx].amin += value * 4;
+        BX_UNLOCK(cmdfifo_mutex);
       }
       break;
     case cmdRdPtrL0:
diff --git a/bochs/iodev/display/voodoo_func.h b/bochs/iodev/display/voodoo_func.h
index 768765f9a..3f119156e 100644
--- a/bochs/iodev/display/voodoo_func.h
+++ b/bochs/iodev/display/voodoo_func.h
@@ -2887,7 +2887,7 @@ void cmdfifo_w(cmdfifo_info *f, Bit32u fbi_offset, Bit32u data)
   BX_LOCK(cmdfifo_mutex);
   *(Bit32u*)(&v->fbi.ram[fbi_offset]) = data;
   /* count holes? */
-  if (f->count_holes) {
+  if (true/*f->count_holes*/) {
     if ((f->holes == 0) && (fbi_offset == (f->amin + 4))) {
       /* in-order, no holes */
       f->amin = f->amax = fbi_offset; 

Also I don't know if my cmdBump change is correct - I implemented it semi-randomly.

Version: 54831068df334989405f16f85117f7f46fef944d

vruppert commented 10 months ago

I'm not familiar with 3D stuff and I don't know how the "count_holes" stuff really works. The "cmdBump" stuff needs to be reviewed.

Vort commented 10 months ago

I'm not familiar with 3D stuff and I don't know how the "count_holes" stuff really works.

If I understand correctly, f->depth++; should be executed no matter if counting is enabled or not. Otherwise there is no way to start FIFO commands execution.

And f->depth++; was present in else branch before 5698c6974411ab9c7878773513788d5531b94ccf (lines 2453..2455). Do not know if it is enough to just put it back.

Also this thing is not much related to 3D, just some specifics of command processing.

Vort commented 10 months ago

Here is another attempt, with more thinking this time.

  1. There are 4 conditional branches, first 2 of them make no change to f->holes, last 2 of them change f->holes.
  2. If f->count_holes is disabled, then f->holes should always be zero, so last two branches should be disabled.
  3. In case if my assumption is wrong and driver tries to make hole while counting is disabled, BX_ERROR should trigger.

My logic may be wrong, but:

  1. Patch below makes no changes for f->count_holes = true case.
  2. f->count_holes = false is broken anyway, so patch should not make situation worse.
diff --git a/bochs/iodev/display/banshee.cc b/bochs/iodev/display/banshee.cc
index a6c4f2d6c..cd84cac23 100644
--- a/bochs/iodev/display/banshee.cc
+++ b/bochs/iodev/display/banshee.cc
@@ -1088,7 +1088,9 @@ void bx_banshee_c::agp_reg_write(Bit8u reg, Bit32u value)
     case cmdBump0:
     case cmdBump1:
       if (value > 0) {
-        BX_ERROR(("cmdBump%d not implemented (value = 0x%04x)", fifo_idx, (Bit16u)value));
+        BX_LOCK(cmdfifo_mutex);
+        v->fbi.cmdfifo[fifo_idx].amin += value * 4;
+        BX_UNLOCK(cmdfifo_mutex);
       }
       break;
     case cmdRdPtrL0:
diff --git a/bochs/iodev/display/voodoo_func.h b/bochs/iodev/display/voodoo_func.h
index 768765f9a..e92183eb4 100644
--- a/bochs/iodev/display/voodoo_func.h
+++ b/bochs/iodev/display/voodoo_func.h
@@ -2886,21 +2886,20 @@ void cmdfifo_w(cmdfifo_info *f, Bit32u fbi_offset, Bit32u data)
 {
   BX_LOCK(cmdfifo_mutex);
   *(Bit32u*)(&v->fbi.ram[fbi_offset]) = data;
-  /* count holes? */
-  if (f->count_holes) {
-    if ((f->holes == 0) && (fbi_offset == (f->amin + 4))) {
-      /* in-order, no holes */
-      f->amin = f->amax = fbi_offset;
-      f->depth++;
-    } else if (fbi_offset < f->amin) {
-      /* out-of-order, below the minimum */
-      if (f->holes != 0) {
-        BX_ERROR(("Unexpected CMDFIFO: AMin=0x%08x AMax=0x%08x Holes=%d WroteTo:0x%08x RdPtr:0x%08x",
-                  f->amin, f->amax, f->holes, fbi_offset, f->rdptr));
-      }
-      f->amin = f->amax = fbi_offset;
-      f->depth++;
-    } else if (fbi_offset < f->amax) {
+  if ((f->holes == 0) && (fbi_offset == (f->amin + 4))) {
+    /* in-order, no holes */
+    f->amin = f->amax = fbi_offset;
+    f->depth++;
+  } else if (fbi_offset < f->amin) {
+    /* out-of-order, below the minimum */
+    if (f->holes != 0) {
+      BX_ERROR(("Unexpected CMDFIFO: AMin=0x%08x AMax=0x%08x Holes=%d WroteTo:0x%08x RdPtr:0x%08x",
+                f->amin, f->amax, f->holes, fbi_offset, f->rdptr));
+    }
+    f->amin = f->amax = fbi_offset;
+    f->depth++;
+  } else if (f->count_holes) {
+    if (fbi_offset < f->amax) {
       /* out-of-order, but within the min-max range */
       f->holes--;
       if (f->holes == 0) {
@@ -2913,6 +2912,10 @@ void cmdfifo_w(cmdfifo_info *f, Bit32u fbi_offset, Bit32u data)
       f->amax = fbi_offset;
     }
   }
+  else {
+    BX_ERROR(("Unexpected CMDFIFO: AMin=0x%08x AMax=0x%08x WroteTo:0x%08x RdPtr:0x%08x",
+              f->amin, f->amax, fbi_offset, f->rdptr));
+  }
   if (f->depth_needed == BX_MAX_BIT32U) {
     f->depth_needed = cmdfifo_calc_depth_needed(f);
   } 
Vort commented 9 months ago

There is one more problem with these drivers: Fast resolution change often leads to crash of either Windows XP or Bochs. Most likely, my patch above should be refined further.

Here is test program (modeset.zip) which make 5 resolution changes between 800x600@16 and 1024x768@32:

#define WIN32_LEAN_AND_MEAN
#include <windows.h>

int ChangeResolution(int width, int height, int bpp)
{
    DEVMODE dm;
    memset(&dm, 0, sizeof(dm));
    dm.dmSize = sizeof(dm);
    dm.dmFields = DM_BITSPERPEL | DM_PELSWIDTH | DM_PELSHEIGHT;
    dm.dmPelsWidth = width;
    dm.dmPelsHeight = height;
    dm.dmBitsPerPel = bpp;
    return ChangeDisplaySettings(&dm, 0);
}

int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    for (int i = 0; i < 5; i++)
    {
        ChangeResolution(800, 600, 16);
        Sleep(100);
        ChangeResolution(1024, 768, 32);
        Sleep(100);
    }
    return 0;
}
Vort commented 9 months ago

Yeah, my previous code wasn't right. cmdBumps make no sense with automatic updates of amin. So I think disabled count_holes means that amin should be updated by driver, not by hardware. There is a hint in documentation about it: "cmdBump0 defines the number of words to increment the amin pointer by, when managed by software". Here is my new attempt at fixing problems with this driver:

diff --git a/bochs/iodev/display/banshee.cc b/bochs/iodev/display/banshee.cc
index a91ed4c00..061250260 100644
--- a/bochs/iodev/display/banshee.cc
+++ b/bochs/iodev/display/banshee.cc
@@ -1108,7 +1108,9 @@ void bx_banshee_c::agp_reg_write(Bit8u reg, Bit32u value)
     case cmdBump0:
     case cmdBump1:
       if (value > 0) {
-        BX_ERROR(("cmdBump%d not implemented (value = 0x%04x)", fifo_idx, (Bit16u)value));
+        BX_LOCK(cmdfifo_mutex);
+        v->fbi.cmdfifo[fifo_idx].amin += value * 4;
+        BX_UNLOCK(cmdfifo_mutex);
       }
       break;
     case cmdRdPtrL0:
diff --git a/bochs/iodev/display/voodoo_func.h b/bochs/iodev/display/voodoo_func.h
index 141618903..b0fbb514d 100644
--- a/bochs/iodev/display/voodoo_func.h
+++ b/bochs/iodev/display/voodoo_func.h
@@ -2915,6 +2915,10 @@ void cmdfifo_w(cmdfifo_info *f, Bit32u fbi_offset, Bit32u data)
       f->amax = fbi_offset;
     }
   }
+  else {
+    f->amax = fbi_offset;
+    f->depth++;
+  }
   if (f->depth_needed == BX_MAX_BIT32U) {
     f->depth_needed = cmdfifo_calc_depth_needed(f);
   }
@@ -2943,7 +2947,7 @@ Bit32u cmdfifo_r(cmdfifo_info *f)

 void cmdfifo_process(cmdfifo_info *f)
 {
-  Bit32u command, data, mask, nwords, regaddr;
+  Bit32u command, data, mask, nwords, regaddr, prev_rdptr;
   Bit8u type, code, nvertex, smode, disbytes;
   bool inc, pcolor;
   voodoo_reg reg;
@@ -2959,7 +2963,9 @@ void cmdfifo_process(cmdfifo_info *f)
         case 0: // NOP
           break;
         case 3: // JMP
+          prev_rdptr = f->rdptr;
           f->rdptr = (command >> 4) & 0xfffffc;
+          f->amin -= prev_rdptr - f->rdptr;
           if (f->count_holes) {
             BX_DEBUG(("cmdfifo_process(): JMP 0x%08x", f->rdptr));
           } 

Some problems remain however (crash with blt_rectangle_fill for example), so more testing and thinking is needed.

upd. if (!f->count_holes) should be added before f->amin -= prev_rdptr - f->rdptr;. But it is possible to debug problem with crash even without this change. upd2. Here is commit with addition mentioned above: https://github.com/Vort/Bochs/commit/c3198780f95f26436740535cf4a2e6b36f678051.

Vort commented 9 months ago

Looks like blt_rectangle_fill crash is triggered by race condition in driver itself. I've added BX_ERROR prints to see exactly what is happening: https://github.com/Vort/Bochs/commit/344a3528e7039fe1b33bebb2cf7ce25dad9dc133. And here is how logs looks like when problem appears:

Log lines: ``` 17920180267i[VOODOO] CMDFIFO #0 now disabled 17920469382i[VVGA ] Setting VGA update interval to 21087 (47,4 Hz) 17920469442i[VVGA ] Setting VGA update interval to 18468 (54,1 Hz) 17920469452i[VVGA ] Setting VGA update interval to 27064 (36,9 Hz) 17920469543i[VOODOO] Setting VCLK #3 (pllCtrl0) = 65,028 MHz 17920469543i[VVGA ] Setting VGA update interval to 16658 (60,0 Hz) 17920502173i[VVGA ] switched to 1024 x 768 x 32 @ 60 Hz 17920502173i[WINGUI] dimension update x=1024 y=768 fontheight=0 fontwidth=0 bpp=32 17920502173i[VOODOO] 2D desktop mode enabled 17920505351e[VOODOO] Memory write blt_dstBaseAddr: 0x00000000 17920505441e[VOODOO] AGP write register 0x024 (cmdBaseSize0) value = 0x00000000 17920505460e[VOODOO] AGP write register 0x020 (cmdBaseAddr0) value = 0x00000000 17920505479e[VOODOO] AGP write register 0x02c (cmdRdPtrL0) value = 0x00000000 17920505497e[VOODOO] AGP write register 0x030 (cmdRdPtrH0) value = 0x00000000 17920505518e[VOODOO] AGP write register 0x034 (cmdAMin0) value = 0xfffffffc 17920505537e[VOODOO] AGP write register 0x03c (cmdAMax0) value = 0xfffffffc 17920505555e[VOODOO] AGP write register 0x044 (cmdFifoDepth0) value = 0x00000000 17920505573e[VOODOO] AGP write register 0x048 (cmdHoleCnt0) value = 0x00000000 17920505592e[VOODOO] AGP write register 0x080 (cmdFifoThresh) value = 0x000001e8 17920505619e[VOODOO] AGP write register 0x024 (cmdBaseSize0) value = 0x0000050f 17920505619i[VOODOO] CMDFIFO #0 now enabled 17920505693e[VOODOO] Memory write blt_dstBaseAddr: 0x00010400 17920505821e[VOODOO] CMDFIFO write: AMin=0xfffffffc AMax=0xfffffffc WroteTo:0x00000000 RdPtr:0x00000000 value:0x0000001a 17920505821e[VOODOO] CMDFIFO write: AMin=0xfffffffc AMax=0x00000000 WroteTo:0x00000004 RdPtr:0x00000000 value:0x00000000 17920505821e[VOODOO] CMDFIFO write: AMin=0xfffffffc AMax=0x00000004 WroteTo:0x00000008 RdPtr:0x00000000 value:0x0fef0fff 17920505865e[VOODOO] AGP write register 0x028 (cmdBump0) value = 0x00000003 17920505994e[VOODOO] CMDFIFO write: AMin=0x00000008 AMax=0x00000008 WroteTo:0x0000000c RdPtr:0x00000000 value:0x3c000002 17920505994e[VOODOO] CMDFIFO write: AMin=0x00000008 AMax=0x0000000c WroteTo:0x00000010 RdPtr:0x00000000 value:0x00000000 17920505994e[VOODOO] CMDFIFO write: AMin=0x00000008 AMax=0x00000010 WroteTo:0x00000014 RdPtr:0x00000000 value:0x0fef0400 17920505994e[VOODOO] CMDFIFO write: AMin=0x00000008 AMax=0x00000014 WroteTo:0x00000018 RdPtr:0x00000000 value:0x00000000 17920505994e[VOODOO] CMDFIFO write: AMin=0x00000008 AMax=0x00000018 WroteTo:0x0000001c RdPtr:0x00000000 value:0xcc002105 17920506042e[VOODOO] AGP write register 0x028 (cmdBump0) value = 0x00000005 17920684597e[VOODOO] Memory write blt_dstBaseAddr: 0x00d00000 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x0000001c WroteTo:0x00000020 RdPtr:0x00000000 value:0x3c000062 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x00000020 WroteTo:0x00000024 RdPtr:0x00000000 value:0x00d00000 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x00000024 WroteTo:0x00000028 RdPtr:0x00000000 value:0x00051000 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x00000028 WroteTo:0x0000002c RdPtr:0x00000000 value:0x00000000 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x0000002c WroteTo:0x00000030 RdPtr:0x00000000 value:0x03000400 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x00000030 WroteTo:0x00000034 RdPtr:0x00000000 value:0x00000000 17920684724e[VOODOO] CMDFIFO write: AMin=0x0000001c AMax=0x00000034 WroteTo:0x00000038 RdPtr:0x00000000 value:0xcc002105 17920684779e[VOODOO] AGP write register 0x028 (cmdBump0) value = 0x00000007 17920898693e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000000 value:0x0000001a 17920899372e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000004 value:0x00000000 17920899482e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000008 value:0x0fef0fff 17920899815e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x0000000c value:0x3c000002 17920900074e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000010 value:0x00000000 17920900310e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000014 value:0x0fef0400 17920900516e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000018 value:0x00000000 17920900768e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x0000001c value:0xcc002105 17920901406e[VOODOO] Rectangle fill: 1024 x 4079 ROP0 CC 17920901771e[VOODOO] Skipping bad blt 17920902617e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000020 value:0x3c000062 17920903253e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000024 value:0x00d00000 17920904167e[VOODOO] cmdfifo write blt_dstBaseAddr: 0x00d00000 17920904521e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000028 value:0x00051000 17920905104e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x0000002c value:0x00000000 17920905973e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000030 value:0x03000400 17920906822e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000034 value:0x00000000 17920908618e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000038 value:0xcc002105 17920908948e[VOODOO] Rectangle fill: 1024 x 768 ROP0 CC 17921158908e[VOODOO] CMDFIFO write: AMin=0x00000038 AMax=0x00000038 WroteTo:0x0000003c RdPtr:0x0000003c value:0x00000062 17921158908e[VOODOO] CMDFIFO write: AMin=0x00000038 AMax=0x0000003c WroteTo:0x00000040 RdPtr:0x0000003c value:0x0001d400 17921158908e[VOODOO] CMDFIFO write: AMin=0x00000038 AMax=0x00000040 WroteTo:0x00000044 RdPtr:0x0000003c value:0x000500a0 17921158951e[VOODOO] AGP write register 0x028 (cmdBump0) value = 0x00000003 17921158951e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000044 RdPtr:0x0000003c value:0x00000062 17921158951e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000044 RdPtr:0x00000040 value:0x0001d400 17921158951e[VOODOO] cmdfifo write blt_dstBaseAddr: 0x0001d400 17921158951e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000044 RdPtr:0x00000044 value:0x000500a0 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x00000044 WroteTo:0x00000048 RdPtr:0x00000048 value:0x3c0c0002 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x00000048 WroteTo:0x0000004c RdPtr:0x00000048 value:0xffffffff 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x0000004c WroteTo:0x00000050 RdPtr:0x00000048 value:0xffffffff 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x00000050 WroteTo:0x00000054 RdPtr:0x00000048 value:0x00000000 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x00000054 WroteTo:0x00000058 RdPtr:0x00000048 value:0x00280028 17921159044e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x00000058 WroteTo:0x0000005c RdPtr:0x00000048 value:0x00000000 17921159076e[VOODOO] CMDFIFO write: AMin=0x00000044 AMax=0x0000005c WroteTo:0x00000060 RdPtr:0x00000048 value:0x00002105 17921159113e[VOODOO] AGP write register 0x028 (cmdBump0) value = 0x00000007 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x00000048 value:0x3c0c0002 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x0000004c value:0xffffffff 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x00000050 value:0xffffffff 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x00000054 value:0x00000000 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x00000058 value:0x00280028 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x0000005c value:0x00000000 17921159113e[VOODOO] CMDFIFO read: AMin=0x00000044 AMax=0x00000060 RdPtr:0x00000060 value:0x00002105 17921159133e[VOODOO] Rectangle fill: 40 x 40 ROP0 00 ```

Most likely, driver expects Memory write blt_dstBaseAddr: 0x00010400 to happen first. Then Rectangle fill: 1024 x 4079 ROP0 CC and only then Memory write blt_dstBaseAddr: 0x00d00000.

But what happens is:

17920505693e[VOODOO] Memory write blt_dstBaseAddr: 0x00010400
...
17920684597e[VOODOO] Memory write blt_dstBaseAddr: 0x00d00000
...
17920900310e[VOODOO] CMDFIFO read: AMin=0x00000038 AMax=0x00000038 RdPtr:0x00000014 value:0x0fef0400
...
17920901406e[VOODOO] Rectangle fill: 1024 x 4079  ROP0 CC

There may be two reasons why driver authors did not noticed this problem:

  1. Real hardware is much faster than virtual one.
  2. Real hardware can't crash and garbage on screen for several milliseconds during mode change can be missed easily.

Whether this problem can be avoided somehow by Bochs or not, I think it worth adding checks to bx_banshee_c::blt_rectangle_fill preventing out of bounds access. @vruppert, what do you think about such idea?