xemu-project / xemu

Original Xbox Emulator for Windows, macOS, and Linux (Active Development)
https://xemu.app

PGRAPH dirty surface updates corrupting memory #1471

Open LoveMHz opened 1 year ago

LoveMHz commented 1 year ago

Bug Description

This issue has been observed with Burnout 3, but given its nature it very likely affects other titles too.

Notes

Shortly after the kernel loads default.xbe into memory, the game attempts to load one of its packed resources into a region of memory that overlaps a previous surface buffer.

The IDE DMA transfer can be monitored and traced at the kernel's IdexDiskReadWrite routine in the disk driver object.

0x800245A5: IdexDiskReadWrite
Irp = $esp + 8
Irp->UserBuffer = *(int *)($esp + 8) + 48

0x8002446D: IdexDiskFinishReadWrite
The Irp pointer is stored in the global IDE device object. For kernel 5838, the device object is at 0x8003BD20 and the Irp at 0x8003BD40.
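
One way to read the two breakpoint notes above, sketched in Python. This assumes the value at $esp + 8 is the Irp pointer, that the 32-bit word at Irp + 48 is UserBuffer, and that 0x8003BD40 is a slot holding the Irp pointer (kernel 5838 only). `read_u32` is a hypothetical callback standing in for whatever debugger memory read is available, and the sample addresses in `fake_memory` are made up for illustration.

```python
# Offsets taken from the notes above; everything else here is illustrative.
IRP_STACK_OFFSET = 8               # Irp pointer lives at $esp + 8 on entry to IdexDiskReadWrite
USER_BUFFER_OFFSET = 48            # Irp->UserBuffer is read from Irp + 48
IDE_CURRENT_IRP_ADDR = 0x8003BD40  # slot holding the Irp pointer (assumed, kernel 5838 only)


def user_buffer_on_entry(read_u32, esp):
    """At the IdexDiskReadWrite breakpoint: follow $esp + 8 to the Irp, then read UserBuffer."""
    irp = read_u32(esp + IRP_STACK_OFFSET)
    return read_u32(irp + USER_BUFFER_OFFSET)


def user_buffer_on_finish(read_u32):
    """At the IdexDiskFinishReadWrite breakpoint: the Irp pointer comes from the global slot."""
    irp = read_u32(IDE_CURRENT_IRP_ADDR)
    return read_u32(irp + USER_BUFFER_OFFSET)


# Made-up guest memory: address -> 32-bit value.
fake_memory = {
    0xD0001008: 0xD0020000,                       # stack slot at $esp + 8
    IDE_CURRENT_IRP_ADDR: 0xD0020000,             # same Irp seen from the global
    0xD0020000 + USER_BUFFER_OFFSET: 0x83C1C000,  # Irp->UserBuffer
}
read = fake_memory.__getitem__
entry_buffer = user_buffer_on_entry(read, esp=0xD0001000)
finish_buffer = user_buffer_on_finish(read)
```

Both lookups should land on the same destination buffer, which is the address to watch for corruption once the transfer completes.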

On success, the buffer at 0x03C1C000 should begin with 00 00 3C 54 ..., and the game should boot normally. Note that attempting to read the buffer from gdb will still trigger pgraph_surface_access_callback.

Example of a corrupt buffer and a normal buffer: (image)

Expected Behavior

Surface loading/unloading should work in a manner that does not conflict with other hardware functions, such as DMA.

xemu Version

0.7.96-4-g18cee9c2e3

System Information

No response

Additional Context

BIOS: m8plus 760a6817566a79cca2d2d733bead29f0bf8347bb

abaire commented 1 year ago

I think this is a test case exhibiting the problem:

  1. Do a nop draw to a surface at address X. For my test I used a 128x128x4 surface format.
  2. Do a CPU blit starting before the surface and extending over half the texture, address Y = X - 128*64*4
  3. Set the surface to some other memory to be used as a standard framebuffer, set Y as a texture, and render a quad.

Since the CPU blit to Y covered the entirety of the memory being used as a texture, you should just see whatever was blitted.
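
The address arithmetic in the steps above can be checked with a few lines of Python. This is only a sketch: the surface address `X` is a made-up value, and it assumes the CPU blit and the texture are the same 128x128x4 size as the surface, which is what "covered the entirety of the memory being used as a texture" suggests.

```python
WIDTH, HEIGHT, BYTES_PER_PIXEL = 128, 128, 4


def byte_range(base, width=WIDTH, height=HEIGHT, bpp=BYTES_PER_PIXEL):
    """Return the half-open [start, end) byte range of a linear width x height buffer."""
    return base, base + width * height * bpp


def overlap_bytes(a, b):
    """Number of bytes shared by two half-open ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


X = 0x03C00000          # hypothetical surface address
Y = X - 128 * 64 * 4    # blit/texture address from step 2

surface = byte_range(X)
blit = byte_range(Y)

surface_size = surface[1] - surface[0]   # 65536 bytes
shared = overlap_bytes(surface, blit)    # 32768 bytes
```

Under these assumptions the blit starts 32768 bytes before the surface and overwrites exactly its first half (the top 64 rows), so a stale write-back of the surface would show up in the bottom half of the texture at Y.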

Test

HW results: (image)

xemu: (image, "Screenshot 2023-07-09 at 21 49 07")

antangelo commented 1 year ago

From a preliminary look, the DMA code uses async I/O to read straight from the block device into RAM, which seems to bypass the access callback mechanism entirely (I haven't had time to confirm this with a test case): https://github.com/xemu-project/xemu/blob/master/softmmu/dma-helpers.c#L170
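
To make the suspected failure mode concrete, here is a toy model in Python. It is not xemu's code: `ToyMachine`, `gpu_draw`, `cpu_write`, `dma_write` and `flush_surface` are all hypothetical names, and the model only assumes that a dirty surface is written back to RAM later and that CPU writes go through an access hook while DMA writes do not.

```python
class ToyMachine:
    """Toy RAM plus one cached GPU surface that is written back lazily."""

    def __init__(self, size):
        self.ram = bytearray(size)
        self.surface = None  # dict with addr, data, dirty

    def gpu_draw(self, addr, data):
        # The GPU result lives in the cache; RAM is not updated yet.
        self.surface = {"addr": addr, "data": bytes(data), "dirty": True}

    def flush_surface(self):
        # Write a dirty surface back to RAM (the later write-back).
        s = self.surface
        if s and s["dirty"]:
            self.ram[s["addr"]:s["addr"] + len(s["data"])] = s["data"]
            s["dirty"] = False

    def _access_hook(self, addr, length):
        # Stand-in for an access callback: flush before RAM is touched.
        s = self.surface
        if s and addr < s["addr"] + len(s["data"]) and s["addr"] < addr + length:
            self.flush_surface()

    def cpu_write(self, addr, data):
        self._access_hook(addr, len(data))
        self.ram[addr:addr + len(data)] = data

    def dma_write(self, addr, data):
        # Assumed behaviour under test: DMA goes straight to RAM, no hook.
        self.ram[addr:addr + len(data)] = data


def run(write_method):
    m = ToyMachine(0x200)
    m.gpu_draw(0x100, b"\xAA" * 16)               # old surface contents
    getattr(m, write_method)(0x100, b"\x11" * 16)  # new data for the same memory
    m.flush_surface()                             # surface written back later
    return bytes(m.ram[0x100:0x110])


via_cpu = run("cpu_write")   # new data survives
via_dma = run("dma_write")   # stale surface data overwrites the new data
```

In this model the CPU path keeps the new data because the hook flushes the surface first, while the DMA path ends up with the old surface bytes on top of the freshly loaded data, which matches the kind of corruption described in the issue.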

Breaking on dma_blk_cb shows how this process works, although the traces below are not examples of the bug, just normal IDE callbacks. It first triggers on an outb to the device:

#0  dma_blk_cb (opaque=opaque@entry=0x7ffebc21ff50, ret=ret@entry=0) at ../softmmu/dma-helpers.c:128
#1  0x0000555555807978 in dma_blk_io
    (ctx=0x7fff4c03b640, sg=sg@entry=0x7fff4e4943c0, offset=offset@entry=2884108288, align=align@entry=512, io_func=io_func@entry=0x555555807621 <dma_blk_read_io_func>, io_func_opaque=io_func_opaque@entry=0x7fff4cb42a50, cb=0x5555558a9b44 <ide_dma_cb>, opaque=0x7fff4e494098, dir=DMA_DIRECTION_FROM_DEVICE) at ../softmmu/dma-helpers.c:255
#2  0x00005555558079d4 in dma_blk_read
    (blk=0x7fff4cb42a50, sg=sg@entry=0x7fff4e4943c0, offset=offset@entry=2884108288, align=align@entry=512, cb=cb@entry=0x5555558a9b44 <ide_dma_cb>, opaque=opaque@entry=0x7fff4e494098)
    at ../softmmu/dma-helpers.c:273
#3  0x00005555558a9e15 in ide_dma_cb (opaque=0x7fff4e494098, ret=ret@entry=0) at ../hw/ide/core.c:943
#4  0x00005555558add46 in bmdma_cmd_writeb (bm=bm@entry=0x7fff4e4951f0, val=val@entry=9) at ../hw/ide/pci.c:306
#5  0x00005555558ae1ad in bmdma_write (opaque=0x7fff4e4951f0, addr=<optimized out>, val=9, size=<optimized out>) at ../hw/ide/piix.c:76
#6  0x00005555559dd318 in memory_region_write_accessor (mr=0x7fff4e495350, addr=0, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
    at ../softmmu/memory.c:492
#7  0x00005555559d9856 in access_with_adjusted_size
    (addr=addr@entry=0, value=value@entry=0x7fff513fd348, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=0x5555559dd2ca <memory_region_write_accessor>, mr=0x7fff4e495350, attrs=...) at ../softmmu/memory.c:554
#8  0x00005555559dcfb9 in memory_region_dispatch_write (mr=mr@entry=0x7fff4e495350, addr=0, data=<optimized out>, data@entry=9, op=op@entry=MO_8, attrs=...) at ../softmmu/memory.c:1504
#9  0x00005555559d60a4 in address_space_stb (as=<optimized out>, addr=<optimized out>, val=9 '\t', attrs=..., result=result@entry=0x0) at ../memory_ldst.c.inc:382
#10 0x0000555555973cf1 in helper_outb (env=<optimized out>, port=<optimized out>, data=<optimized out>) at ../target/i386/tcg/sysemu/misc_helper.c:30
...

Following this, it breaks again once the asynchronous I/O operations have finished; the interesting logic is in block-backend.c:

#0  dma_blk_cb (opaque=0x7ffebc21ff50, ret=0) at ../softmmu/dma-helpers.c:128
#1  0x0000555555b23312 in blk_aio_complete (acb=acb@entry=0x7ffebc1afd00) at ../block/block-backend.c:1426
#2  0x0000555555b238f0 in blk_aio_read_entry (opaque=0x7ffebc1afd00) at ../block/block-backend.c:1480
#3  0x0000555555bfb08c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
...

For reference, this was guided by the following trace that LoveMHz posted in the xemu Discord: https://discord.com/channels/680221390359887933/680221390359888154/1126991949027618886

ide_exec_cmd IDE exec cmd: bus 0x7fffe63b9980; state 0x7fffe63b9a08; cmd 0x25
ide_dma_cb IDEState 0x7fffe63b9a08; sector_num=15972248 n=256 cmd=DMA READ
dma_blk_io dbs=0x7fff5833ee90 bs=0x7fffe5700010 offset=8177790976 to_dev=0
dma_blk_cb dbs=0x7fff5833ee90 ret=0
blk_co_preadv blk 0x7fffe5700010 bs 0x7fffe56c60d0 offset 8177790976 bytes 131072 flags 0x0
bdrv_co_preadv_part bs 0x7fffe56c60d0 offset 8177790976 bytes 131072 flags 0x0
bdrv_co_preadv_part bs 0x7fffe5ac25b0 offset 8177790976 bytes 131072 flags 0x0
dma_blk_cb dbs=0x7fff5833ee90 ret=0
dma_complete dbs=0x7fff5833ee90 ret=0 cb=0x5555558a2e6b
ide_ioport_read IDE PIO rd @ 0x1f7 (Status); val 0x50; bus 0x7fffe63b9980 IDEState 0x7fffe63b9a08
IdexCdRomVirtualReadComplete Context Irp: 0xD0022EA8
IdexCdRomVirtualReadComplete Irp->IoStatus.Information: 00020000
IdexCdRomVirtualReadComplete: Irp->Cancel = 0x00000000
- pgraph_surface_access_callback is triggered.
IdexCdRomVirtualReadComplete: DWORD at 0x83C1C000 = 0xC5011522
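
The numbers in the trace above can be cross-checked with a little arithmetic. A sketch in Python, assuming 512-byte sectors, a little-endian DWORD read, and that 0x83C1C000 is the kernel's 0x80000000-based view of physical address 0x03C1C000 (the buffer address quoted in the bug description):

```python
import struct

SECTOR_SIZE = 512                 # assumed sector size

# ide_dma_cb: sector_num=15972248 n=256
sector_num, sector_count = 15972248, 256
byte_offset = sector_num * SECTOR_SIZE     # compare with dma_blk_io offset
byte_length = sector_count * SECTOR_SIZE   # compare with blk_co_preadv bytes

# Irp->IoStatus.Information: 00020000 (hex)
reported_length = 0x00020000

# DWORD at 0x83C1C000 = 0xC5011522, as bytes in memory order
observed_bytes = struct.pack("<I", 0xC5011522)
expected_bytes = bytes.fromhex("00003C54")  # "00 00 3C 54" from the bug description

# Assumed mapping: strip the 0x80000000 kernel alias to get the physical address
physical_addr = 0x83C1C000 & 0x7FFFFFFF
```

Under these assumptions the transfer is internally consistent: the sector number maps to offset 8177790976, the 256 sectors map to 131072 bytes (0x20000), and the address is the buffer from the bug description. Only the contents differ, with 22 15 01 C5 in memory where 00 00 3C 54 is expected.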