irixxxx / picodrive

Fast MegaDrive/MegaCD/32X emulator
Other
55 stars 25 forks source link

Broken SH2 dynarec on x86 builds (in newer gcc?) #27

Closed davidgfnet closed 3 years ago

davidgfnet commented 3 years ago

Turns out there is a bug... and has been there forever hidden in the gcc internals. As you can see the sh2_read_handler type lacks the regparm decorator, so in functions p32x_sh2_read8/16/32, the arguments are passed via stack. The functions expect the arguments to be passed via regs, so this would fail... Unless we get lucky and gcc happens to store the values in the same regs, out of chance!

And this is indeed what happens, just download the latest nightly binary and objdump'ed it to find:

000722b8 <p32x_sh2_read8@@Base>:
   722b8:       56                      push   %esi
   722b9:       89 c1                   mov    %eax,%ecx
[...]
   722e3:       8d 34 1b                lea    (%ebx,%ebx,1),%esi
   722e6:       52                      push   %edx
   722e7:       50                      push   %eax
   722e8:       ff d6                   call   *%esi
   722ea:       83 c4 10                add    $0x10,%esp
   722ed:       83 c4 04                add    $0x4,%esp
   722f0:       5b                      pop    %ebx
   722f1:       5e                      pop    %esi
   722f2:       c3                      ret
   722f3:       90                      nop

The regs eax and edx are pushed to the stack as arguments, but the function will ignore them and use eax, edx as args, lucky! In my machine, using gcc 11 instead of gcc 9 I'm not that lucky at all:

00070810 <p32x_sh2_read8>:
   70810:       56                      push   %esi
   70811:       89 d1                   mov    %edx,%ecx
[...]
   70840:       83 ec 08                sub    $0x8,%esp
   70843:       8d 34 1b                lea    (%ebx,%ebx,1),%esi
   70846:       51                      push   %ecx
   70847:       50                      push   %eax
   70848:       ff d6                   call   *%esi
   7084a:       83 c4 10                add    $0x10,%esp
   7084d:       83 c4 04                add    $0x4,%esp
   70850:       5b                      pop    %ebx
   70851:       5e                      pop    %esi
   70852:       c3                      ret
   70853:       90                      nop

As you can see eax is correct (that's why my stack trace shows a correct reference to SH2* and doesn't really crash the emulator) but the address is passed via ecx (well, via stack, but it is also allocated in ecx) so it just reads whatever garbage edx has and doesn't work.

But wait a sec David, this doesn't really matter since we call the memory handlers from the DRC code, which knows how to use the right calling convention! And yeah you are right, except there's one very important call from C side (well actually a couple more): in sh2_reset we read the PC value from the reset vector of the CPU, which in this case makes the initial PC be garbage, and fun things happen (it starts from a weird addr and, amazingly, in some games it manages to display something on screen!)

TLDR: fun bug, I though I'd document it cause it's worth an explanation on why it has worked all this time if it's so obviously broken.

irixxxx commented 3 years ago

Thanks for finding this for me. Sure saved some time since it doesn't occur with older gcc versions (*buntu 20 still uses a gcc from 2016!).