xcellerator / tetsuji

Tetsuji - Pokemon Crystal JP Remote Code Execution
MIT License

Research on the effect of the 3F control character #1

Open mid-kid opened 2 years ago

mid-kid commented 2 years ago

I'd like to start by mentioning that I enjoyed your article. It goes into a lot of original research that I haven't seen documented yet. One thing that bothers me, though, is that the mechanics of the final exploit are never explained, so I decided to research this myself.

Character 3F is <ENEMY>, which causes the text parser to call PrintEnemysName to print wOTClassName during link battles. Writing this character into wOTClassName itself causes an infinitely recursive loop when <ENEMY> is parsed.
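
To make the recursion concrete, here is a toy model in Python. It is only a sketch, not the game's actual parser: the `place_string` function, the `max_depth` budget standing in for the finite stack, and the $50 terminator value are all assumptions made for illustration.

```python
ENEMY = 0x3F       # <ENEMY> control character, as described above
TERMINATOR = 0x50  # assumption: end-of-string byte used by this toy model


def place_string(data, ot_class_name, depth=0, max_depth=64):
    """Toy text parser. Returns the deepest nesting level reached, or raises
    OverflowError once the depth budget (a stand-in for the stack) runs out."""
    deepest = depth
    for byte in data:
        if byte == TERMINATOR:
            break
        if byte == ENEMY:
            # <ENEMY> makes the parser print the opponent's class name,
            # which is itself parsed as a string.
            if depth + 1 > max_depth:
                raise OverflowError("stack budget exhausted")
            nested = place_string(ot_class_name, ot_class_name,
                                  depth + 1, max_depth)
            deepest = max(deepest, nested)
    return deepest


battle_text = [ENEMY, TERMINATOR]
normal_name = [0x80, 0x81, TERMINATOR]  # hypothetical, harmless name bytes
evil_name = [ENEMY, TERMINATOR]         # name that contains <ENEMY> itself
```

With `normal_name` the parser nests one level and returns. With `evil_name` every level immediately starts another, so the only thing that stops it is the budget running out, which is the toy equivalent of the stack pointer leaving WRAM.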

This loop causes a stack overflow, moving the stack pointer into SRAM. Thankfully, nothing in the loop reads values from the stack, as it's only pushing and calling. However, once an interrupt triggers, it needs to store the address to return to. Since SRAM is closed, writes to it are ignored, and reads will (usually) all be $FFFF. However, once the interrupt returns through reti, it returns into $D9D9 thanks to open bus behavior (D9 is the reti opcode). Open bus behavior occurs when a device stops driving the data lines (as is the case here, since SRAM is closed): for a short time the last value is preserved and will be read back, in this case D9 from the last instruction fetch. This jump lands in a nop sled, which runs until the timer interrupt occurs. Since the mobile adapter is turned on, the timer interrupt will attempt to bankswitch to bank $44, but in doing so it encounters a ret instruction instead, now jumping to $C9C9, which is part of WRAM. This time interrupts are disabled (reti enables them), so execution can slide indefinitely along WRAM. With some luck there are no bytes in the way that break the slide, and it ends up at the controllable data area, $CA4F.
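
The open bus reads can be sketched as a small Python model. This is a simplification under stated assumptions, not cycle-accurate hardware behavior: `pop_return_address` and `is_closed_sram` are hypothetical helpers, and the model assumes the data lines hold the last driven value for as long as needed.

```python
RETI = 0xD9  # opcode of reti
RET = 0xC9   # opcode of ret


def is_closed_sram(addr):
    """Cartridge SRAM window ($A000-$BFFF), assumed disabled in this model."""
    return 0xA000 <= addr <= 0xBFFF


def pop_return_address(sp, opcode, memory):
    """Model ret/reti popping a 16-bit address from the stack.

    `memory` maps addresses to bytes for regions that respond normally.
    Reads from closed SRAM return whatever was last on the data bus,
    which starts out as the opcode that was just fetched.
    """
    bus = opcode
    popped = []
    for addr in (sp, sp + 1):  # low byte first, then high byte
        if is_closed_sram(addr):
            value = bus  # nothing drives the lines: the stale value is read
        else:
            value = memory.get(addr, 0x00)
            bus = value
        popped.append(value)
    low, high = popped
    return (high << 8) | low


print(hex(pop_return_address(0xB000, RETI, {})))  # 0xd9d9
print(hex(pop_return_address(0xB000, RET, {})))   # 0xc9c9
```

Both bytes of the popped address come from the opcode byte itself, which is why reti lands on $D9D9 and ret lands on $C9C9.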

This is a lucky turn of events, since if the timer interrupt happens at a moment when the stack pointer hasn't fully reached SRAM yet, different things can happen depending on what code finally tries reading from the SRAM. One particularly bad example is when a ret instruction reads only the upper half of the address from WRAM, and the lower half from SRAM (with open bus behavior, so always C9), as this will jump to the address $xxC9, where xx is the upper part of the real return address. And that's without mentioning that open bus behavior is hard to predict: while it might work with an MBC30 in GBC double-speed mode (the few people I've consulted say it would probably be fast enough to read the same value twice), it likely doesn't work with flashcarts and a fair number of emulators.
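
The $xxC9 case can be illustrated with a push followed by a ret across the WRAM/SRAM boundary. Again, this is a rough Python sketch: `push` and `ret` are hypothetical helpers, only addresses around the $C000 boundary are modeled, and open bus reads are assumed to always return the ret opcode.

```python
WRAM_START = 0xC000  # closed SRAM sits directly below, at $A000-$BFFF
RET = 0xC9           # opcode of ret, the value assumed left on the open bus


def push(sp, value, wram):
    """Push a 16-bit value; writes that land in closed SRAM are dropped."""
    for byte in (value >> 8, value & 0xFF):  # high byte is written first
        sp -= 1
        if sp >= WRAM_START:
            wram[sp] = byte
    return sp


def ret(sp, wram):
    """Pop a return address; reads from closed SRAM yield the ret opcode."""
    low = wram[sp] if sp >= WRAM_START else RET
    high = wram[sp + 1] if sp + 1 >= WRAM_START else RET
    return (high << 8) | low


wram = {}
sp = push(0xC001, 0x1234, wram)  # high byte lands at $C000, low byte is lost
print(hex(sp), hex(ret(sp, wram)))  # 0xbfff 0x12c9
```

Only the high byte of the real return address survives, so the jump goes to $12C9 instead of $1234. One push earlier the address is intact, and one push later the result is $C9C9.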

xcellerator commented 2 years ago

Wow! First of all, I'm glad you enjoyed the article - it was a lot of fun and was my first time working on GameBoy stuff to this level.

I agree with you, and the $3F character was bugging me too - I was planning to come back at some point and pull on that thread. Your explanation is fantastic and I really appreciate you taking the time to figure this out and write it up here. If it's okay with you, would you mind if I added an update to the write-up with your explanation of this behaviour (crediting you of course, and linking back here as well)?

My next step was always to try to get it going on real hardware. I find it especially interesting that it might only work with the actual cartridge, so I'll be sure to try both.

mid-kid commented 2 years ago

Yeah, feel free to use this information as you wish. I'm only putting it here because I find this sort of thing very interesting. If you want to see the effects for yourself to be able to explain them better, I suggest using the BGB debugger: set a breakpoint on PlaceString ($1057 in JP) right before the exploit happens, watch the de (source string) value and the stack view, and step through the code as it's just about to overflow (using F7 to see jumps to interrupts as well).

Testing on real hardware is definitely something that ought to be done, especially considering the variables (the time the data lines take to discharge) we're dealing with. I'd love to know if there's a flashcart that can emulate this correctly somehow. I might try it myself sometime if I get to it. I wrote a barebones Arduino emulator that might help here, though it's old and uses a very old version of the backend library, so I'm not sure it's super useful in its current state. It's been a hot minute since I last messed with it.