Add CoP0 Exception registers

zachary-cauchi commented 4 months ago

To begin work on exception-handling, there would first need to be support for the EE exception registers. These are listed in Chapter 3.2 of the EE Core User's Manual.

Definition of Done:

- [x] Add all exception-related CoP0 registers to the CoP0 routines in prussia_rt/src/routines.S.
  - [x] BadVAddr
  - [x] Count
  - [x] Compare
  - [x] Status
  - [x] Cause
  - [x] EPC
  - [x] BadPAddr
  - [x] Perf
  - [x] ErrorEPC

zachary-cauchi commented 4 months ago

Took a closer look at the registers, and if I'm understanding them right, they don't have a specific address that I can use in the SVD file. Is that the case, or are they just offsets from base 0x00?

Ravenslofty commented 4 months ago

I was just about to comment as such; you need to use special MIPS instructions (in this case, mfc0 and mtc0) to read registers from the coprocessors. Take a look at prussia_rt, that has...something useful here at least.
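To make the "no address" point concrete, here is a minimal Rust sketch of how mfc0 and mtc0 name a CoP0 register: the register number is a field inside the instruction word, not a memory location. The `Cop0Reg` enum and the `encode_*` helpers are hypothetical names for illustration only (they are not part of prussia_rt), and the register numbers are my reading of the EE Core User's Manual, so check them against the manual before relying on them.

```rust
/// Exception-related CoP0 registers and the register numbers used by
/// `mfc0`/`mtc0`. Assumption: numbers taken from the EE Core User's Manual.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Cop0Reg {
    BadVAddr = 8,
    Count = 9,
    Compare = 11,
    Status = 12,
    Cause = 13,
    Epc = 14,
    BadPAddr = 23,
    Perf = 25,
    ErrorEpc = 30,
}

/// Encode `mfc0 $gpr, $reg`: COP0 opcode, MF sub-opcode,
/// GPR number in bits 20..16, CoP0 register number in bits 15..11.
pub fn encode_mfc0(gpr: u8, reg: Cop0Reg) -> u32 {
    assert!(gpr < 32, "MIPS has 32 general-purpose registers");
    0x4000_0000 | ((gpr as u32) << 16) | ((reg as u32) << 11)
}

/// Encode `mtc0 $gpr, $reg`: same layout, with the MT sub-opcode set.
pub fn encode_mtc0(gpr: u8, reg: Cop0Reg) -> u32 {
    assert!(gpr < 32, "MIPS has 32 general-purpose registers");
    0x4080_0000 | ((gpr as u32) << 16) | ((reg as u32) << 11)
}
```

For example, `encode_mfc0(2, Cop0Reg::Status)` gives the word for `mfc0 $v0, $12`. Since the register is selected by the instruction itself, there is nothing an SVD peripheral address could point at.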

zachary-cauchi commented 4 months ago

I see, so we can't declare these in our SVD. Also checked online and if I understand right, the CoP0 is something quite standard to most/all MIPS devices. All the functionality governing access of the devices is in the routines.S file, which uses the aforementioned instructions to get the values, and are declared after the jump instructions since the jump instructions always take 2 cycles to take effect?

Following this, what would be the best course in your opinion? Should I modify the routines.S file to include read/write operations for all the exception-related registers?

Ravenslofty commented 4 months ago

> I see, so we can't declare these in our SVD. Also checked online and if I understand right, the CoP0 is something quite standard to most/all MIPS devices.

Eeeeh, every MIPS processor must have a Cop0, but the exact format is unspecified (until the MIPS{32,64}rN standards, anyway). To be pedantic, this is a MIPS R4000-style Cop0; the IOP - aka the PS1 CPU - is based on the MIPS R3000, which has its own style of Cop0 registers.

> All the functionality governing access of the devices is in the routines.S file, which uses the aforementioned instructions to get the values, and are declared after the jump instructions since the jump instructions always take 2 cycles to take effect?

Yes, this is the famous MIPS branch delay slot. That being said, I'm wondering if it's actually correct w.r.t pipeline hazards; something very dumb like sync; m[tf]c0; sync; jr $ra would always be correct, but it would also be somewhat slow. I don't know if that matters; maybe we should consult what ps2sdk does here.

> Following this, what would be the best course in your opinion? Should I modify the routines.S file to include read/write operations for all the exception-related registers?

I think that would be the best idea, yes. I think the "level 2 exceptions" (nonstandard MIPS >.>) are going to be a little painful, because Reset and NMI are in practice "please reboot the console", while the performance counters are low priority; but having instruction breakpoint capability is high priority for a nice debugging experience >.>

Ravenslofty commented 4 months ago

(oh, and while you're there, s/PruSSia/Prussia; I thought highlighting the pun in the name would be funny, but instead it just reads like I'm referencing the SS. not my smartest idea.)

zachary-cauchi commented 4 months ago

Non-standard goodness, what more could we want? Okay, I'll begin work on that either tonight or tomorrow afternoon and keep you posted on any hiccups/questions. This would be my first time writing proper assembly (besides in TIS-100 but I'd say that doesn't count) so I will ask you for the occasional review if that's alright. I'll begin with the must-haves and see if I can work my way to the nice-to-haves.

> (oh, and while you're there, s/PruSSia/Prussia; I thought highlighting the pun in the name would be funny, but instead it just reads like I'm referencing the SS. not my smartest idea.)

Haha, think of it as personality for the project I suppose.

zachary-cauchi commented 4 months ago

Could you help me out with a problem I'm encountering when reading the cop0 registers using the method in routines.S? I'm basing my implementation on the read/write status functions. I managed to get it building and running in PCSX2; however, the values showing up in EEOut are 0. I've updated the hello-rs script to showcase a working function (using inline assembly) and the non-working function (using routines.S). Would you happen to have an idea why it's reading 0's? If there's any missing information I can provide, please let me know.

Edit: Surrounding the load instruction with sync instructions fixed the problem, so I'm guessing it's something to do with the jump happening before the mfc0 load is finished?

Edit 2: Removed the sync instructions and reordered the instructions so mfc0 is run before jr and that produced the same working result.

Ravenslofty commented 4 months ago

> Edit 2: Removed the sync instructions and reordered the instructions so mfc0 is run before jr and that produced the same working result.

Oh, I think I know what the problem is. Try adding .set noreorder at the top of the source file (with the instructions in jr/mfc0 order). MIPS assemblers try to help the programmer by hiding delay slots, which I think results in the assembler turning this code into jr/nop/mfc0, except the mfc0 never executes because of the return.
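A toy model of that theory, as a hedged Rust sketch: `run` executes a routine with a one-instruction branch delay slot, and `assemble_reorder` mimics only the behaviour described above (padding every jr with a nop). Both functions and the `Insn` type are made up for illustration; a real assembler in reorder mode may do other things as well.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Insn {
    /// `jr $ra`: return to the caller once the delay slot has executed.
    JrRa,
    /// `mfc0 $v0, <reg>`: copy the simulated CoP0 value into $v0.
    Mfc0,
    /// `nop`
    Nop,
}

/// Simplified model of reorder mode: the assembler fills the delay slot
/// of every `jr` with a `nop`, pushing the following instruction past it.
pub fn assemble_reorder(src: &[Insn]) -> Vec<Insn> {
    let mut out = Vec::new();
    for &insn in src {
        out.push(insn);
        if insn == Insn::JrRa {
            out.push(Insn::Nop);
        }
    }
    out
}

/// Execute the routine and return the final value of $v0.
/// The instruction right after `jr` (the delay slot) still executes;
/// anything after that is never reached.
pub fn run(routine: &[Insn], cop0_value: u32) -> u32 {
    let mut v0 = 0;
    let mut pc = 0;
    let mut returning = false;
    while pc < routine.len() {
        let in_delay_slot = returning;
        match routine[pc] {
            Insn::JrRa => returning = true,
            Insn::Mfc0 => v0 = cop0_value,
            Insn::Nop => {}
        }
        if in_delay_slot {
            break; // the jump takes effect after the delay slot
        }
        pc += 1;
    }
    v0
}
```

With the source written as jr then mfc0, `run` on the unmodified sequence (the `.set noreorder` case) returns the CoP0 value, while `run` on `assemble_reorder` of the same source returns 0, which matches the zeros seen in EEOut.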

zachary-cauchi commented 4 months ago

Just tested it on the Count register, looks like it worked and is producing values in the expected pattern! Going to keep the directive at line 1 as suggested and reorder the instructions in all the methods. Thanks a lot for the help! Interesting functionality, though sounds like it would cause more problems than it's worth. Are there any practical use-cases for that feature?

Ravenslofty commented 4 months ago

> Are there any practical use-cases for that feature?

So that compilers can blithely ignore the branch delay slot as somebody else's problem. >.>

zachary-cauchi commented 4 months ago

Finalised the last of the exception-related registers and opened #19. When you can, would you please give it a review?

Ravenslofty commented 4 months ago

this reminds me about the fun that is the MIPS TLB; we're going to have to decide what to do there at some point.

zachary-cauchi commented 4 months ago

I'm afraid I'm not familiar with it. From what I tried learning from the docs, it looks to be a cache the MMU uses when translating virtual addresses to physical addresses (or vice versa). Would you please elaborate a little on it?

Ravenslofty commented 4 months ago

Pretty much all processors have TLBs to convert addresses, and of course they can't be of infinite size, so when you access a memory region which doesn't have an entry, you need to fetch it.

On x86 and ARM, the CPU will fetch the TLB entry for you, requiring you to structure your page tables in a specific way or the CPU fetches garbage.

On early MIPS, they didn't want to implement the hardware for that, so instead the CPU raises a TLB Refill exception, and expects software to insert the relevant entry, either overwriting a specific entry (tlbwi) or a random one (tlbwr). You'll note that TLB Refill has its own dedicated exception point; the idea is that you can just about fit a refill routine into the 0x80 bytes available; additionally, the entries are structured to make it about as easy as possible to do so.
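As a rough illustration of the software-refill idea, here is a heavily simplified Rust model. It is not how the EE hardware behaves in detail: real entries map page pairs and carry ASIDs and flags, and the victim slot comes from the Random register. The `Tlb` type, the 4 KiB page size, the round-robin victim choice, and the `HashMap` page table are all assumptions made for the sketch.

```rust
use std::collections::HashMap;

const PAGE_SHIFT: u32 = 12; // assumption: 4 KiB pages
const OFFSET_MASK: u32 = (1 << PAGE_SHIFT) - 1;

#[derive(Clone, Copy, Debug)]
struct TlbEntry {
    vpn: u32, // virtual page number
    pfn: u32, // physical frame number
}

pub struct Tlb {
    entries: Vec<Option<TlbEntry>>,
    next_victim: usize, // stand-in for the hardware Random register
    pub refills: u32,   // how many times the "refill handler" ran
}

impl Tlb {
    pub fn new(size: usize) -> Self {
        assert!(size > 0, "a TLB needs at least one entry");
        Tlb { entries: vec![None; size], next_victim: 0, refills: 0 }
    }

    /// Hardware side: translate only if a matching entry is present.
    fn lookup(&self, vaddr: u32) -> Option<u32> {
        let vpn = vaddr >> PAGE_SHIFT;
        self.entries
            .iter()
            .flatten()
            .find(|e| e.vpn == vpn)
            .map(|e| (e.pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK))
    }

    /// Stand-in for `tlbwr`: overwrite a slot chosen without software input.
    fn write_random(&mut self, entry: TlbEntry) {
        let slot = self.next_victim;
        self.entries[slot] = Some(entry);
        self.next_victim = (slot + 1) % self.entries.len();
    }

    /// On a miss, play the role of the TLB Refill handler: look the page up
    /// in a software page table, insert it, and retry the access.
    pub fn translate(&mut self, vaddr: u32, page_table: &HashMap<u32, u32>) -> Option<u32> {
        if let Some(paddr) = self.lookup(vaddr) {
            return Some(paddr);
        }
        let vpn = vaddr >> PAGE_SHIFT;
        let pfn = *page_table.get(&vpn)?; // unmapped page: nothing to refill
        self.write_random(TlbEntry { vpn, pfn });
        self.refills += 1;
        self.lookup(vaddr)
    }
}
```

The point of the model is that the page table layout is entirely up to software; the hardware only ever sees the entries the handler chooses to write.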

Remember how MIPS has memory segments? Those segments tell the CPU whether to use the TLB or to map virtual addresses directly to physical ones: useg (bit 31 clear) uses the TLB; kseg0/kseg1 (bit 31 set, bit 30 clear) are directly mapped to the first 512MB of the physical address space; and kseg2/kseg3 (bit 31 set, bit 30 set) use the TLB again. (These segments also control caching; kseg1 is never cached, but the others are.) This is why I access the hardware registers offset by 0xA000_0000, which places the address in kseg1, which guarantees access regardless of TLB state.
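The segment rules can be written down as a small Rust sketch. The function and type names are invented for illustration, and the code only captures the top-bits decoding described in this comment, not any EE-specific quirks.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Segment {
    Useg,     // bit 31 clear: translated through the TLB
    Kseg0,    // 0x8000_0000..=0x9FFF_FFFF: direct-mapped
    Kseg1,    // 0xA000_0000..=0xBFFF_FFFF: direct-mapped, never cached
    Kseg2Or3, // bits 31 and 30 set: translated through the TLB
}

/// Classify a 32-bit virtual address by its top three bits.
pub fn segment(vaddr: u32) -> Segment {
    match vaddr >> 29 {
        0..=3 => Segment::Useg,
        4 => Segment::Kseg0,
        5 => Segment::Kseg1,
        _ => Segment::Kseg2Or3,
    }
}

/// True if accesses to this address depend on the TLB contents.
pub fn uses_tlb(vaddr: u32) -> bool {
    matches!(segment(vaddr), Segment::Useg | Segment::Kseg2Or3)
}

/// Physical address for the direct-mapped segments: drop the top three
/// bits, which leaves an offset into the first 512MB.
pub fn direct_phys(vaddr: u32) -> Option<u32> {
    match segment(vaddr) {
        Segment::Kseg0 | Segment::Kseg1 => Some(vaddr & 0x1FFF_FFFF),
        _ => None,
    }
}

/// Build the kseg1 alias of a physical address in the first 512MB,
/// i.e. the "offset by 0xA000_0000" trick.
pub fn to_kseg1(phys: u32) -> u32 {
    0xA000_0000 | (phys & 0x1FFF_FFFF)
}
```

For instance, a register at physical 0x1000_0000 (an illustrative address) would be accessed at `to_kseg1(0x1000_0000)`, which is 0xB000_0000, and `uses_tlb` is false for that address.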

There is just one problem with the TLB: emulators hate it when you use it, because you're shuffling memory around underneath their feet, and they have to invalidate internal caching (or otherwise assume that your game never touches the TLB, and behave incorrectly).

zachary-cauchi commented 4 months ago

I see, thanks for the crash course. So the decision you mentioned earlier regarding what to do with them would be whether to add support for it, add support at a much later stage, or ignore it outright? What are the options, and what are their pros?

Ravenslofty commented 4 months ago

I see two big advantages of having the infrastructure for proper virtual memory:

- Graceful memory fragmentation handling. Imagine we have no TLB: if the user tries to allocate memory, say 512KiB for something huge, but there isn't a slice of contiguous memory that large, malloc must fail, and since Rust APIs generally panic on that condition, the code grinds to a halt. With the TLB, we can take fragmented pieces of RAM and stitch them together into a contiguous block of memory, which lets malloc succeed here. Sure, there are ways of managing memory which do not use a heap, but writing in that style of code raises the barrier of entry to using Prussia.
- Virtual memory. While it's true that something like memory swap is probably infeasible, you could imagine a game on DVD, where some of the address space maps to data files. When you make a read from it, Prussia has a backing cache which it can serve reads from, or issues a request behind the scenes to the drive to fetch it.

Personally, I can see memory fragmentation being quite annoying to people, because after a while of running, eventually they hit excessive fragmentation and get an out-of-memory error. You can imagine fun worst-case scenarios like every other 4K block of memory is in use, and allocating 8K fails with 16M of free RAM.
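That worst case is easy to reproduce in a small Rust sketch. It assumes 4 KiB pages and the PS2's 32 MiB of main RAM, and every function name here is hypothetical; `gather_pages` only picks the physical pages that a TLB-backed allocator could then map to consecutive virtual addresses.

```rust
pub const PAGE_SIZE: usize = 4096;

/// Page-usage map in which every other page is in use (true = used).
pub fn checkerboard(total_bytes: usize) -> Vec<bool> {
    (0..total_bytes / PAGE_SIZE).map(|page| page % 2 == 0).collect()
}

pub fn free_bytes(used: &[bool]) -> usize {
    used.iter().filter(|&&u| !u).count() * PAGE_SIZE
}

/// Without a TLB: an allocation needs `pages` physically contiguous free
/// pages. Returns the index of the first page of such a run, if any.
pub fn find_contiguous(used: &[bool], pages: usize) -> Option<usize> {
    if pages == 0 {
        return None;
    }
    let mut run = 0;
    for (i, &u) in used.iter().enumerate() {
        run = if u { 0 } else { run + 1 };
        if run == pages {
            return Some(i + 1 - pages);
        }
    }
    None
}

/// With a TLB: any `pages` free pages will do, because they can be mapped
/// to consecutive virtual addresses afterwards.
pub fn gather_pages(used: &[bool], pages: usize) -> Option<Vec<usize>> {
    let picked: Vec<usize> = used
        .iter()
        .enumerate()
        .filter_map(|(i, &u)| if u { None } else { Some(i) })
        .take(pages)
        .collect();
    if picked.len() == pages { Some(picked) } else { None }
}
```

On `checkerboard(32 * 1024 * 1024)`, `free_bytes` reports 16 MiB free, `find_contiguous(&used, 2)` finds nothing for an 8K request, and `gather_pages(&used, 2)` still succeeds.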

zachary-cauchi commented 4 months ago

That does sound very compelling. So TLB support should be our target then. Shall I create an issue to handle creating basic access routines for it?

Ravenslofty commented 4 months ago

On the one hand, I would highly appreciate that, but the MVP is drawing a triangle, and we shouldn't get too distracted from that. Let's file an issue referencing this discussion so we don't forget about it, but not focus right this minute.

zachary-cauchi commented 4 months ago

Fair enough. I'll create the associated ticket after I finish work, reference the points here, and write up a DoD for it.

Besides that, I guess the next step would be to figure out how to handle the exceptions reported by the now-available registers? Would they be able to work alongside a custom panic handler?

Ravenslofty commented 4 months ago

That seems like a reasonable idea to continue with; at the very least dump Cop0.Cause somewhere.
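For the dump itself, something along these lines might be a starting point. It is a sketch, assuming the usual R4000-style Cause layout (ExcCode in bits 6..2, BD in bit 31) and the standard MIPS exception mnemonics; the EE's extra level 2 fields are deliberately left out, and the function names are placeholders rather than anything in prussia_rt.

```rust
/// Extract the ExcCode field (bits 6..2) from a raw Cause value.
pub fn exc_code(cause: u32) -> u32 {
    (cause >> 2) & 0x1F
}

/// BD (bit 31): the exception was taken in a branch delay slot.
pub fn in_delay_slot(cause: u32) -> bool {
    cause >> 31 == 1
}

/// Standard MIPS mnemonics for the level 1 exception codes.
pub fn exc_name(code: u32) -> &'static str {
    match code {
        0 => "Int",
        1 => "Mod",
        2 => "TLBL",
        3 => "TLBS",
        4 => "AdEL",
        5 => "AdES",
        6 => "IBE",
        7 => "DBE",
        8 => "Sys",
        9 => "Bp",
        10 => "RI",
        11 => "CpU",
        12 => "Ov",
        13 => "Tr",
        _ => "reserved",
    }
}

/// One-line summary suitable for writing to a debug output such as EEOut.
pub fn describe_cause(cause: u32) -> String {
    let code = exc_code(cause);
    format!(
        "Cause={:#010x} ExcCode={} ({}) BD={}",
        cause,
        code,
        exc_name(code),
        in_delay_slot(cause) as u8
    )
}
```

A panic or exception handler could call `describe_cause` with the value read via mfc0 and print the result; how the string actually reaches EEOut is left open here.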

zachary-cauchi commented 4 months ago

Great. I'll create another ticket for that and link it in the MVP issue. I'll aim for a custom panic handler that dumps the registers to EEOut.

Ravenslofty / prussia

Add CoP0 Exception registers #18