Remove the header from repository: create an open source IPL3

anacierdem commented 3 years ago

We can add a new one once we have a non-commercial IPL3.

jago85 commented 3 years ago

What's the reason for this? To prevent copyright issues?

What should we do until we have a new IPL3? Find the current one somewhere else (or in the revision history 😉)? I mean, we cannot create a bootable ROM without the loader.

Is there any work in progress for a new IPL3?

For the UltraPIF I also need a modified version of the loader. I've always been unsure whether I'm allowed to publish this code, as it might be copyrighted. (My modifications are only a small part, to load the application from the PIF instead of the cart.)

Keep in mind that the PIF compares the checksum calculated on the IPL3 to the checksum stored in the CIC.

I have reverse engineered that checksum algorithm from the PIF ROM (https://github.com/jago85/PifChecksum), but we'd need the inverse of that algorithm. The new IPL3 needs to be modified so that it gives the same checksum, or the PIF will lock out the system.

With the UltraPIF I don't have this problem, because I'm not checking this. But I need the loader as it initializes the RDRAM stuff (where I copy the UltraPIF menu to). I started reverse engineering the RDRAM initialization but lost motivation halfway through. :laughing:
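To make the checksum constraint above concrete, here is a minimal sketch of the search it implies: keep the boot code fixed, vary one spare word, and stop when the checksum equals the value the CIC expects. `toy_checksum` is a stand-in invented for this example and is NOT the PIF algorithm (that one lives in the PifChecksum repository); `find_patch_word` and its parameters are likewise hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in checksum, NOT the PIF algorithm: any deterministic function of
   the image words is enough to illustrate the search. */
static uint32_t toy_checksum(const uint32_t *words, size_t count)
{
    uint32_t acc = 0x811C9DC5u;
    for (size_t i = 0; i < count; i++) {
        acc = (acc << 5) | (acc >> 27);        /* rotate left by 5 bits */
        acc = (acc ^ words[i]) * 0x01000193u;  /* mix in the next word */
    }
    return acc;
}

/* Try candidate values for one spare word until the checksum matches.
   The caller must ensure patch_index < count. On success the matching
   value is left in words[patch_index]. */
static bool find_patch_word(uint32_t *words, size_t count, size_t patch_index,
                            uint32_t target, uint32_t max_tries)
{
    for (uint32_t candidate = 0; candidate < max_tries; candidate++) {
        words[patch_index] = candidate;
        if (toy_checksum(words, count) == target)
            return true;
    }
    return false;
}
```

With the real algorithm the search space is far larger than in this toy, so a plain loop like this one is only meant to show the shape of the problem.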

rasky commented 3 years ago

The idea is that closing this issue requires an open source IPL3.

There is a stub here: https://github.com/hcs64/boot_stub

This already passes the checksum check in IPL2 (thanks to a GPU-accelerated cracker that was written, linked from that repo).

The complex part is writing an open source IPL3 that does something similar/compatible to the official ones, while being clean-room implemented. The main task in there is RDRAM initialization, a complex system which has already been partially reverse engineered and documented.

From an implementation standpoint, it is hard to debug what happens while the RAM is not working. My plan would be to implement a printf primitive in assembly which goes through 64drive USB (similar to what debug.c does).
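As a rough idea of what such a primitive has to do, here is a sketch in C rather than assembly: it prints a 32-bit value as hex one nibble at a time, so it needs no buffer in RAM. The `put_char_fn` hook is hypothetical; on hardware it would push a byte to the flashcart's USB channel, and nothing here reflects the actual 64drive protocol. The `sink_*` helpers exist only so the routine can be tried on a PC.

```c
#include <stdint.h>

/* Hypothetical output hook: one byte out per call. */
typedef void (*put_char_fn)(char c);

/* Print a 32-bit value as 8 uppercase hex digits, most significant nibble
   first, without any intermediate buffer. */
static void emit_hex32(uint32_t value, put_char_fn put)
{
    for (int shift = 28; shift >= 0; shift -= 4) {
        uint32_t nibble = (value >> shift) & 0xFu;
        put((char)(nibble < 10 ? '0' + nibble : 'A' + (nibble - 10)));
    }
}

/* Host-side sink for experimenting: collects characters in a small,
   zero-initialized buffer so the result stays NUL-terminated. */
static char sink_buf[16];
static int sink_len;

static void sink_put(char c)
{
    if (sink_len < (int)sizeof(sink_buf) - 1)
        sink_buf[sink_len++] = c;
}
```

Calling `emit_hex32(value, sink_put)` fills `sink_buf` with the digits; an assembly version would follow the same loop, keeping the value and the shift count in registers.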

anacierdem commented 3 years ago

Yes, it needs significant effort to implement a custom clean-room IPL3. Until then, it will live with us for a while. As it is one of the major obstacles to a fully free SDK, this will need collaboration from the whole homebrew community. I'm pretty sure someone has asm for printing stuff on screen at hand; that will be useful indeed.

jago85 commented 3 years ago

Maybe I can help someday. With the UltraPIF I can inject any binary into SP_DMEM to be executed without any checks. I can also provide a simple way to send data/characters to a terminal via the UltraPIF's UART using IO writes to the PIFRAM.

bsmiles32 commented 2 years ago

I'd like to contribute toward this goal. For some background, I'm one of the main contributors to mupen64plus and I've fully reverse-engineered the IPL3, especially the RDRAM initialization part. For proof, mupen64plus is the only emulator which implements enough of the RI/RDRAM subsystem to not bypass the current calibration process [1] (this PR dates back to 2017). I already have a partial C reimplementation of the IPL3 (and PIFBootROM for that matter), but I haven't published it, because it comes from reverse engineering the IPL3 (+PIFBootROM) blobs [and my hash collision generator wasn't fast enough to find a collision in a reasonable amount of time, so it wouldn't have been useful].

What I'd like to know is: what would be the best way to contribute toward a fully free IPL3 reimplementation? I can either provide my existing reimplementation (but it wouldn't meet the criteria for a clean-room reimplementation from documentation), or provide documentation so that others can reimplement it. If it's documentation, how should I share it (N64brew wiki? or some readme?) and what level of detail should I expose? Because my knowledge comes from reverse engineering, I've only been exposed to a specific implementation, so that's what I can document. Without more experiments (which I won't have time to do), I can't extrapolate to document how the hardware works, only how IPL3 chooses to use/init the hardware.

Well, what do you think, and how can I help?

[1] https://github.com/mupen64plus/mupen64plus-core/commit/28f7c868c1d79c7a0f5b15f1d27a89bda617ac7e#diff-f2fa8ba9eb8c629d572444f46462cda0d33984ac21c5d1bd15465481b1296bca

rasky commented 2 years ago

@bsmiles32 this is extremely good news. I think the safest clean room approach would be that you document your findings wrt RDRAM initialization here (new page): https://n64brew.dev/wiki/RDRAM_Interface

In theory, the documentation should provide a reference of the registers, and a description of the initialization / calibration process. You can go into details such as describing in words the initialization process that should be performed, mentioning also which registers to write / read at each step and with what value, as long as the rationale is explained and it's not a bare "englishization" of a disassembly, of course.

It's OK that your understanding of the hardware is limited to what you learnt by reversing IPL3. Even if the way you document it is strictly related to what you saw there (rather than coming from a more extensive knowledge), it's totally fine.

Would that work for you? It would help us a lot.

anacierdem commented 2 years ago

@bsmiles32 It would be some form of rephrassembly in plain English I suppose 😀 As long as it does not directly map to the reversed code, it should not be an issue to say things like "write this value to that register", right @rasky? There is a slight possibility that those values are opinionated, but with someone with better hw info we can do a verification and remove anything that is optional.

bsmiles32 commented 2 years ago

FYI, I've started to update the RI wiki page. It's very WIP, but I'll update it as time permits. Feel free to correct or make suggestions there. As you've suggested, I'll try to avoid "just englishizing" disassembly.

rasky commented 2 years ago

@bsmiles32 thank you, I skimmed through it and left a comment in the discussion page.

I assume there's going to be a description of the initialization process at some point. I guess it's about discovery of how many chips, and configuring them, but maybe there's something else.

Also, is there anything else in IPL3 that we should reimplement, beyond initializing RDRAM (and loading the first meg)?

bsmiles32 commented 2 years ago

For now, I will try to expand the documentation of RDRAM modules (registers, addressing, current calibration, ...) in the RDRAM page. Most of this is already at least partially covered in the linked datasheets, but I feel it is important to explain the key points in the wiki without forcing the reader to go to the datasheets. Also, that way I can put links in the RI page where needed (Current Calibration, RDRAM IdMatch, RDRAM mapping with select, ...). I'm not sure if I should describe the RDRAM initialization procedure in the RI page (because it's the controller) or in the RDRAM page (because that's what is mainly initialized)... Well, I'll see, and we can always move content later if it's more appropriate.

From my perspective, the main tasks of IPL3 are RDRAM initialization, MIPS cache initialization, and loading the 1st Mbyte of user code. Other aspects, like integrity checks on user code, are of limited interest for an open source reimplementation of IPL3. (I can still document them, but they don't have to be reimplemented the way the original did them.) The other minor initializations in IPL3 are just conventions, or cleanup of boot procedure leftovers.
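Of the three tasks, the user code loading is the easiest to picture, so here is a host-side sketch of it as a plain bounds-checked copy between two buffers. The ROM offset 0x1000 is my assumption based on the commonly documented cartridge layout, the function and constant names are made up for this example, and on hardware the copy would be a PI DMA rather than a `memcpy`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROM_CODE_OFFSET 0x1000u    /* assumed start of user code in the ROM */
#define BOOT_COPY_SIZE  0x100000u  /* the "1st Mbyte" of user code */

/* Copy the first MiB of user code from a ROM image into a RAM buffer at
   ram_offset. Returns 0 on success, -1 if either buffer is too small. */
static int load_first_mbyte(const uint8_t *rom, size_t rom_size,
                            uint8_t *ram, size_t ram_size, size_t ram_offset)
{
    if (rom_size < ROM_CODE_OFFSET + BOOT_COPY_SIZE)
        return -1;
    if (ram_offset > ram_size || ram_size - ram_offset < BOOT_COPY_SIZE)
        return -1;
    memcpy(ram + ram_offset, rom + ROM_CODE_OFFSET, BOOT_COPY_SIZE);
    return 0;
}
```

The important ordering constraint is that this step can only run once RDRAM initialization has finished, since the destination is RDRAM itself.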

For now, I will focus on the RDRAM initialization part, which is the least known, and what blocks a free reimplementation of IPL3.

Also, thanks both for your feedback.

bigbass1997 commented 2 years ago

> I'm not sure if I should describe the RDRAM initialization procedure in the RI page (because it's the controller) or in the RDRAM page (because that's what is mainly initialized)... Well I'll see and we can always move content later if it's more appropriate.

I suspect it would be best to place this procedure description in the RDRAM page, at least for the time being. Like you said, that is what is being initialized, not the RI. I know that, as the RI is the controller, it does have control over the operation of the chips, but the initialization specifically describes what's happening to the chips (as far as I understand).

> Other aspects like integrity checks on user code, is of limited interest for an opensource reimplementation of IPL3. (I can still document them, but they're not mandatory to be reimplemented like they did).

As far as documentation goes, I would very much like to see a detailed explanation of the entire IPL3, no matter how boring or insignificant it might be. Ideally, both what is happening, and how it's being performed (e.g. writes to certain locations/registers, what values are involved, etc). I had started documenting IPL1/2 already (https://n64brew.dev/wiki/Initial_Program_Load). IPL1 is done, as it's rather short, and I have a lot of notes on IPL2 that I haven't put there yet.

Edit: To clarify, I do not want a disassembly of IPL3 posted online at this time. And my comments here are directed toward documentation of the IPL, not specifically about libdragon.

Also, thank you bsmiles for using the same templates on the RI page as I made for the other interfaces, for documenting the memory-mapped registers. I've been meaning to get to the RI and PI pages (mainly just for the register docs), but have been busy with other projects. There's a channel on the N64brew discord for wiki discussions, if you're interested (https://discord.gg/WqFgNWf). Thanks for your contributions to the wiki!

rasky commented 2 years ago

@bsmiles32 hi! Did you finish your work on the wiki? Are you still going to write more?

bsmiles32 commented 2 years ago

Not finished yet! But finding the time to write is difficult for me. I will continue to write some more in the coming weeks. Feel free to ask for clarifications, or for topics I should cover if they're missing. The hardest part to explain is the current calibration, because the implementation does things I don't fully understand; there may be a rationale behind them, but I'm not sure what it is.

rasky commented 8 months ago

The open source IPL3 is now a reality in the preview branch: https://github.com/DragonMinded/libdragon/tree/preview/boot

Scroll down to the README for more information, and feel free to inspect the source code.

Many thanks to all the people that helped towards this ambitious goal: bsmiles32, Thar0, Korgeaux, phire, devwizard, Jhynjhiruu, HailToDodongo, and I'm sure I missed someone.

As per our policy, we will close this issue once it's merged to stable. We'll take our time here, we want to make sure it gets wide testing.

bsmiles32 commented 5 months ago

Congratulations on this huge milestone! I've finally looked a little at the implementation and I have some remarks:

- https://github.com/DragonMinded/libdragon/blob/unstable/boot/rdram.c#L373 Here an RDRAM register is read before current calibration is done, which should normally result in a corrupted value being read. Shouldn't this test be deferred until after current calibration, like around https://github.com/DragonMinded/libdragon/blob/unstable/boot/rdram.c#L391 ?
- At 2 locations the "FR" (or bit 12) is written into the MODE register. What is this bit? It's not documented anywhere. Shouldn't it be "X2" (also called CCMult or bit 6 in LittleEndian notation), which is not currently set but should be according to the datasheets and the original IPL3?

rasky commented 5 months ago

> https://github.com/DragonMinded/libdragon/blob/unstable/boot/rdram.c#L373 Here an RDRAM register is read before current calibration is done, which should normally result in a corrupted value being read. Shouldn't this test be deferred until after current calibration, like around https://github.com/DragonMinded/libdragon/blob/unstable/boot/rdram.c#L391 ?

The current calibration is needed to access the actual memory contents. Chip registers just require the initial RI current calibration (RI_CONFIG = 0x40), and obviously the correct REG_DELAY (for which the wiki documents our findings with respect to using the special MI Repeat Mode, and the rotation of the value being written). After these steps, you can access RDRAM chip regs just fine.

BTW, it's quite easy to experiment in this area with the new IPL3. If you have access to the hardware via a USB flashcart, just stick a debugf() call and dump some register content. Run make, and then run the dev version of IPL3.

For instance, I've just added a print to dump the contents of REG_DEVICE_TYPE just before the REG_MODE read, and I get this (for all chips):

type:   100019B4

which shows that the current read was performed.

> At 2 locations the "FR" (or bit 12) is written into the MODE register. What is this bit? It's not documented anywhere. Shouldn't it be "X2" (also called CCMult or bit 6 in LittleEndian notation), which is not currently set but should be according to the datasheets and the original IPL3?

It's documented in the "Concurrent RDRAM® User Guide" by Rambus, linked in the wiki. After reset, the RDRAM chips are in "suspend power mode", and the FR bit seems needed to get them out of power saving mode.

In that PDF, REG_MODE is a bit different from the wiki (it's actually a subset) and X2 does not exist. I tried tinkering with it but it didn't change anything, so I left it out.

bsmiles32 commented 5 months ago

Thanks for the detailed answer.

Regarding the register read before current calibration, this goes against the programming recommendation (for instance, LG datasheet p. 82: "This calibration process must take place before the controller performs any register or memory reads or any acknowledge responses"). But as evidenced by your tests, it doesn't seem to be necessary... so I don't know, maybe the datasheet is wrong (or at least overly cautious), or we don't fall into a case where the problem would manifest (for instance, we don't have enough RDRAM modules chained together to make the current calibration necessary). Maybe adding this detail to the wiki and/or to the code as a comment would clarify that.

Regarding the "FR" bit, it is indeed mentioned in the "Concurrent RDRAM" datasheets, but "Concurrent RDRAM" is a later version of the RDRAM modules. The N64 only uses "Base" RDRAM, which is the first version of such RDRAM modules. You can see that REG_DEVICE_TYPE.version = 0 for the N64 RDRAM modules, but it should be 2 for Concurrent RDRAM. So the FR bit is not applicable here and should be left at 0. Conversely, the X2 bit should be written with 1 according to the datasheets, and that's what the original IPL3 does. Some more details about the effect of X2 are given in the Toshiba datasheet, in the "Iol Calibration" paragraph almost at the end of the datasheet.
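To spell out what I mean, here is a small sketch of the MODE value adjustment for Base RDRAM. The bit positions are simply the ones quoted earlier in this thread (bit 6 for X2, bit 12 for FR); the actual encoding used when writing the register may differ, and the macro and function names are made up for this example, not taken from libdragon.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in this thread; illustrative only. */
#define MODE_X2 (1u << 6)   /* "X2" / CCMult */
#define MODE_FR (1u << 12)  /* "FR", described only for Concurrent RDRAM */

/* REG_DEVICE_TYPE.version is 0 for the Base RDRAM found in the N64. */
static bool is_base_rdram(uint32_t device_version)
{
    return device_version == 0;
}

/* For Base RDRAM: make sure X2 is set and FR is left at 0, keeping all
   other bits of the MODE value unchanged. */
static uint32_t base_rdram_mode(uint32_t mode)
{
    return (mode | MODE_X2) & ~MODE_FR;
}
```

For example, `base_rdram_mode(MODE_FR)` yields `MODE_X2`: the FR bit is dropped and X2 is added.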

rasky commented 5 months ago

> Regarding the register read before current calibration, this goes against the programming recommendation (for instance, LG datasheet p. 82: "This calibration process must take place before the controller performs any register or memory reads or any acknowledge responses"). But as evidenced by your tests, it doesn't seem to be necessary... so I don't know, maybe the datasheet is wrong (or at least overly cautious), or we don't fall into a case where the problem would manifest (for instance, we don't have enough RDRAM modules chained together to make the current calibration necessary). Maybe adding this detail to the wiki and/or to the code as a comment would clarify that.

I'll note that the current calibration process requires reading the current from the REG_MODE register (when in auto mode). If that wasn't possible, then I don't know how we could ever use the auto mode.

> Regarding the "FR" bit, it is indeed mentioned in the "Concurrent RDRAM" datasheets, but "Concurrent RDRAM" is a later version of the RDRAM modules. The N64 only uses "Base" RDRAM, which is the first version of such RDRAM modules. You can see that REG_DEVICE_TYPE.version = 0 for the N64 RDRAM modules, but it should be 2 for Concurrent RDRAM. So the FR bit is not applicable here and should be left at 0. Conversely, the X2 bit should be written with 1 according to the datasheets, and that's what the original IPL3 does. Some more details about the effect of X2 are given in the Toshiba datasheet, in the "Iol Calibration" paragraph almost at the end of the datasheet.

Thanks, I'll do some further tests in this regard and report back.

rasky commented 2 months ago

@bsmiles32 The fix for X2/FR just went in here: https://github.com/DragonMinded/libdragon/commit/b5a37d84574c3761fa946da6b5280b65f354ccf1

X2 seems to affect the way the current value is read and thus affects calibration, but we couldn't specifically pinpoint an improvement. Maybe just broader compatibility?

It will be included in the next signed binary once we release it. There are also more things that we are committing to the codebase (mostly small fixes or improvements, nothing big). You can have a look at the history of the directory: https://github.com/DragonMinded/libdragon/commits/preview/boot

Thanks for the review and for reporting this bug.

DragonMinded / libdragon

Remove the header from repository: create an open source IPL3 #158