Major rework (WARNING: do not use on unofficial cards)

vpelletier commented 2 years ago

Should hopefully fix #4, #2, #1.

The included 3D-printable model may help with #3.

On my Pi (an old B+), SPI at 16 MHz is working perfectly. I connected the SPI signals to SPI bus 0 with CS 0, and the INT pin to GPIO 25, just because it is right next to the SPI pins.
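For readers matching this wiring to the `--spi /dev/spidev0.0` argument used below: Linux spidev nodes are named `/dev/spidevB.C`, where B is the bus number and C the chip-select. A minimal sketch of that mapping follows; `parse_spidev_path` is a hypothetical helper written for illustration and is not part of adapter.py.

```python
import re

def parse_spidev_path(path):
    """Split a spidev node name such as /dev/spidev0.0 into (bus, chip_select).

    Hypothetical helper for illustration only; not taken from adapter.py.
    """
    match = re.fullmatch(r"/dev/spidev(\d+)\.(\d+)", path)
    if match is None:
        raise ValueError(f"not a spidev device node: {path!r}")
    return int(match.group(1)), int(match.group(2))

# SPI bus 0 with CS 0, as in the wiring described above.
bus, chip_select = parse_spidev_path("/dev/spidev0.0")
print(bus, chip_select)
```

With a different wiring (for example CS 1 on the same bus), the node would presumably be `/dev/spidev0.1` instead.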

Sample output:

$ ./vpy3/bin/python adapter.py \
  --spi /dev/spidev0.0 \
  --gpiochip /dev/gpiochip0 \
  --gpio-int-line 25 \
  -r card.raw
exi_id: 00000010
  card size:        2097152
  turnaround bytes: 4
  sector size:      8192
  sector count:     256.0
status: CardStatus.UNLOCKED|INT_EN|READY
interrupts already enabled
[snip]
serial:    [snip]
time:      [snip]
bias:      [snip]
lang:      0
device ID: 0
size:      16 Mb
encoding:  1
100%|████████████████████| 4096/4096 [00:12<00:00, 331.58it/s]
$ ./vpy3/bin/python adapter.py \
  --spi /dev/spidev0.0 \
  --gpiochip /dev/gpiochip0 \
  --gpio-int-line 25 \
  -w old.raw new.raw
exi_id: 00000010
  card size:        2097152
  turnaround bytes: 4
  sector size:      8192
  sector count:     256.0
status: CardStatus.UNLOCKED|INT_EN|READY
interrupts already enabled
[snip]
serial:    [snip]
time:      [snip]
bias:      [snip]
lang:      0
device ID: 0
size:      16 Mb
encoding:  1
100%|████████████████████| 256/256 [00:00<00:00, 607.44it/s]
updated 1 blocks
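
The geometry figures printed above are related by simple arithmetic. The sketch below uses only the numbers from the sample output; the variable names are illustrative and not taken from adapter.py. The `256.0` in the output suggests a true division is used somewhere, which is an assumption on my part; floor division would keep the count an integer.

```python
# Values copied from the sample output above.
card_size = 2097152   # bytes
sector_size = 8192    # bytes

# True division yields a float, which would explain the "256.0" shown above.
sector_count_float = card_size / sector_size
# Floor division keeps the sector count an integer.
sector_count = card_size // sector_size

# Card size in megabits, taking 1 Mb as 2**20 bits (matches "size: 16 Mb").
size_megabits = card_size * 8 // 2**20

print(sector_count_float, sector_count, size_megabits)
```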

Some disclaimers:

I have only tested this with 3 official cards. I had purchased an unofficial card, which I unfortunately bricked early in this development effort (it may have received 5V on the wrong pin, I am not sure).
Erase + write support is working, but I have not exercised it much.
I have not used some methods at all, like write_buffer and erase_card. I do not understand what some of these are even supposed to do (write_buffer, get_id, clear_status, wake_up).

vpelletier commented 2 years ago

Updated output sample:

$ ./vpy3/bin/python adapter.py --spi /dev/spidev0.0 --gpiochip /dev/gpiochip0 --gpio-int-line 25
card size (B):    2097152
turnaround bytes: 4
sector size:      8192
sector count:     256.0
flash id:         418330934a454f521fca1c75
id:               c221
status:           CardStatus.BUSY|UNLOCKED|INT_EN|READY
header:
  serial:    e6632c9ff127fc94c965844d (decoded: 418330934a454f521fca1c75)
  time:      000cd8bdc9721dd4
  bias:      49302254
  lang:      0
  device ID: 0
  size:      16 Mb
  encoding:  1
header serial is consistent with card id
header checksum consistent

I believe I am mostly done with this rework (it achieves what I need, and the code, while never perfect, seems clean enough). Review & testing welcome.

RSDuck commented 1 year ago

Thank you so much for bringing the unlock sequence into this much more readable form.

I was wondering about the LFSR, as after the initialisation it could be possible that there's a collision between the lowest bit of the seed and the next new value.

Due to the seed always being bitrev(0x7FEC8|something) the lowest bit will always be 0, so it's not an issue in practice.

Thinking about it a bit more, if I'm not mistaken, I'm wondering whether it's a 31-bit LFSR rather than a 32-bit one, as the next value is effectively inserted into the second bit?

vpelletier commented 1 year ago

> I was wondering about the LFSR, as after the initialisation it could be possible that there's a collision between the lowest bit of the seed and the next new value.

Indeed, and this is arguably a bug in my implementation if this class is taken out of context.

> Due to the seed always being bitrev(0x7FEC8|something) the lowest bit will always be 0, so it's not an issue in practice.

IIRC I took this value from libogc's source, which uses a lot of statements to produce this value with a super-tiny amount of entropy (less than 8 bits IIRC). I believe the randomness is useless in such an implementation, as this code is not trying to confuse a data sniffer (which I guess the GC's firmware is trying to do, and by extension libogc). So I just picked one of the values libogc can produce and ran with it.

Also, it is not just the last bit which is always initialised to zero, but the last 12 bits.

My guess is that the card has the very same LFSR and loads into it the address of the first readpage instruction received, and this value is only 19 bits long. The value sent to the card would be loaded into the LFSR "away" from the end where the new bit gets shifted in (plus an implied 20th bit set to zero where the LFSR output bit is), so there are always 12 zeroes next to it. I guess this is why several bytes are read and discarded in the handshake, each byte shifting the LSB some more and getting rid of that sequence of zeroes in the keystream.

So setting any of the 12 LSbs will desynchronise the ciphers, breaking the handshake, as the card has no way of knowing those bits.
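
To make the bit positions concrete, here is a small sketch of a bit reversal applied to the constant quoted in this thread. It assumes the reversal operates on a 32-bit word and leaves out the "something" part; `bitrev32` is an illustrative name, not necessarily how the actual code spells it.

```python
def bitrev32(value):
    """Reverse the bit order of a 32-bit unsigned integer."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

# 0x7FEC8 fits in 19 bits, so its reversal leaves the low bits clear.
seed = bitrev32(0x7FEC8)
print(f"{seed:08x}")            # prints 137fe000
print(seed & 0xFFF)             # prints 0: the 12 LSbs are zero

# Setting the seed's LSb maps back to bit 31 of the un-reversed value,
# far outside the 19-bit range.
print(f"{bitrev32(seed | 1):08x}")  # prints 8007fec8
```

Under these assumptions, any bit set among the 12 LSbs of the seed corresponds to a bit above the 20 low bits of the un-reversed value.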

> Thinking about it a bit more, if I'm not mistaken, I'm wondering whether it's a 31-bit LFSR rather than a 32-bit one, as the next value is effectively inserted into the second bit?

This is correct: the new bit is OR'ed on bit 0 and immediately shifted left to be ready for the next round.

It may be easier to think about it in hardware-description terms: this is a 31-bit register (one bit of which is the output) fed from one combinatorial bit. The production of the input bit is "permanent", and it only gets fixed when shifted into the first register bit. In this implementation I picked a 32-bit value as it seems less surprising as a software implementation.
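
The shift step described above can be sketched as follows. This is only an illustration of the "OR onto bit 0, then shift left" pattern: the tap positions and the inverted-XOR feedback are placeholders chosen for the example, not the values used by the card or by this code.

```python
MASK32 = 0xFFFFFFFF
# Hypothetical tap positions, chosen only for illustration.
TAPS = (31, 23, 15, 7)

def feedback(state):
    """Combine the tap bits into one new bit (inverted XOR, as an assumption)."""
    bit = 1
    for tap in TAPS:
        bit ^= (state >> tap) & 1
    return bit

def step(state):
    """OR the new bit onto bit 0, then shift the whole register left by one."""
    return ((state | feedback(state)) << 1) & MASK32

# Bit 0 is empty after every step, so only 31 bits ever carry state.
state = 0x137FE000
for _ in range(64):
    state = step(state)
    assert state & 1 == 0

# A seed with bit 0 set hides a zero feedback bit: the OR cannot clear it.
clean = 0x00000080            # feedback(clean) is 0 with these taps
dirty = clean | 1
print(feedback(clean), hex(step(clean)), hex(step(dirty)))  # prints 0 0x100 0x102
```

The last line shows the collision raised earlier in the thread: with bit 0 set in the seed, the first step behaves as if the feedback bit had been 1.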

BTW, my code is not efficient: it reads only the MSb to produce the key stream. It could instead read the entire state of the LFSR right after producing the 32nd bit to get 32 bits' worth of key stream. It could then do 32 rounds to get the next 32 bits, only taking care of the taps, then read the entire state, and so on. I guess the current implementation is a bit more readable (it uses a more familiar LFSR access pattern and does not need extra code to handle reads of fewer than 32 bits), and this is not performance-critical: this is only needed to unlock the card, and then the slow part is actually reading/writing the content of the card.

DeadlySurgeon commented 8 months ago

@vpelletier with the 3D model you included, what parts end up filling it out?

vpelletier commented 8 months ago

I use simple wires, stripped so they make contact with the memory card's connector, and with a piece of tape on the other side so the wires cannot touch the card's shield. I'm not sure I can recommend multi-strand wires, as I used in the picture below... Beware of loose strands shorting stuff. I wired almost the minimum number of pins: the red wire is not connected anywhere. It's fiddly to put together and it does not always work, but I think it beats having to solder to the card's pads, and it does not require salvaging a connector from a console.

Note: the pictured model was an earlier iteration, with wire guides coming through the connector at 45°. It did not print cleanly and I had to use a drill bit to open the holes and get the wires through. The 90° angles of the included model should print cleaner. Also, I used Kapton as tape, but any thin tape should work - just make sure the wires do not punch through.

DeadlySurgeon commented 8 months ago

Ended up buying a broken GC, and pulled this off. Let me tell you, it was more difficult than I thought it would be, and I destroyed the first port. As much as I love to use this with a full-on Raspberry Pi, would it be doable to port some of this over to an RP2040?

vpelletier commented 8 months ago

I have never used an RP2040, so I have no idea. All I can say is that the card unlocking handshake does not rely on fancy operations (no values wider than 32 bits, only simple arithmetic & logic). The card header decryption is a different beast; my code mostly deals with it only as a sanity check for the unlocking handshake.

DeadlySurgeon commented 8 months ago

@vpelletier I didn't notice any header decryption needed after the dump is formed, what part are you talking about?

vpelletier commented 8 months ago

That was about the GCMHeader class, which is not involved in dumping the data itself but is involved in producing the printed output.

vpelletier commented 8 months ago

WARNING: do not use on unofficial cards

So far, 3 unofficial cards are known to have been somehow bricked while using this code:

mine, as written in an earlier post on this merge request
two of @DeadlySurgeon's, as reported on #3

Unofficial memory cards report themselves as unlocked, so the unlocking handshake should not even run, and should probably not be related to this issue. The rest of the code comes from my understanding of libogc2, which is AFAIK used in homebrew to access memory cards on-console. Still, something must be wrong there.

There is of course the possibility that something else happened (I initially suspected a bad wiring in my case, which could have sent 5V to the 3.3V logic), but at three bricked cards I think this is looking quite bad for my code.

So it looks like some work reversing unofficial cards is needed. Hopefully there are not many variants. Mine is apparently (I threw away the packaging long ago) a Mcbazel MT-000002, which is a "256MB" (that's 256Mb in proper non-scoundrel units, which is truly 32MB) card, with a button switching between two internal banks. It has some extra pads internally, probably for in-system programming. The flash chip has still-visible markings, but the main chip (some MCU, certainly) has been sanded flat.

DeadlySurgeon commented 8 months ago

For documentation's sake, the memory cards I was using were/are Mcbazel 1024MB (16344 Blocks) Memory Card (US Amazon Link).

jamchamb / gc-memcard-adapter

Major rework (WARNING: do not use on unofficial cards) #5