vpelletier / cartreader

A shield for the Arduino Mega that can back up video game cartridges.
GNU General Public License v3.0
0 stars 0 forks source link

[N64] Eeprom reading/verifying broken #4

Closed sanni closed 2 years ago

sanni commented 2 years ago

Tested with Donkey Kong 64, 2KB eeprom. Writing with 10.4 trim works, tested by reading back with old V10.4 before trim. But reading/verifying does read completely different data with 10.4 trim.

Writing...
//TEST/write_104_trim.eep
Done

Verifying against //TEST/write_104_trim.eep
Error: 1538 bytes 
did not verify
Press Button...

write_104_trim.zip read_104_trim.zip

vpelletier commented 2 years ago

Should be fixed by a89efdc803c962cb6a5a7b4b8de8273dfa07df23 (tested with Mario Tennis, also 2KB eeprom).

sanni commented 2 years ago

Hangs indefinitely on "Reading Eeprom..." now.

[+] Nintendo 64 (3V)
[+] Game Cartridge
Searching database...
Title: DONKEY KONG 64
Serial: NDOJ
Revision: 0
ROM Size: 32 MB
Save Type: 16K EEPROM
CRC1: 053C89A7
Press Button...
[+] Read Save
Reading EEPROM...
sanni commented 2 years ago

Could be the timing of the long pulse, 2.66us might be too short compared to the 3us needed.

wgggagagaga

eeprom_read

vpelletier commented 2 years ago

Then I guess there are more timings which will need to be fixed, but this will become visible when writing the eeprom and not when reading. On my end, Mario Tennis still reads fine.

In any case, I increased the stop bit delay by (3us - 2.66us)/62.5ns = 5.44 NOPs, rounded to 5: b70df256beeb062dc5ef5bea89896d1160669140

Could you check more timings ?

BTW, isn't this the same protocol as with the controller ? I realise now that the functions for both are separate. Looks like there is more program space to be saved, but now I am worried about the timings.

[EDIT] units

sanni commented 2 years ago

It now works more often than not but still hangs sometimes. Attached is a logic capture of a complete read and write

You need this program(linux, osx, windows supported): https://support.saleae.com/logic-software/legacy-software/older-software-releases#1.2.18-download

read_eeprom_4k.zip write_verify_eeprom_4k.zip

vpelletier commented 2 years ago

I increased the stop bit delay

Reading the traces, I have no idea why I thought the problematic delay was in the stop bit. The issue is the long delay of regular bits, right ?

Actually, I think now that N64_EEPROM_STOP_BIT_HIGH_DELAY should be unnecessary: after a stop bit the CPU switch to listening mode, where it waits for the data line to go low without timing anything.

Anyway, thanks a lot for the traces, this is super useful. I have tweaked the delay macros and added comments about how the number of NOP was chosen: fbfae3524960dc4e144743d36665362f7996a33b (sadly no time to test this right now, will test this evening) .

There is still something I do not quite get about this protocol: how does the device even detects a host NOP ? Looking at the traces, there is no pause between bit nor bytes read which would leave time for the host to pull the data line low. If it does so while the device sends a 1, it would be invisible on the waveform (barring small timing mismatches, much shorter than the protocol delays). Is the pause present in read commands only after 8 bytes sent, and hence invisible in the trace (because the stop bit did happen) ?

sanni commented 2 years ago

Here are some captures from a real console: read_console.zip write_console.zip

You need Logic2 to open them, they changed the format from Logic1: https://www.saleae.com/downloads/

sanni commented 2 years ago

And here the "fixup! fixup!" code when it locks up while reading(Reading EEPROM... displayed on screen forever):

read_lockup.zip

I think now the 1us high pulse is too long with 1.21us. If you add the Gamecube Analyzer (which isn't a perfect fit for N64 EEPROM but it gives an idea about timing) you'll see that the wave is getting shifted away from the middle dot when it's not perfectly 3us/1us

"fixup! fixup!" esgwtgsegsgs

real console geadgasgagaag

vpelletier commented 2 years ago

Actually, I think now that N64_EEPROM_STOP_BIT_HIGH_DELAY should be unnecessary: after a stop bit the CPU switch to listening mode, where it waits for the data line to go low without timing anything.

This turns out to be wrong. Not sure why (the next to talk should either be the host starting a new transaction after a massive 600us delay, or the device to which these NOPs with floating data line make zero difference), but I still cannot wrap my head around how the stop bit works in this protocol anyway.

Another pair of notes about the 600us delay:

Anyway, I readded it, which un-broke my ability to dump mario tennis. Then I dropped one NOP from the short data bit delay. here is the commit: c3e5e97b65101943f473f36e6b5448ee0fd6dc8f .

On a tangentially related note: I am now getting a lot of timeouts while verifying the firmware of the control board... I've tried with CLK1 on and off, and it makes no difference. I've tried a different, supposed-good USB cable. There are no obvious kernel warnings. Is my board dying already ?

vpelletier commented 2 years ago

Anyway, I readded it, which un-broke my ability to dump mario tennis.

Checking more gamecarts, I found 21 which seem to be more sensitive than the rest: Mario Party 3 and Perfect Dark. TheseIt currently fails, so there is more to fix.

vpelletier commented 2 years ago

And one more fixup: fc8db21fd36fd1d1c99b15c3b19da4556fdf5ea2 . Now I cannot seem to get a dump to fail: I tested all my eeprom carts (7) dumping each at least 5 times in a row.

vpelletier commented 2 years ago

BTW, isn't this the same protocol as with the controller ? I realise now that the functions for both are separate. Looks like there is more program space to be saved, but now I am worried about the timings.

I finally snapped out of my misunderstanding: the controller has only power and one data line, whereas the EEPROM also has access to the clock line.

So it is the same protocol, but in one case (controller) the device is using the clock mixed into the data line, while in the other the device require a separate clock to be produced.

I'm still hoping I can get both implementations to converge, and I have the controller version cycle-perfect (...from what I can count, from successful use including reading and writing an MPK), but I still fail at reading the EEPROM with CLK1 disabled. I have not even tried to write the EEPROM. I have not tried writing to the EEPROM with CLK1 disabled. With CLK1 enabled, everything is working fine.

sanni commented 2 years ago

I don't think the clock is mixed into the data line for controller, the controller has its own 2MHz crystal on the PCB and uses that.

I have no issue with just dropping the CLK1 disabled eeprom functions completely since on HW5 you can't even disable CLK1 and for HW3, people can just buy the clock generator if they don't have it already.

vpelletier commented 2 years ago

I have no issue with just dropping the CLK1 disabled eeprom functions completely

This simplifies my task a lot, thanks.

I just pushed the N64 patch stack to the same trim branch. This version requires clockgen to dump & write eeproms, and brings large program & global ram spaces saving. Sadly, this is still not enough to fit everything: 646 bytes to go on HW3 (with the 3 display driver features re-enabled).

sanni commented 2 years ago

I tested the "Factorise JoyBus" commit in the Trim branch. Controller Test: works Controller Pak read: works Controller Pak write: works Eeprom read: works Eeprom write: many errors

OSCR HW5 V11.0
[+] Nintendo 64 (3V)
[+] Game Cartridge
Searching database...

Title: DONKEY KONG 64
Serial: NDOJ
Revision: 0
ROM Size: 32 MB
Save Type: 16K EEPROM
CRC1: 053C89A7
Press Button...

[+] Read Save
Reading EEPROM...
Saved to N64/SAVE/DONKEY KONG 64/609/
Press Button...

[+] Write Save
Select eep file
[+] /TEST
Select eep file
[+] 64_1024.fla

Writing...
//TEST/64_1024.fla
Done
Verifying against //TEST/64_1024.fla
Error: 1521 bytes 
did not verify
Press Button...

I have first used V10.2 to write a random pattern to the eeprom. Then read it back successfully with the "Factorise JoyBus" commit. Then wrote a file filled with 0x64 to it which failed. Attached is the read-back after the write and the logic captures of all tests and a compiled N64Analyzer.dll for Logic 1.2.18 from this source code. Also attached is a write to the eeprom with V10.2 as comparison.

controllertest.zip cpak.zip eeprom.zip N64Analyzer.zip 10.2_write_0x64_to_eeprom.7z.zip

Untitled

The 5.41us instead of 3us is too much I think and i think it needs longer pauses between pages when writing. Your new code has 2ms, my old code has 50ms.

Untitled

vpelletier commented 2 years ago

The 5.41us instead of 3us is too much I think and i think it needs longer pauses between pages when writing. Your new code has 2ms, my old code has 50ms.

Wow, what's going on there ?

For the former I guess I have to re-count all cycles. I would expect controller stuff to break with such bad timings. I am very tempted to just write the whole function in assembly, to make sure it does not move.

For the latter, I am even more puzzled. Isn't delay(50) supposed to last about 50ms ? I have not checked its implementation, maybe it is being disturbed by the interrupts being disabled ?

EDIT: looks like fixing the latter should make things look like they are working:

$ hd DONKEY\ KONG\ 64_write_error.eep 
00000000  64 64 64 64 64 64 64 64  d2 63 f5 c9 1a 09 d4 14  |dddddddd.c......|
00000010  c3 15 93 0a 60 4f 36 4f  64 64 64 64 64 64 64 64  |....`O6Odddddddd|
[...]

Roughly a quarter of the eeprom contains 0x64, so I guess this is mostly down to the bad inter-page delay. But I would very much prefer to have all timings correct.

sanni commented 2 years ago

I think delay() doesn't work when interrupts are disabled.

vpelletier commented 2 years ago

I am very tempted to just write the whole function in assembly, to make sure it does not move.

I gave in to the temptation. Hopefully the ASM is commented enough to be understandable (...making the source very verbose in the process).

I found at least part of the bug: I miscounted the cycles in the delay loop I added: it was about 50% longer than intended (!) as I counted a conditional jump a 1 cycle instead of the actual 2 when the loop continues. And I think I made an off-by-one mistake on the number of loops.

This, plus the broken delay because of disabled interrupts, should explain the sorry state the code was in.

I pushed the updated version, which now also drops the need to pack received bit stream into bytes: there is enough time to do this while waiting for the next bit.

I've also added a few new non-N64 cleanups, and overall the current trim branch saves 4462 bytes of code and 396 bytes of global ram space on HW3 with all modules enabled (and oled driver with default options):

Sketch uses 243172 bytes (95%) of program storage space. Maximum is 253952 bytes.
Global variables use 5914 bytes (72%) of dynamic memory, leaving 2278 bytes for local variables. Maximum is 8192 bytes.
sanni commented 2 years ago

This works beautifully. 👍

write_eep.zip

vpelletier commented 2 years ago

Ah, this is what I wanted to see :) . Thanks. I opened a merge request with this code, modulo a typo fix in a commit message.

vpelletier commented 2 years ago

...and you already merged it. That was fast :) .