[N64] Dumping ROM with fastcrc enabled takes 33% more time than before

vpelletier / cartreader

A shield for the Arduino Mega that can back up video game cartridges.

GNU General Public License v3.0

0 stars 0 forks source link

[N64] Dumping ROM with fastcrc enabled takes 33% more time than before #6

Closed sanni closed 1 year ago

sanni commented 1 year ago

fastcrc is enabled by default on HW4/5 but can also be enabled on HW3 in line 118.

10.4

Saving to N64/ROM/DONKEY KONG 64/405/...
[*******************]
CRC32... 919F7E74 -> Donkey Kong 64 (Japan).z64
Done (171s)

10.4 Trim

Saving to N64/ROM/DONKEY KONG 64/406/...
[*******************]
CRC32... 919F7E74 -> Donkey Kong 64 (Japan).z64
Done (228s)

vpelletier commented 1 year ago

This is likely caused by a3892f6d05fccc81605159d057caae00653b3a13: the compiler disregarded inline so there are 2 extra calls per iteration. I guess a better change would be to make these macros, then the compiler has to obey the inlining, and at least there is only one location where the CRC code is implemented.

vpelletier commented 1 year ago

I pushed a few fixes, bringing the time to dump an 8MB cart from 64s to 52s: da6923ea85e55ccdd5191d06523fd01fde635591, 1c303eefec0156b8572270f5e741b0b6d6173408 , da46363e1ee0e399bad20a99ed8e77369aabaed9

But this is still not back to what I get with my current upstream base (3b0a046d79e4b804d306fc774c796e48073299e8): 48s.

vpelletier commented 1 year ago

If I cheat I can shave off 2 seconds from that time by improving code I had not yet modified (get_line), but this does not address the 4s/10% loss - which I still fail to identify.

I guess the next best thing would be to bisect my changes, but this likely requires that I fold all the fixups into their original commit, rewriting history. Is it ok for you ?

sanni commented 1 year ago

Sure.

vpelletier commented 1 year ago

This has been a pain, but I found it:

updateCRC (before my changes) stores the computed table index into a 32bits variable masked to 8 bits, unlike the inline N64 version which stores it into an 8 bits directly.

vpelletier commented 1 year ago

Beware: the trim branch now does not contain most of my N64 work. So if you closed this issue by re-testing this branch, the issue would appear as fixed but for the wrong reason.

I am still trying to sort out my N64 changes into reasonable commits, as I have done a lot of changes, and the intermediate commits step on one another:

$ git diff --stat vpelletier/trim..trim N64.ino
[...]
 1 file changed, 285 insertions(+), 880 deletions(-)

sanni commented 1 year ago

It's even faster than originally now 🚀

Saving to N64/ROM/DONKEY KONG 64/617/...
[*******************]
CRC32... 919F7E74 -> Donkey Kong 64 (Japan).z64
Done (166s)

vpelletier commented 1 year ago

Awesome, thanks for rechecking.