emukidid / cleanrip

Disc Backup Tool for GC/Wii Discs
GNU General Public License v2.0
300 stars 35 forks source link

Fix crash on verify disk and other minor fixes #63

Closed ac90b671 closed 4 years ago

ac90b671 commented 4 years ago

This pull request has 3 unrelated changes:

I'm more than happy to break the changesets up, but if you're happy with taking the changeset wholesale and don't care about the mixed concepts, it's easier for me this way.

The verify_findMD5Sum(...) fix: The crash was happening inside the call to mxmlIndexNew(...). From looking at the source code here, I noticed the index generation calls realloc every 64 entries. At best this is slow. I suspect the crash is either from too many or to large of a realloc call that the wii can't handle. I didn't look into it too much further since it was easy enough to not make the index. Conceptually, the index does a linear search over the xml string, doing a ton of reallocs along the way. Then we use the index to do exactly 1 linear search over the index. In this case, the index isn't really helping at all.

There's a couple ways to fix this. The solution I'm proposing is fairly simple and clean, but there were a few concessions I had to make. The change I made was to make mxml do a linear search for the key-value pair of the md5 value itself. Doing it this way should be O(N), same as the old code, but doesn't heap allocate at all and does one O(N) pass rather than 2 O(N) passes, so it should, in practice, be much faster.

Because mxml doesn't have an interface for case insensitive searching, I'm changing the md5 and sha1 values cleanrip generates to be lower-case to match what is present in the dat files. This is a behavior change. The old code did a case insensitive comparison. The new code I'm proposing restricts the input to requiring lower case md5 values. As far as I can tell, this should be fine, but if we want to continue to support upper case md5 values in dat files, we could do 2 passes, 1 for lower case and 1 for upper case. If we want to continue to support mixed case md5 values, I could refactor it so we iterate the xml document without an index and use case insensitive comparison, but that would be slower than what I'm proposing. Another behavior change from generating lower case md5/sha1 values is that the file written to disk is now lowercase as well. If it's important to match upper case file writing and all the other concessions are acceptable, we could do an upper case pass in the file write.

There were a couple other changes that weren't causing crashes, but I think are good changes. I noticed in verify_findMD5Sum we were calling mxmlLoadString to load an xml object even though we already have one staticly loaded. I removed the call, and just use the static one. I also noticed we are loading the xml documents with MXML_TEXT_CALLBACK. While reading some mxml documentation here, I noticed this note:

Note: The most common programming error when using the Mini-XML library is to load an XML file using the MXML_TEXT_CALLBACK, which returns inline text as a series of whitespace-delimited words, instead of using the MXML_OPAQUE_CALLBACK which returns the inline text as a single string (including whitespace).

If I'm understanding it correctly, using MXML_OPAQUE_CALLBACK will give us a slight speed up because we have less characters to parse. I tried switching the wii and gc dat files to load with MXML_OPAQUE_CALLBACK and it seems to work fine.

Minor fixes when running on gamecube with a gc.dat file present but no wii.dat file: So to explain the issue further, I noticed when I was using CleanRip on my gamecube, the UI would say "redump.org DAT files not found" even though I had a gc.dat file on my SD card. The UI would look correct later while ripping the disk and it would use it properly.

When I looked into this issue, I discovered that the gamecube executable was trying to also load the wii.dat file, and verify_initialized only reports success if both files are present. Since the gamecube has no need to load the wii.dat file, I refactored the loading, downloading and verify_initialized setting to be #ifdefed around HW_RVL so that we don't waste any time on the wii.dat file. This also fixes the UI so the gamecube exe reports "redump.org DAT files found" so long as the gc.dat file is present.

Various minor fixes to get the project compiling warning free This part of the changeset is just to make the project build warning free. Most of these fixes are making the printf-like function types match the passed in types. There were some slightly more interesting warnings where a char array was used as input and output to sprintf. This is technically illegal since sprintf is using the restrict keyword. I doubt it was causing any real issues since all the callsites where repetitively writing the same values, so it's not important if we happened to read a cached off value, but the repetitive coping behavior isn't needed so I just refactored the callsites doing this to append with offset calculations rather than copying repetitively. It's a theoretical speedup, but no doubt unmeasurably so.

What I have tested (all with verification on):

What I have not tested:

From my testing, I'm pretty certain the crash on disk verify is fixed and the gamecube UI reporting no dat files when there is a gc.dat file is fixed. I also think this code speeds up the disk verify process a lot. I think most of the cost when it was working was the cost of the realloc calls inside mxmlIndexNew. It may be worth considering removing the prompt whether you want to do the checksum verification, now that it's faster.

ac90b671 commented 4 years ago

As an update, I did some extended testing that all checks out. I tested:

They all ripped successfully with no crashes and passed verification. I double checked iso checksums on PC too.

ac90b671 commented 4 years ago

I went ahead and tested ripping a Wii game on a Wii U too. It works fine. I updated the original post tests too.

Bombastisch commented 4 years ago

Does this fix the bug that CleanRip only reads the wii.dat and gc.dat from where it will dump to? It always downloads the files again everytime I use another memory stick. Maybe it can read it from where CleanRip gets loaded from (SD)?

ac90b671 commented 4 years ago

This pull request doesn't address that issue. It sounds like it'd be fairly easy to implement though.