putnam / binmerge

Tool to merge multiple bin/cue tracks into one. Great for redump.
GNU General Public License v2.0

Request : handle pregap #23

Open KanedaFr opened 2 months ago

KanedaFr commented 2 months ago

Actually, a lot of the available images include the gap between two tracks as a sub-track. This is mainly there so that hidden audio tracks can be ripped, but most of the time it is just 150 empty frames (2 s) or more, so at least 345 KB per track used for nothing.

To reduce the binary size, would it be possible to add an option to remove these zeros, defined by

INDEX 0 00:00:00
INDEX 1 00:02:00

and replace them with

PREGAP 00:02:00
INDEX 1 00:00:00

The merged binary would be at least 345 KB * (nbTracks - 1) smaller.
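The arithmetic behind that figure can be sketched as follows. It assumes raw 2352-byte sectors at 75 frames per second (the usual layout of a bin file) and a fixed 2-second gap before every track except the first. The function names are illustrative and are not part of binmerge.

```python
SECTOR_BYTES = 2352       # raw CD sector size in a .bin file
FRAMES_PER_SECOND = 75    # CD frames (sectors) per second


def pregap_bytes(seconds=2):
    """Bytes occupied by a gap of the given length stored as real sectors."""
    return seconds * FRAMES_PER_SECOND * SECTOR_BYTES


def minimum_savings(track_count, seconds=2):
    """Lower bound on bytes saved if every track but the first drops its gap."""
    return max(track_count - 1, 0) * pregap_bytes(seconds)


print(pregap_bytes())        # 352800 bytes, about 345 KB
print(minimum_savings(35))   # 11995200 bytes for a 35-track disc
```

This is only a lower bound under the stated assumption; discs with longer gaps would save more, and discs whose gaps hold real audio would save nothing.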

Of course, keep it as an option, because if it is used wrongly, hidden audio tracks could be removed. So perhaps check that the gap really is filled with zeros before proceeding?

Similarly, in binsplit, the option could control whether or not the empty frames are added back to each track's binary.
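A minimal sketch of such a safety check, assuming raw 2352-byte sectors and a gap that starts at a known frame offset in the bin file. `gap_is_silent` is a hypothetical helper, not an existing binmerge function.

```python
SECTOR_BYTES = 2352  # raw CD sector size in a .bin file


def gap_is_silent(bin_path, start_frame, frame_count):
    """Return True only if every byte of the given frame range is zero.

    A short read (file ends inside the range) is treated as "not silent",
    so the caller never drops data it could not fully inspect.
    """
    with open(bin_path, "rb") as f:
        f.seek(start_frame * SECTOR_BYTES)
        for _ in range(frame_count):
            sector = f.read(SECTOR_BYTES)
            if len(sector) != SECTOR_BYTES:
                return False
            if sector.count(0) != SECTOR_BYTES:
                return False
    return True
```

For a per-track file with a 2-second gap, `gap_is_silent(path, 0, 150)` would have to return True before the gap could be replaced.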

Thanks ;)

putnam commented 2 months ago

I like this idea. One problem is that with binsplit you don't know whether the original had a pregap defined or whether it used hidden audio tracks. You would need to mark up the generated cuesheet in a special way to be able to reverse it. Is it worth this extra special-casing to save the space? What is the most space lost to hidden audio tracks? Can you share some examples with redump links?

KanedaFr commented 2 months ago

If index 0 is used for hidden audio, there is no way to reverse it. PREGAP means XX:XX:XX frames of silence, which is similar to a sub-track filled with zeros. If you replace a sub-track 0 with PREGAP, whatever was on the sub-track is lost. That's why I also suggested you could make it safe by testing whether the data to remove is zero-filled or not, perhaps.

Hydrogenaudio has an example of an audio CD cuesheet with hidden tracks (see the part "A single-file cue sheet with a TRACK 01 INDEX 00 hidden track").

As for redump images, look at any of the Mega CD / Sega CD dumps...
I'm currently 'playing' with the Urusei demo and the WonderMega collection, the latter using 35 tracks! But, for now, I haven't found any hidden tracks.
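The cuesheet side of that replacement could look roughly like the sketch below. It assumes each track lives in its own file with index 0 at 00:00:00 (as in the example in the first comment), and it only rewrites the text; removing the matching bytes from the binary is a separate step. `index0_to_pregap` and the other names are hypothetical.

```python
import re

FRAMES_PER_SECOND = 75
INDEX_RE = re.compile(r"^(\s*)INDEX\s+(\d+)\s+(\d+):(\d+):(\d+)\s*$")


def msf_to_frames(minutes, seconds, frames):
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total):
    minutes, rest = divmod(total, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


def index0_to_pregap(track_lines):
    """Rewrite one track's lines: drop index 0, add PREGAP, shift later indexes."""
    out, gap_start, gap_len = [], None, 0
    for line in track_lines:
        m = INDEX_RE.match(line)
        if m is None:
            out.append(line)          # TRACK, FLAGS, REM, ... pass through
            continue
        indent, number = m.group(1), int(m.group(2))
        pos = msf_to_frames(int(m.group(3)), int(m.group(4)), int(m.group(5)))
        if number == 0:
            gap_start = pos           # remember where the gap began
            continue
        if number == 1 and gap_start is not None:
            gap_len = pos - gap_start
            out.append(f"{indent}PREGAP {frames_to_msf(gap_len)}")
        out.append(f"{indent}INDEX {number:02d} {frames_to_msf(pos - gap_len)}")
    return out
```

Tracks without an index 0 pass through with their positions unchanged, so the function can be applied to every track.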

putnam commented 2 months ago

Right, I know that the data would be lost if you use PREGAP to replace the track in the merged cuesheet, and I get that you're saying you'd only do this to fully zeroed out tracks.

What I'm trying to think through: let's say you have a source dump, made up of multiple files, and maybe some of those are 2-second gap tracks. But it also defines PREGAP in the cue sheet somewhere one or more times. I'm not sure if this is something that happens but it's possible, right?

So if that happens, and you remove some gap tracks and replace them with PREGAP, now it is impossible to know during the binsplit scenario which PREGAP lines were present in the original image and which ones were inserted as replacements for spurious tracks.
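One way to keep the two kinds of PREGAP apart would be to tag the generated ones with a cuesheet comment. The sketch below uses a `REM` line for this; the marker text is invented for illustration, binmerge does not currently emit anything like it, and whether every cuesheet consumer tolerates a `REM` line inside a track block is an assumption that would need testing.

```python
# Hypothetical marker; not something binmerge writes today.
MARKER = "REM GENERATED-PREGAP"


def tag_generated_pregap(pregap_line):
    """Return the marker line (same indentation) followed by the PREGAP line."""
    indent = pregap_line[: len(pregap_line) - len(pregap_line.lstrip())]
    return [f"{indent}{MARKER}", pregap_line]


def find_generated_pregaps(cue_lines):
    """Indexes of PREGAP lines that are immediately preceded by the marker."""
    return [
        i
        for i in range(1, len(cue_lines))
        if cue_lines[i].strip().startswith("PREGAP ")
        and cue_lines[i - 1].strip() == MARKER
    ]
```

On the split side, only the PREGAP lines returned by `find_generated_pregaps` would be expanded back into zero-filled frames; any other PREGAP would be left as it was in the original image.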

I think probably the most important question before continuing here is exactly how much space is saved in the max and average use cases. It does add complexity. If it's a pretty significant space waste then it might be good to add as an optional flag.

cgarz commented 2 months ago

Is it always just null bytes? If so, wouldn't even basic transparent filesystem compression dramatically reduce their actual space usage on a filesystem anyway?

For example, it's common for Wii games to have their update partitions removed by replacing them with null bytes to save space. Even though the disc image size remains the same, they shrink dramatically when even lightly compressed.
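As a rough illustration of that point, the sketch below compresses one zero-filled 2-second gap with Python's `zlib` at its default level. It assumes 2352-byte sectors and says nothing about how a particular filesystem's transparent compression behaves, which works on its own block sizes and algorithms.

```python
import zlib

SECTOR_BYTES = 2352                  # raw CD sector size
gap = bytes(150 * SECTOR_BYTES)      # one 2-second gap, all zeros

packed = zlib.compress(gap)
ratio = len(gap) / len(packed)

print(f"{len(gap)} bytes -> {len(packed)} bytes (about {ratio:.0f}x smaller)")
```

The compressed gap comes out far smaller than the original 352800 bytes, which is why zero-filled regions cost little on a compressed volume or inside a compressed archive.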

KanedaFr commented 2 months ago

How much space?

Can I remove a non-gap track by mistake?

Is it always just null bytes?

putnam commented 2 months ago

OK, you said:

at least 345 KB per track used for nothing

Implying that it could be multiple tracks that could be replaced with PREGAP.

I asked twice now what the maximum amount saved would be, and the average. If it is 345KB for a 400-600MB disc image that is not a useful savings. If you said, hey, 50% of games in the Saturn library will save 10MB or more, then it is compelling. Saving a few hundred KB or even a few MB here and there in 2024 when you can store entire systems' libraries on a cheap USB stick for ODEs is just not really a good use of resources, especially when we have to account for the reversal via binsplit.

I'm not opposed to it, I just want to know the impact.