superg / redumper

Low level CD dumper utility
GNU General Public License v3.0
216 stars 19 forks source link

Track 00 not written on split after large shift from lead-out #133

Closed hcs64 closed 3 months ago

hcs64 commented 6 months ago

Audio CD split issue

redumper v2024.01.08 build_311

non-zero  TOC sample range: [   -88200 .. +112063980]
non-zero data sample range: [    +7003 .. +112092421]
Universal Hash (SHA-1): 0551cdfb9f5cf8f4cab9ed37665b3040900aca70

moving data out of lead-out (difference: +28441)
disc write offset: +28441

It's moving way more data out of lead-out than there's zero data to shift out at the start, but it's not producing a corresponding Track 00. Track 01 is audibly cut off.

Full log attached CCD 10493.orig.log, let me know if more data would be helpful.

Doesn't seem to be a recent regression, I tried splitting with redumper back to v2023.07.19 build_195. I'm pretty sure I've seen Track 00 written in cases like this before (positive non-zero data at start but shifting out of lead-out moves data into lead-in), but maybe I'm misremembering.

superg commented 6 months ago

Huh, this is interesting, will definitely check that out, log seems normal to me.

hcs64 commented 5 months ago

Here's another disc with an issue that seems related:

non-zero  TOC sample range: [   -88200 .. +32205936]
non-zero data sample range: [   -53327 .. +32214875]
Universal Hash (SHA-1): fe3c352fbf2a684b2aa85579b9802326b368d2a3

Perfect Audio Offset (silence level: 6): [-53308 .. -53250]
moving data out of lead-out (difference: +8939)
disc write offset: +8939

Track 00 is output, but it is cut off. If I use --force-offset=0 it outputs both Track 00 and Track AA and Track 00 correctly starts with some zero data. Attached is a log showing the default and forced split (edit2: more complete log): Glodia1.log

With --force-offset=0, Track 00 begins with 8628 zero samples (size="214032" crc="a47cfc1a"). With the default offset +8939, Track 00 skips 8939 samples (0x8BAC bytes), so 311 non-zero samples are lost (size="214032" crc="b24a554f"). This also happens if I explicitly use --force-offset=+8939, fwiw.

Deterous commented 3 months ago

@superg We think this issue may be caused by the leadin_start calculation (when checking for data in the pregap) not accounting for the offset, what are your thoughts? I may have this quite wrong, but the changes suggested are: https://github.com/Deterous/redumper/commit/f0ede2bf5164b130d8070392b4c1dcd97c7ac4c7

i.e. turning https://github.com/superg/redumper/blob/f52fd2e5617e27f281c2ae23f8e8cd14dc49f2a7/cd/split.ixx#L1290 into something like

int32_t leadin_start = i ? toc.sessions[i - 1].tracks.back().lba_end : sample_to_lba(nonzero_data_range.first, offset_manager->getOffset(sample_to_lba(nonzero_data_range.first)));
hcs64 commented 3 months ago

I tried a build from Deterous's suggestion, the first disc (CCD 10493) now produces a Track 00 with the missing data:

warning: lead-in contains non-zero data (session: 1, sectors: 37/37)

and the second disc (Glodia1) now produces a Track 00 with all the data (was only 91 sectors before):

warning: lead-in contains non-zero data (session: 1, sectors: 106/106)
Deterous commented 3 months ago

Currently redumper prefers to move data out of the lead-out and determine write offset based on that. If there's more than one sector of data in the lead-out though, is it still okay to surmise that it is the write offset? In some scenarios (especially with data in the leadin too?), would it not make sense to just fallback to a write offset of 0? A similar scenario appears with #134, a write offset of +28648. Not sure if this is a technical or philosophical question. Would like your opinion.

superg commented 3 months ago

@Deterous The code change looks ok to me, I missed updating this line after transitioning to OffsetManager. Please create a PR and I will approve it. Eventually the split code will be rewritten to simplify everything and remove legacy things, I just don't have time right now.

Currently redumper prefers to move data out of the lead-out and determine write offset based on that. If there's more than one sector of data in the lead-out though, is it still okay to surmise that it is the write offset? In some scenarios (especially with data in the leadin too?), would it not make sense to just fallback to a write offset of 0? A similar scenario appears with https://github.com/superg/redumper/issues/134, a write offset of +28648.

The assumption is that disc write offset can go +- 2 seconds which is 150 sectors which is 88200 samples. We do have discs with huge write offset that is not an outcome of the shift from leadin/leadout. I think there was a way to search/sort by these in the DB. The 2 seconds limit is also suggested by the 2 seconds CD pregap size which accounts for the same problem.

The primary reason for moving data out of leadout is not the offset detection. Standard says that there is no data in the leadout but at the same time it doesn't say the same about leadin. The logic is that it's more unlikely to have data in the leadout if shifting it out will still keep data in [pregap .. leadout) range. Hope that explains it.

superg commented 3 months ago

And to clarify it further, Track A will only be created if non zero data span is already larger than [pregap .. leadout) range e.g. it already doesn't comply with the standard.

Deterous commented 3 months ago

Thank you for the explanation. I will check the changes and test them with some discs, then send the PR when I am happy with it.

bikerspade commented 3 months ago

https://github.com/superg/redumper/issues/134 is still unresolved with this latest fix. All of the track splits are clearly incorrect using the write offset it detects (and moving the beginning part of track 01 to track 00), but not when forcing the offset to 0. Warren G - Regulate... G Funk Era (USA).log

Deterous commented 3 months ago

In terms of the redump database, the largest positive offsets I can see for discs with data tracks are +5292 (single track PC, http://redump.org/disc/82596), and +6722 (mixed mode Mega CD, http://redump.org/disc/1838). There are also a couple of questionable CD-i titles that probably need rechecking, with offsets very near 88200 (e.g. http://redump.org/disc/79065).

There are only a handful of Audio CDs in redump with positive write offsets in the tens of thousands like the discs above. I don't think their "true" write offset is actually that large. After splitting hcs's disc, the first vocal is "cut-off" from the beginning of Track 1. While this might be the "correct" way to split a disc according to the red book standard (no data in leadout), it results in a split that does not pass the sniff test, sounding weird when played and not acting like the CD would if played in a CD player.

I'm still happy to add the changes to redumper as above with a PR, this discussion is bleeding into redump.org procedure rather than how redumper should act.

superg commented 3 months ago

There are also a couple of questionable CD-i titles that probably need rechecking, with offsets very near 88200 (e.g. http://redump.org/disc/79065).

Those are the offsets I was referring to. CDI was one of the early masterings and there are a lot of things like this. I'm pretty sure the offset is correct.

There are only a handful of Audio CDs in redump with positive write offsets in the tens of thousands like the discs above. I don't think their "true" write offset is actually that large.

This is not true audio offset, let's not call it true because it never was. Look at it from the other perspective: this is the offset that allows us not to lose data and be as close to the standard as we can.

Regarding this specific case, my initial idea was not to create Track 0 but have it prepended to Track 1. Pre-gap seem to always belong to Track 1 in the cases that I saw. But this idea was opposed by FB and Jackal I think. I suggest to check out my thread about audio offset at forum, I believe everything is explained there.

If biker says that the fix doesn't solve his case, it might be something else. I'm too distracted ATM to look into the split code right now, feel free to experiment with it.

hcs64 commented 3 months ago

Thanks, build 379 (with PR #168) looks good, I used it in a redump fix for my Glodia Vol.1 submission.