whipper-team / whipper

Python CD-DA ripper preferring accuracy over speed
GNU General Public License v3.0

possible (new) method of doing audio cd quality error rate analysis? #534

Open walterav1984 opened 3 years ago

walterav1984 commented 3 years ago

I stumbled upon a Twitter post by Hector Martin (@marcan): https://twitter.com/marcan42/status/1381544164832108545

... There's no way to get raw error counts (pre ECC) off of most CD-ROM drives, but the subchannel has no ECC, so that works as a proxy for quality...

He uses the subchannel data, which has no ECC, to analyze audio CD error rates by comparing it with the normal ECC-corrected data: https://github.com/marcan/cd-analysis

I'm not sure whether this insight is new, or even useful for whipper (CD-R/CD). The only reference I found is in the libcdio 1.0 changelog, which mentions dropping subchannel support.

Feel free to use/close this issue.

marcan commented 3 years ago

It works well, for what it's worth. I still need to run the proper numbers, but here are 93 CD-Rs from 10 different brands, visualized. [image: cdrs2]

That said, this is mostly useful to figure out how close to corrupting audio data a CD that otherwise rips fine is. I think there are simpler ways of finding out if the audio data itself is borked (and a good subchannel is not a hard guarantee that the audio part isn't bad).

Relevant: at least on my drive, sometimes the subcode "slips" and ends up offset vs the audio data. This seems to usually be a -3 sector offset, sometimes more in rough spots, and not consistent rip to rip. I currently compensate for this once I get a valid subcode CRC with a timestamp mismatch, and backtrack in case any immediately previous subcode sectors were bad to compensate (otherwise the skewed subcode would be recognized as lots of errors for those sectors).
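A minimal sketch of the slip compensation described above, not the actual code from cd-analysis. It assumes each read yields either the absolute sector number decoded from a CRC-valid subcode or None when the CRC failed; the function name and data layout are hypothetical.

```python
from typing import List, Optional


def assign_slip_offsets(decoded: List[Optional[int]]) -> List[int]:
    """Return the assumed subcode offset for every read position.

    decoded[i] is the absolute sector number found in the subcode read at
    position i, or None when that subcode failed its CRC check.
    """
    offsets = [0] * len(decoded)
    current = 0  # offset currently believed to be in effect
    for i, lba in enumerate(decoded):
        if lba is not None:
            observed = lba - i  # e.g. -3 when the subcode lags by 3 sectors
            if observed != current:
                # A valid CRC with a timestamp mismatch: the slip probably
                # started somewhere in the run of bad sectors just before it,
                # so backtrack and give those sectors the new offset too.
                j = i - 1
                while j >= 0 and decoded[j] is None:
                    offsets[j] = observed
                    j -= 1
                current = observed
        offsets[i] = current
    return offsets


# Sectors 2 and 3 failed their CRC; sector 4 decodes as sector 1 (a -3 slip).
slip_example = assign_slip_offsets([0, 1, None, None, 1, 2, 3])
```

With the offsets assigned this way, a bad sector can be compared against the subcode expected at its shifted position, so a skewed but otherwise clean subcode is not counted as a burst of errors.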

Note that this method relies on knowing in advance what the subcode is supposed to be. I can do this for my CDs because I'm burning them with no fancy anything, the subcode is straight mode 1 timestamps and that's it. If you don't know what the subcode is at all, the best you can do is flag whole sectors as good/bad (which is still pretty useful info for low-level errors). Alternatively, you could add a heuristic that attempts to derive subcode info (e.g. you can probably interpolate a broken sector between two good ones if the track/index does not change, unless other modes are in use; more heuristics could help recover those).
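The interpolation heuristic mentioned above could look roughly like this. It is a sketch under assumptions: the QPos record and its field names are made up for illustration, only mode 1 position data is considered, and a gap is filled only when the good sectors on both sides agree on track and index and their counters differ by exactly the gap length.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QPos:
    track: int
    index: int
    rel_lba: int  # track-relative sector number
    abs_lba: int  # absolute sector number


def interpolate_q(entries: List[Optional[QPos]]) -> List[Optional[QPos]]:
    """Fill runs of None that sit between two consistent good entries."""
    out = list(entries)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        gap = i - start
        prev = out[start - 1] if start > 0 else None
        nxt = out[i] if i < n else None
        if prev is None or nxt is None:
            continue  # gap touches the start or end of the data: leave it
        consistent = (
            prev.track == nxt.track
            and prev.index == nxt.index
            and nxt.abs_lba - prev.abs_lba == gap + 1
            # Relative time counts down in a pregap (index 0), so such gaps
            # fail this check and are deliberately left unfilled.
            and nxt.rel_lba - prev.rel_lba == gap + 1
        )
        if consistent:
            for k in range(1, gap + 1):
                out[start + k - 1] = QPos(
                    prev.track, prev.index, prev.rel_lba + k, prev.abs_lba + k
                )
    return out


filled = interpolate_q([QPos(1, 1, 10, 160), None, None, QPos(1, 1, 13, 163)])
unfilled = interpolate_q([QPos(1, 1, 10, 160), None, QPos(2, 1, 0, 162)])
```

Gaps that span a track or index change stay unknown here; those sectors can still be flagged as good or bad from their CRC alone.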

JoeLametta commented 3 years ago

Hi, currently whipper prints messages like these:

Track 1 finished, found 122 Q sub-channels with CRC errors
Track 2 finished, found 65 Q sub-channels with CRC errors
Track 3 finished, found 51 Q sub-channels with CRC errors
Track 4 finished, found 257 Q sub-channels with CRC errors
Track 5 finished, found 78 Q sub-channels with CRC errors

The information comes from cdrdao and can be used as a rough estimate of the CD's "health". I'm not sure we can use a similar technique for anything more than informative messages...
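For reference, a Q sub-channel CRC check of the kind these counts reflect can be sketched as follows. This is not code from whipper or cdrdao. It assumes the usual 12-byte Q frame layout: 10 payload bytes followed by a 16-bit CRC (polynomial 0x1021, initial value 0) stored inverted. The function names are illustrative.

```python
import binascii
from typing import Iterable


def q_crc(payload: bytes) -> int:
    """CRC-16 over the 10 payload bytes, inverted (assumed on-disc form)."""
    return binascii.crc_hqx(payload, 0) ^ 0xFFFF


def q_frame_ok(frame: bytes) -> bool:
    """True if a 12-byte Q frame carries a CRC matching its payload."""
    if len(frame) != 12:
        return False
    stored = int.from_bytes(frame[10:12], "big")
    return q_crc(frame[:10]) == stored


def count_bad_q_frames(frames: Iterable[bytes]) -> int:
    """Count frames whose CRC check fails, one count per sector."""
    return sum(1 for frame in frames if not q_frame_ok(frame))


# Build a synthetic frame with a valid CRC, then flip one bit in a copy.
payload = bytes([0x01, 0x01, 0x01, 0x00, 0x00, 0x10, 0x00, 0x00, 0x02, 0x10])
good = payload + q_crc(payload).to_bytes(2, "big")
bad = bytes([good[0] ^ 0x80]) + good[1:]
```

A check like this only says whether a whole sector's Q data is good or bad; counting individual bit errors additionally requires knowing what the subcode should have been.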