whipper-team / whipper

Python CD-DA ripper preferring accuracy over speed
GNU General Public License v3.0

CTDB as an alternative to AccurateRip #15

Open 45054 opened 8 years ago

45054 commented 8 years ago

It would be nice to be able to use http://db.cuetools.net/ as an open source/public domain alternative to AccurateRip.

JoeLametta commented 8 years ago

This is an interesting idea but it should be evaluated only after we have dropped GSt 0.10 in favour of GSt 1.x (#2) and made whipper query both the AccurateRip v1 and the AccurateRip v2 databases at the same time (#18).

RecursiveForest commented 8 years ago

Just to correct an alarming comment: we are never going to support gstreamer after we drop 0.10. See https://github.com/JoeLametta/whipper/issues/29 .

Regarding the actual enhancement, I am in favour of at least optionally supporting this.

JoeLametta commented 8 years ago

> Just to correct an alarming comment: we are never going to support gstreamer after we drop 0.10. See #29 .

Right, useful clarification. That was an old reply from January and a lot has changed in the meantime.

parkerlreed commented 6 years ago

Any news on this? It would be great to have alongside the AccurateRip v2 support that was added back in September.

ghost commented 6 years ago

Would be indeed nice.

JoeLametta commented 6 years ago

@gchudov I'd like to ask you if the only existing implementation of a CTDB client is the one included in CUETools (written in C#).

Thanks, Joe

gchudov commented 6 years ago

As far as I know, yes. The network protocol part is quite trivial, but the parity calculations in https://github.com/gchudov/cuetools.net/tree/master/CUETools.Parity are a bit more work. The database also has track-based CRC32 checksums, which are a lot easier to implement in any language, so a stripped-down implementation that only supports verification using those and doesn't support database submission can be quite simple.

JoeLametta commented 6 years ago

@gchudov Understood, thanks for the explanation.

JoeLametta commented 5 years ago

@gchudov So, to adopt the stripped-down (verify-only) implementation in whipper, I need to:

  1. Compute the CTDBID of the disc
  2. Perform an HTTP request like this: http://db.cuetools.net/lookup2.php?version=3&ctdb=1&metadata=none&fuzzy=0&toc=TO_ID_HERE
  3. Check that the response has an HTTP 200 status code
  4. Parse the XML
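
Steps 2 to 4 could be sketched roughly as follows. This is a minimal sketch, not whipper or CUETools code: the query parameters are copied from the URL above, while the response layout (`entry` elements carrying `confidence` and a whitespace-separated hexadecimal `trackcrcs` attribute) is an assumption that should be checked against a real response. The function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

LOOKUP_URL = "http://db.cuetools.net/lookup2.php"


def build_lookup_url(toc):
    """Build the lookup URL with the parameters quoted in the list above."""
    params = {"version": 3, "ctdb": 1, "metadata": "none", "fuzzy": 0, "toc": toc}
    return LOOKUP_URL + "?" + urlencode(params)


def fetch_lookup(toc, timeout=10):
    """Perform the request and return the raw body if the status is 200."""
    # urlopen already raises HTTPError for 4xx/5xx; the explicit check
    # rejects any other non-200 status as well.
    with urllib.request.urlopen(build_lookup_url(toc), timeout=timeout) as resp:
        if resp.status != 200:
            raise RuntimeError("unexpected HTTP status %d" % resp.status)
        return resp.read()


def parse_entries(xml_data):
    """Collect per-entry track CRCs from the response XML.

    ASSUMPTION: entries are <entry> elements with 'confidence' and
    'trackcrcs' attributes; XML namespaces are ignored.
    """
    entries = []
    for elem in ET.fromstring(xml_data).iter():
        if elem.tag.rsplit("}", 1)[-1] != "entry":
            continue
        crcs = [int(value, 16) for value in elem.get("trackcrcs", "").split()]
        entries.append({
            "confidence": int(elem.get("confidence", "0")),
            "trackcrcs": crcs,
        })
    return entries
```

A verify-only client would then compare the CRCs of its own rip against each entry's `trackcrcs` list.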

Then I'm undecided how to continue...

BTW: How do you compute the values in the trackcrcs attribute?

Thanks, Joe

JuniorJPDJ commented 3 years ago

> @gchudov I'd like to ask you if the only existing implementation of a CTDB client is the one included in CUETools (written in C#).

No, EAC also supports CTDB and by default it uploads data to it, which would also be cool to do.

> Compute the CTDBID of the disc

You actually don't have to. You can just look up the DB using the TOC:
https://github.com/JuniorJPDJ/py-cuetools/blob/9aef262ea690b34b291484f4ab23cd5e38b7ac8e/cue_cdtoc2mbtoc.py#L150
https://github.com/JuniorJPDJ/py-cuetools/blob/9aef262ea690b34b291484f4ab23cd5e38b7ac8e/cue_cdtoc2mbtoc.py#L119

I'm interested in implementing those parity checks in Python. If I do, maybe you guys could just copy-paste parts of my code into yours ;D

JoeLametta commented 3 years ago

> No, EAC also supports CTDB and by default it uploads data to it, which would also be cool to do.

Well, the plugin EAC uses to interact with the CUETools DB is from @gchudov and written in C# too.

CUETools.AccurateRip.dll:    PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
CUETools.CDImage.dll:        PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
CUETools.CTDB.EACPlugin.dll: PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
CUETools.CTDB.dll:           PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
CUETools.Codecs.dll:         PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
CUETools.Parity.dll:         PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
...
TrID Generic .NET DLL/Assembly (89.5%)
PEiD packer .NET executable 

> I'm interested in implementing those parity checks in python, maybe if I do this you guys could just copy-paste parts of my code into yours ;D

Cool, thanks! Please note that your code currently doesn't specify a license, so it is implicitly all rights reserved. To be legitimately included in whipper, third-party code must carry a license compatible with GPLv3.

JuniorJPDJ commented 3 years ago

> Well, the plugin EAC uses to interact with the CUETools DB is from @gchudov and written in C# too.

True.

@JoeLametta yup, I also thought about the missing license. For now, treat it as if it were LGPL-licensed; I'll add a license later.

gchudov commented 3 years ago

@JoeLametta I think a per-track summary is more useful than one for the entire rip, and it's not that much more work: the general idea is the same. In both cases you have to compare CRC32 values, and the tricky part in both cases is accounting for pressing offsets (something that AccurateRip V2 doesn't support). The entire-rip CRC32 actually excludes the pre-gap (data before the first track), the first 10 sectors (5880 samples, similar to AccurateRip), and the last 10 to 19.999 sectors (depending on disc length: the total length of data covered by the CRC32 has to be a multiple of 10 sectors). The track CRC for the first track excludes the first 10 sectors, and the track CRC for the last track excludes the last 10 to 19.999 sectors. The track CRC for intermediate tracks covers the whole track.
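
The whole-disc range described above can be worked out with a little arithmetic. The following is only a sketch of one reading of that description: it assumes the sample count starts after the pre-gap, that the covered region begins exactly 10 sectors in, and that the CRC32 is the standard one provided by `zlib.crc32`. None of this has been checked against CUETools or the database, and the function names are hypothetical.

```python
import zlib

SAMPLES_PER_SECTOR = 588           # CD-DA stereo samples per sector
BYTES_PER_SAMPLE = 4               # 16-bit samples, two channels
STRIDE = 10 * SAMPLES_PER_SECTOR   # 10 sectors = 5880 samples


def disc_crc_range(total_samples):
    """Return (start, end) sample indices covered by the whole-disc CRC.

    total_samples counts the audio after the pre-gap. The first 10
    sectors are skipped and the covered length is rounded down to a
    multiple of 10 sectors, which leaves 10 to just under 20 sectors
    uncovered at the end.
    """
    if total_samples < 2 * STRIDE:
        raise ValueError("disc too short for this scheme")
    covered = STRIDE * ((total_samples - 2 * STRIDE) // STRIDE)
    return STRIDE, STRIDE + covered


def disc_crc32(pcm):
    """CRC32 over the covered range of raw PCM bytes.

    ASSUMPTION: the database uses the same CRC-32 as zlib.crc32.
    """
    start, end = disc_crc_range(len(pcm) // BYTES_PER_SAMPLE)
    return zlib.crc32(pcm[start * BYTES_PER_SAMPLE:end * BYTES_PER_SAMPLE])
```

For example, a 105-sector disc gives the range (5880, 52920), so the last 15 sectors are excluded, which falls inside the 10 to 19.999 sector window.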

The trick to matching CRCs despite different pressing/drive offsets (e.g. when a small number of samples is cut from the beginning of the disc, the same number of extra samples appears at the end, and all the content is shifted accordingly) is to keep a running CRC while processing the disc and to remember the first and the last 10*588 values at the beginning and end of each track. It's then possible to compute the offset CRC instantly by doing some math on those values (Crc32.Combine). In a loop over a range of possible offsets you can recalculate the offset CRCs for all tracks, compare them to the database, and present the information for the offset that has the most matches.
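
To illustrate the kind of math involved, here is a pure-Python port of the `crc32_combine` routine from zlib, which derives the CRC of two concatenated blocks from their individual CRCs without rereading the data. That Crc32.Combine in CUETools behaves the same way is an assumption, and this sketch covers only the combine primitive, not the full offset search described above.

```python
import zlib


def _gf2_matrix_times(mat, vec):
    """Multiply a 32x32 GF(2) matrix (list of column ints) by a vector."""
    total = 0
    index = 0
    while vec:
        if vec & 1:
            total ^= mat[index]
        vec >>= 1
        index += 1
    return total


def _gf2_matrix_square(mat):
    """Return the matrix multiplied by itself over GF(2)."""
    return [_gf2_matrix_times(mat, mat[n]) for n in range(32)]


def crc32_combine(crc1, crc2, len2):
    """CRC32 of A+B given crc32(A), crc32(B) and len(B) in bytes."""
    if len2 <= 0:
        return crc1
    # Operator that advances a CRC by one zero bit.
    odd = [0xEDB88320] + [1 << n for n in range(31)]
    even = _gf2_matrix_square(odd)   # two zero bits
    odd = _gf2_matrix_square(even)   # four zero bits
    # Apply len2 zero bytes to crc1 by repeated squaring; the first
    # squaring below yields the operator for one zero byte.
    while True:
        even = _gf2_matrix_square(odd)
        if len2 & 1:
            crc1 = _gf2_matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = _gf2_matrix_square(even)
        if len2 & 1:
            crc1 = _gf2_matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break
    return crc1 ^ crc2


head, tail = b"first block of samples", b"second block"
assert crc32_combine(zlib.crc32(head), zlib.crc32(tail), len(tail)) == zlib.crc32(head + tail)
```

With such a primitive, the CRC of a long middle section only has to be computed once; an offset search would then combine it with the CRCs of the short remembered head and tail regions for each candidate offset.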

JuniorJPDJ commented 3 years ago

@gchudov and what about parity? What algorithm does it use? Guessing this from the code may be hard, and I'm interested in it and would like to implement it. Ideally I could use existing libraries and do only the CTDB-specific parts myself, but for that I would need to know what kind of algorithms you are using.

bmwalters commented 2 years ago

WIP Python port of CTDB CRC generation: https://github.com/bmwalters/python-cuetoolsdb/blob/main/cuetoolsdb/verify_old.py

It passes the simple upstream unit test but that's all I've tested so far.

WIP CTDB HTTP API docs: https://github.com/bmwalters/python-cuetoolsdb/blob/main/api.org