thomasvs / morituri

For those about to RIP - a Unix CD ripper preferring accuracy over speed
GNU General Public License v3.0

AccurateRip v2 support #119

Open hcs64 opened 9 years ago

hcs64 commented 9 years ago

Many newer discs only have AccurateRip entries with v2 checksums, but morituri only computes the v1 checksums. Ideally it would compute both and count a match if either was in the DB.

As an example, "Distant Worlds III: more music from FINAL FANTASY" (http://www.accuraterip.com/accuraterip/a/a/b/dBAR-013-001dfbaa-012c1281-950f1f0d.bin) has only a v2 checksum in the DB.
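For readers unfamiliar with the database layout, here is a minimal sketch of how the example URL above appears to be assembled. The structure is inferred from that single URL; the function name and the interpretation of the three hexadecimal fields (two AccurateRip disc IDs followed by a CDDB-style disc ID) are assumptions, not something stated in this issue.

```python
def accuraterip_url(track_count, disc_id_1, disc_id_2, cddb_disc_id):
    """Build a dBAR URL from a track count and three integer disc IDs.

    Assumption: the three directory levels are the last, second-to-last
    and third-to-last hex digits of the first disc ID, as the example
    URL in this issue suggests.
    """
    id1 = "%08x" % disc_id_1
    return (
        "http://www.accuraterip.com/accuraterip/%s/%s/%s/"
        "dBAR-%03d-%s-%08x-%08x.bin"
        % (id1[-1], id1[-2], id1[-3], track_count, id1, disc_id_2, cddb_disc_id)
    )


# The values below are taken from the example URL quoted above.
url = accuraterip_url(13, 0x001DFBAA, 0x012C1281, 0x950F1F0D)
print(url)
```

With those inputs the sketch reproduces the URL given for "Distant Worlds III".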

MerlijnWajer commented 8 years ago

This tool (GPLv3) supports both versions. It also seems to be significantly faster than the code in morituri. Simple C binary: https://github.com/leo-bogert/accuraterip-checksum

MerlijnWajer commented 8 years ago

So I'm looking into integrating V2 in a way that lets us keep V1 as well. I'm currently looking into using the tool that I linked above for speed reasons: it computes most checksums in less than a second, including the flac decode:

flac -cd foo-tracknumber.flac | accuraterip-checksum --accuraterip-v1 /dev/stdin <tracknumber> <totaltracks>

I will try to also use the python version as submitted by JoeLametta in a way that morituri will still use both versions and submit a PR for that. Shouldn't be too hard once the logic is in place anyway.
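The shell pipeline quoted above could be driven from Python roughly as follows. This is only a sketch: the function names are hypothetical, the `--accuraterip-v2` flag is assumed by analogy with the `--accuraterip-v1` flag shown above, and the code assumes the tool prints a single hexadecimal checksum on stdout.

```python
import subprocess


def build_commands(flac_path, track_number, total_tracks, version=1):
    """Return the two argv lists for: flac -cd FILE | accuraterip-checksum ..."""
    if version not in (1, 2):
        raise ValueError("unsupported AccurateRip version: %r" % (version,))
    decode = ["flac", "-cd", flac_path]
    checksum = [
        "accuraterip-checksum",
        "--accuraterip-v%d" % version,  # the v2 spelling is an assumption
        "/dev/stdin",
        str(track_number),
        str(total_tracks),
    ]
    return decode, checksum


def checksum_track(flac_path, track_number, total_tracks, version=1):
    """Decode one track with flac and pipe it into accuraterip-checksum."""
    decode_cmd, checksum_cmd = build_commands(
        flac_path, track_number, total_tracks, version
    )
    decoder = subprocess.Popen(
        decode_cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
    )
    try:
        output = subprocess.check_output(checksum_cmd, stdin=decoder.stdout)
    finally:
        decoder.stdout.close()
        decoder.wait()
    # Assumption: the tool prints one hex checksum followed by a newline.
    return int(output.decode("ascii").strip(), 16)
```

Calling `checksum_track(path, n, total, version=1)` and then again with `version=2` would give both checksums for a track, at the cost of decoding the file twice.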

MerlijnWajer commented 8 years ago

I would like some feedback on how we should use both V1 and V2. There are several ways to do it.

One (lazy) way is to treat both of them as just a checksum, and then pick whatever one has more confidence, and write that to the log. Optionally we could record if we picked V1 or V2.

Another way would be to match both V1 and V2 separately, also adding extra variables to the trackresult classes, and also writing both of them to the log file. Basically, do what morituri does now, but just for both versions. My only problem with this approach is that if we ever get a V3 (there were some people who said V2 hardly fixed V1's problem), the code may get a bit ugly.

I'm fine with either approach, I just need to know what most people (and Thomas) would prefer.

Alternative approaches are also fine.
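To make the two options above concrete, here is a small sketch that keeps one record per checksum version, so a hypothetical V3 would only be one more record rather than another set of fields. All names (`VersionResult`, `best_result`, `log_lines`) and the "Checksum: ..., confidence: ..." log wording are made up for illustration; none of this is existing morituri code or log format.

```python
from collections import namedtuple

# confidence is None when the checksum was not found in the database.
VersionResult = namedtuple("VersionResult", "version checksum confidence")


def best_result(results):
    """The 'lazy' option: pick whichever matched version has most confidence."""
    matched = [r for r in results if r.confidence is not None]
    if not matched:
        return None
    return max(matched, key=lambda r: r.confidence)


def log_lines(results):
    """The 'separate' option: report every version in the log."""
    lines = []
    for r in sorted(results, key=lambda r: r.version):
        lines.append("AccurateRip v%d:" % r.version)
        if r.confidence is None:
            lines.append("  Track not present in AccurateRip database")
        else:
            lines.append(
                "  Checksum: %08X, confidence: %d" % (r.checksum, r.confidence)
            )
    return lines


results = [
    VersionResult(1, 0x1234ABCD, 3),
    VersionResult(2, 0x89EF0123, 12),
]
print(best_result(results).version)
print("\n".join(log_lines(results)))
```

Both options can sit on the same data: the log writer can print every record while the pass/fail decision uses only the best one.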

JoeLametta commented 8 years ago

If there are both V1 and V2 checksums, it's probably better to pick the V2 one. Of course the confidence level is important, but matching both checksums may be overkill for a ripper: there are already tools designed exactly for this purpose (like the one you linked, or CUETools).

MerlijnWajer commented 8 years ago

That, or the one with the highest confidence level could be picked. Why is it overkill to list both the V1 and the V2? Isn't that (more) descriptive of the confidence? Or just nice to have for possible later uploading?

MerlijnWajer commented 8 years ago

What I meant is that it may be nice to have it in the .log file, with accompanying confidence.

MerlijnWajer commented 8 years ago

By the way, as far as I could see there is no easy way to distinguish between V1 and V2 checksums in an accuraterip response - you have to match them to your (calculated) track V1 and V2 to figure out which version it is. So you would need to calculate both regardless.
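A sketch of what that matching could look like for a single track. The function name and the data shapes (a dict of computed checksums, a list of `(confidence, checksum)` pairs from the database) are assumptions for illustration only.

```python
def classify_entries(computed, db_entries):
    """Work out which database entries are V1 and which are V2.

    computed:   dict mapping version number -> locally computed checksum
    db_entries: list of (confidence, checksum) pairs for this track
    Returns a dict mapping version number -> best confidence, or None
    when no database entry equals that version's checksum.
    """
    # Note: if two versions happened to yield the same checksum, this
    # lookup would credit only one of them.
    by_checksum = {checksum: version for version, checksum in computed.items()}
    matches = {version: None for version in computed}
    for confidence, checksum in db_entries:
        version = by_checksum.get(checksum)
        if version is None:
            continue  # another pressing, or a rip that differs from ours
        if matches[version] is None or confidence > matches[version]:
            matches[version] = confidence
    return matches


computed = {1: 0xAAAA0001, 2: 0xBBBB0002}
db_entries = [(5, 0xBBBB0002), (2, 0xCCCC0003)]
matches = classify_entries(computed, db_entries)
print(matches)
print(any(c is not None for c in matches.values()))
```

Here the database holds only a V2 entry for our rip, so V1 stays unmatched while the track still counts as verified.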

JoeLametta commented 8 years ago

If the code doesn't get ugly while trying to fix this one, it's fine. The more useful information, the better.

Keep in mind that if morituri gets ported to GST-1.0 the logic for the checksum may be rewritten (for example I'm referring to this one: DanielChassot/morituri@8cbb68aeb36b9d89f6645f90a109556544f5f296).

MerlijnWajer commented 8 years ago

Well, I'll see if I can submit something in the next week or so. I just created a "FastAccurateRipCheckSumTask" (that can do both V1 and V2) in common/checksum.py and the code to interact with flac and the ARC tool in program/arc.py, and it works just fine. It does not support everything - like starting at a different sample, but I am not sure if that is ever used for ARC?

The main "problem" is that the current code was written mostly with no support for versions in mind. Suggested changes: make image/image.py/AccurateRipChecksumTask return both of the checksums (possibly as a tuple or namedtuple), and then have common/program.py/_verifyImageWithChecksums work with both those checksums. Then the main thing that needs to be decided is how to change result/{logger,result}.py (do we add separate fields for V1 and V2, or do we make the AR checksums a list?). Then a few changes need to be made to rip/offset.py to make rip offset find work, and then there's probably a decent patch ready for review. And of course, the ARC tasks in morituri need to support both versions (possibly via an argument).
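As a rough illustration of the namedtuple idea, the sketch below has the checksum step hand back both values and a verification step loop over them. `AccurateRipChecksums` and `verify_tracks` are hypothetical names, and the response layout (one list of `(confidence, checksum)` pairs per database response, indexed by track) is an assumption; this is not the actual AccurateRipChecksumTask or _verifyImageWithChecksums code.

```python
from collections import namedtuple

AccurateRipChecksums = namedtuple("AccurateRipChecksums", ["v1", "v2"])


def verify_tracks(track_checksums, responses):
    """Compare each track's V1 and V2 checksums against every response.

    track_checksums: list of AccurateRipChecksums, one per track
    responses:       list of responses; each response is a list with one
                     (confidence, checksum) pair per track
    Returns one dict per track mapping "v1"/"v2" to the highest matching
    confidence, or None if that version was not found.
    """
    results = []
    for index, sums in enumerate(track_checksums):
        found = {}
        for name in sums._fields:
            wanted = getattr(sums, name)
            hits = [
                response[index][0]
                for response in responses
                if response[index][1] == wanted
            ]
            found[name] = max(hits) if hits else None
        results.append(found)
    return results


tracks = [AccurateRipChecksums(0x11, 0x22), AccurateRipChecksums(0x33, 0x44)]
responses = [
    [(4, 0x22), (4, 0x44)],
    [(1, 0x11), (1, 0x99)],
]
print(verify_tracks(tracks, responses))
```

Iterating over `_fields` means the verification loop would not need to change if another version were added to the tuple later.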

JoeLametta commented 8 years ago

I don't know if this can be useful to you but this is a sample log report from the morituri-whatlogger logger I'm using as default for whipper (the morituri fork I'm working at):

Ripper: morituri 0.2.3.1
Ripped at: 2015-11-30T02:51:49Z
Drive: TSSTcorpCD/DVDW SH-S183L (revision SB01)
Defeat audio cache: Yes

Read offset correction: 6
Overread: No
Gap detection: cdrdao 1.2.3

Used output format: flac
GStreamer:
  Pipeline: flacenc name=tagger quality=8
  Version: 0.10.36
  Python version: 0.10.22
  Encoder plugin version: 0.10.31

TOC:
  1:
    Start: 0:00.00
    Length: 5:13.22
    Start sector: 0
    End sector: 23496

  2:
    Start: 5:13.22
    Length: 3:34.04
    Start sector: 23497
    End sector: 39550

  3:
    Start: 8:47.26
    Length: 3:33.01
    Start sector: 39551
    End sector: 55526

  4:
    Start: 12:20.27
    Length: 3:25.19
    Start sector: 55527
    End sector: 70920

  5:
    Start: 15:45.46
    Length: 4:13.28
    Start sector: 70921
    End sector: 89923

  6:
    Start: 19:58.74
    Length: 4:38.24
    Start sector: 89924
    End sector: 110797

  7:
    Start: 24:37.23
    Length: 3:17.08
    Start sector: 110798
    End sector: 125580

  8:
    Start: 27:54.31
    Length: 4:17.14
    Start sector: 125581
    End sector: 144869

  9:
    Start: 32:11.45
    Length: 3:09.31
    Start sector: 144870
    End sector: 159075

  10:
    Start: 35:21.01
    Length: 3:25.38
    Start sector: 159076
    End sector: 174488

Tracks:
  1:
    Filename: Alan Jackson - Angels and Alcohol/01. Alan Jackson - You Can Always Come Home.flac
    Peak level: 0.988007 %
    Extraction speed: 8.6 X
    Track quality: 100.0 %
    Test CRC: C77B2904
    Copy CRC: C77B2904
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  2:
    Filename: Alan Jackson - Angels and Alcohol/02. Alan Jackson - You Never Know.flac
    Peak level: 0.992462 %
    Extraction speed: 9.1 X
    Track quality: 100.0 %
    Test CRC: A065073B
    Copy CRC: A065073B
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  3:
    Filename: Alan Jackson - Angels and Alcohol/03. Alan Jackson - Angels and Alcohol.flac
    Peak level: 0.940979 %
    Extraction speed: 9.7 X
    Track quality: 100.0 %
    Test CRC: EFA74CE1
    Copy CRC: EFA74CE1
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  4:
    Filename: Alan Jackson - Angels and Alcohol/04. Alan Jackson - Gone Before You Met Me.flac
    Peak level: 1.000000 %
    Extraction speed: 8.4 X
    Track quality: 100.0 %
    Test CRC: FA5F67AC
    Copy CRC: FA5F67AC
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  5:
    Filename: Alan Jackson - Angels and Alcohol/05. Alan Jackson - The One You're Waiting On.flac
    Peak level: 0.993896 %
    Extraction speed: 11.1 X
    Track quality: 100.0 %
    Test CRC: C96CDFB8
    Copy CRC: C96CDFB8
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  6:
    Filename: Alan Jackson - Angels and Alcohol/06. Alan Jackson - Jim and Jack and Hank.flac
    Peak level: 0.993439 %
    Extraction speed: 11.6 X
    Track quality: 100.0 %
    Test CRC: 2401F2B6
    Copy CRC: 2401F2B6
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  7:
    Filename: Alan Jackson - Angels and Alcohol/07. Alan Jackson - I Leave a Light On.flac
    Peak level: 0.937286 %
    Extraction speed: 11.8 X
    Track quality: 100.0 %
    Test CRC: 3D9A658E
    Copy CRC: 3D9A658E
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  8:
    Filename: Alan Jackson - Angels and Alcohol/08. Alan Jackson - Flaws.flac
    Peak level: 0.999725 %
    Extraction speed: 12.3 X
    Track quality: 100.0 %
    Test CRC: 0116F116
    Copy CRC: 0116F116
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  9:
    Filename: Alan Jackson - Angels and Alcohol/09. Alan Jackson - When God Paints.flac
    Peak level: 0.872742 %
    Extraction speed: 9.7 X
    Track quality: 100.0 %
    Test CRC: 388724B2
    Copy CRC: 388724B2
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

  10:
    Filename: Alan Jackson - Angels and Alcohol/10. Alan Jackson - Mexico, Tequila and Me.flac
    Peak level: 1.000000 %
    Extraction speed: 13.4 X
    Track quality: 100.0 %
    Test CRC: E1555D5F
    Copy CRC: E1555D5F
    AccurateRip v1:
      Track not present in AccurateRip database
    Copy OK

AccurateRip Summary:
  None of the tracks are present in the AccurateRip database

Errors:
  No errors occurred

End of status report

Log checksum: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Thanks for your work in solving this issue.

MerlijnWajer commented 8 years ago

Thanks for showing me the logger, it looks useful for my project. I have managed to get ARV1 and ARV2 to work decently in my local code base. I'm going to try and clean it up soonish. Have to wrap up a few more things first. If I seem unresponsive, I might have forgotten, so do not hesitate to remind/poke me.

JoeLametta commented 8 years ago

Thanks for your work, I'll wait for the result! (and remind you if you forget)

JoeLametta commented 8 years ago

@MerlijnWajer I don't know if I'm too early but this is my reminder. :wink:

JoeLametta commented 8 years ago

@MerlijnWajer Hope it's not too early: another reminder.

MerlijnWajer commented 8 years ago

Hi Joe, it's not too early - I've been busy with related things. Actually, I'd like to ask you a few questions - can you drop me a message so I know how to contact you? (IRC: 'Wizzup' on freenode - I idle in the morituri channel, or email addr as on wizzup.org)

(Sorry to spam the bug report like this, hoping that off-issue discussion will lead to less spam instead)