dhowden / tag

ID3, MP4 and OGG/FLAC metadata parsing in Go
BSD 2-Clause "Simplified" License

Same mp3 files with slightly different tags do not produce the same sum #53

Closed (ugjka closed 4 years ago)

ugjka commented 5 years ago

The 2 offending files in a zip file: https://dl.ugjka.net/sumtest.zip

Code:

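package main

import (
    "fmt"
    "log"
    "os"

    "github.com/dhowden/tag"
)

func main() {
    // Both files hold the same audio; only their tags differ slightly.
    for _, name := range []string{"test.mp3", "test2.mp3"} {
        f, err := os.Open(name)
        if err != nil {
            log.Fatal(err)
        }
        // Identify reports the tag format and the file type.
        fmt.Println(tag.Identify(f))
        // Sum is meant to ignore metadata, so both files should print the same value.
        fmt.Println(tag.Sum(f))
        f.Close()
    }
}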

Output:

ID3v2.3 MP3 <nil>
533abd81b8664e1371f7013f6453d2cce5ed28f5 <nil>
ID3v2.3 MP3 <nil>
741efd6ba8e88a3d1278160b952ac4b634edc684 <nil>

wader commented 5 years ago

There is something fishy going on in sizeToEndOffset (https://github.com/dhowden/tag/blob/master/sum.go#L102), which is used by SumID3v2. The seeking seems to be wrong somehow, but I'm not sure I understand how that function is supposed to be used.

ugjka commented 5 years ago

I'm pretty sure this is an abandoned project. I forked the project, and there are so many things wrong.

I also found out that Google Music mp3 files have APEv2 tags at the end of the file, which messes up even my own fixed checksumming.

I think I will try to figure out how to extract just the codec data and checksum that.
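A rough sketch of that idea, assuming the whole file is already in memory: skip a leading ID3v2 tag, drop a trailing ID3v1 tag and a trailing APEv2 tag, and hash what is left. The names `audioBounds` and `sumAudio` are hypothetical and are not part of the tag package. The sketch also ignores other cases (APE tags at the start of the file, appended ID3v2 tags, Xing/LAME info frames), so treat it as an illustration rather than a complete solution.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// audioBounds (hypothetical) returns the [start, end) range of b that is left
// after skipping a leading ID3v2 tag and trailing ID3v1 and APEv2 tags.
func audioBounds(b []byte) (start, end int) {
	end = len(b)

	// ID3v2: 10-byte header whose last 4 bytes are a synchsafe size
	// (7 bits per byte) that excludes the header itself.
	if len(b) >= 10 && string(b[:3]) == "ID3" {
		size := int(b[6]&0x7f)<<21 | int(b[7]&0x7f)<<14 | int(b[8]&0x7f)<<7 | int(b[9]&0x7f)
		start = 10 + size
		if b[5]&0x10 != 0 { // footer flag (ID3v2.4) adds another 10 bytes
			start += 10
		}
		if start > end {
			start = end
		}
	}

	// ID3v1: fixed 128-byte block at the very end, starting with "TAG".
	if end-start >= 128 && string(b[end-128:end-125]) == "TAG" {
		end -= 128
	}

	// APEv2: 32-byte footer starting with "APETAGEX". The little-endian size
	// field covers the items and the footer, but not the optional header.
	if end-start >= 32 && string(b[end-32:end-24]) == "APETAGEX" {
		footer := b[end-32 : end]
		size := int(binary.LittleEndian.Uint32(footer[12:16]))
		flags := binary.LittleEndian.Uint32(footer[20:24])
		if flags&(1<<31) != 0 { // tag also has a 32-byte header
			size += 32
		}
		if size >= 32 && size <= end-start {
			end -= size
		}
	}
	return start, end
}

// sumAudio (hypothetical) returns the hex SHA-1 of the bytes inside audioBounds.
func sumAudio(b []byte) string {
	start, end := audioBounds(b)
	h := sha1.Sum(b[start:end])
	return hex.EncodeToString(h[:])
}

func main() {
	audio := []byte("pretend these are MPEG frames")

	// The same audio wrapped in an empty ID3v2.3 header and an ID3v1 trailer.
	id3v2 := []byte{'I', 'D', '3', 3, 0, 0, 0, 0, 0, 0}
	id3v1 := append([]byte("TAG"), make([]byte, 125)...)
	tagged := append(append(append([]byte{}, id3v2...), audio...), id3v1...)

	fmt.Println(sumAudio(audio) == sumAudio(tagged)) // prints "true"
}
```

The trailing tags are checked in the order ID3v1 then APEv2 because an APEv2 tag, when both are present, sits in front of the ID3v1 block.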

wader commented 5 years ago

Mm, not much activity, so a fork that fixes things and collects fixes from various other forks would be great.

dhowden commented 5 years ago

Yeah, apologies. I haven't had spare time to test/merge changes lately.

The "Sum" functionality was only implemented to de-dupe my own collection of mp3/m4a etc. files, so it was designed and tested on those. It's likely there are a bunch of problems.

I hope to have some time in the next few weeks to go through the issues and tidy up some of the problems.

wader commented 5 years ago

@dhowden No worries. For my use case the package is working great so many thanks!