dominikschnitzer / musly

Musly - Music Similarity Library
http://www.musly.org
Mozilla Public License 2.0

Comparing two mp3 files #40

Closed by doublex 7 years ago

doublex commented 7 years ago

Hello, is it possible to calculate whether two MP3 files are similar or not? Thanks! Markus

f0k commented 7 years ago

Well, sort of... musly is meant to produce recommendations based on music similarity, which is not necessarily well suited to duplicate identification (assuming this is what you're after). You can use the command line tool to try it on two given files:

musly -c tmp.coll -N
musly -c tmp.coll -a file1.mp3
musly -c tmp.coll -a file2.mp3
musly -c tmp.coll -k 1 -s tmp.txt
cat tmp.txt

This will give the similarity value for the two files. You can check whether this is enough to find duplicates by comparing the value to some threshold. Of course, if you want to apply this on a larger scale, you will want to use the library directly, not the command line.

/edit: Fixed to use separate calls for file1.mp3 and file2.mp3 after @anchengjian's note.

anchengjian commented 6 years ago

$ musly -c tmp.coll -N
Music Similarity Library (Musly) - http://www.musly.org
Version: 0.2
(c) 2013-2014, Dominik Schnitzer <dominik@schnitzer.at>
    2014-2016, Jan Schlüter <jan.schlueter@ofai.at>

Initialized music similarity method: timbre
~~~
A timbre only music similarity measure based 'mandelellis'. It
improves the basic measure in multiple ways to achieve superior
results:
We compute a single Gaussian representation from the songs
using 25 MFCCs. The similarity between two tracks is computed
with the Jensen-Shannon divergence. The Similarities are
normalized with Mutual Proximity:
D. Schnitzer et al.: Using mutual proximity to improve
content-based audio similarity. In the proceedings of the 12th
International Society for Music Information Retrieval
Conference, ISMIR, 2011.
~~~
Installed audio decoder: libav
Initializing new collection: tmp.coll
Initialization result: OK.

$ musly -c tmp.coll -a test/audio/test1-0.mp3 -a test/audio/test1-1.mp3 
Music Similarity Library (Musly) - http://www.musly.org
Version: 0.2
(c) 2013-2014, Dominik Schnitzer <dominik@schnitzer.at>
    2014-2016, Jan Schlüter <jan.schlueter@ofai.at>

Error: Invalid parameter combination!
Use '-h' for more information.

This gives me an error.

f0k commented 6 years ago

Oh, you're right, it doesn't allow multiple -a commands. The command line tool is really basic... You'll either need to call it twice, or pass a directory to -a which will then be crawled recursively for files (supply an extension via -x if there are non-audio files you need to exclude).
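
The "call it twice" option can be wrapped in a small shell helper. This is only a sketch: `add_each` and the `MUSLY` override variable are hypothetical names introduced here, not part of musly. The only musly options it uses are `-c` and `-a`, as in the commands earlier in this thread.

```shell
#!/bin/sh
# Sketch: add several files to a Musly collection with one -a call per file,
# since the command line tool rejects multiple -a options in a single call.

# MUSLY is a hypothetical override (not part of musly); it defaults to the
# real binary but can be pointed at another command for a dry run.
MUSLY="${MUSLY:-musly}"

# add_each COLLECTION FILE...
# Hypothetical helper: runs "$MUSLY -c COLLECTION -a FILE" once per file
# and stops at the first call that fails.
add_each() {
    coll="$1"
    shift
    for f in "$@"; do
        "$MUSLY" -c "$coll" -a "$f" || return 1
    done
}
```

With the collection already initialized via `musly -c tmp.coll -N`, calling `add_each tmp.coll file1.mp3 file2.mp3` would issue the same two `-a` calls as the corrected commands above, after which `musly -c tmp.coll -k 1 -s tmp.txt` can be run as before.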