kb-it / kotori

Server- and client-side application for managing and tagging local song files, with features for detecting duplicates
MIT License

Fingerprinting & ID3 Design #1

Open steffengy opened 6 years ago

steffengy commented 6 years ago
kb-it commented 6 years ago

Thanks for the great algorithm comparison.

It seems Panako would be the best algorithm, as it has the highest accuracy and precision, is independent of playback speed, and performs well with fragments of songs. But Panako would introduce a serious drawback: it is only available as a Java application and would have to be ported to JS/TS, which could be quite a cumbersome and time-consuming task.

Here are the relevant results from the comparison:

  • One major failing of Chromaprint is that it requires queries to be taken from the beginning of the recording.
  • Chromaprint could not detect audio when the query was taken from the middle section of the audio.
  • Hence, Echoprint and Panako are better choices than Chromaprint.
  • For short signals, both Echoprint and Panako perform well.
  • For a 30-second query from the start, Panako gave more accurate results than Echoprint.

Though Panako has the best recognition rate and is recommended by the author of the comparison, I would not vote for it because of the drawback I already mentioned.

In my opinion Echoprint could be an acceptable solution for this project, as it performs better than Chromaprint (but not as well as Panako) and there are some applications that already implement some of our requirements, e.g. Node-Echoprint-Server. 😉 But we still need a solution for creating fingerprints. During my short research I only came across a C++ application that handles this task, Echoprint-Codegen, and a Node wrapper that depends on this application, Echoprint-Codegen.

So what is your opinion about which algorithm to use? 🤔 I think Tag-Mapping is a no-brainer 👍

steffengy commented 6 years ago

Echoprint also has an npm package: https://www.npmjs.com/package/echoprint

I think that for our use case, normal-length songs that are mostly complete, Chromaprint might still be the better (and more stable) choice?

Especially since database-level support for it is available and already implemented, while I'm unsure what the story with Echoprint would be. (In the worst case we'd have to write a comparison "algorithm" ourselves.)
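
To give an idea of what a hand-written comparison could look like, here is a minimal sketch. It assumes fingerprints are arrays of 32-bit integers (the shape of Chromaprint's raw fingerprints) and scores two of them by the fraction of matching bits, trying a few alignments. It is not Echoprint's matching scheme and not the database-level support mentioned above; the names `popcount32`, `similarity` and `bestSimilarity` and all default values are made up for illustration.

```typescript
// Count the set bits of a 32-bit integer (classic SWAR popcount).
function popcount32(x: number): number {
  x = x >>> 0;
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

// Fraction of equal bits when a[i] is compared with b[i + offset].
// Returns 0 if fewer than minOverlap items overlap at this offset.
function similarity(
  a: readonly number[],
  b: readonly number[],
  offset = 0,
  minOverlap = 1,
): number {
  let overlap = 0;
  let differingBits = 0;
  for (let i = 0; i < a.length; i++) {
    const j = i + offset;
    if (j < 0 || j >= b.length) continue;
    differingBits += popcount32(a[i] ^ b[j]);
    overlap++;
  }
  if (overlap < minOverlap) return 0;
  return 1 - differingBits / (overlap * 32);
}

// Try every alignment within +/- maxOffset items and keep the best score,
// so that a few items of leading silence or trimming do not hide a match.
function bestSimilarity(
  a: readonly number[],
  b: readonly number[],
  maxOffset = 8,
  minOverlap = 1,
): number {
  let best = 0;
  for (let offset = -maxOffset; offset <= maxOffset; offset++) {
    best = Math.max(best, similarity(a, b, offset, minOverlap));
  }
  return best;
}
```

A duplicate check would then be something like `bestSimilarity(fpA, fpB) > threshold`, where the threshold is something we would have to tune on real files.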

We could also try a couple of them if enough time is left, since swapping them does not require many changes from an architectural perspective.
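
One way to keep the algorithm swappable might be a small interface that every backend implements. This is only a sketch: `Fingerprinter`, `FingerprinterRegistry` and `DummyFingerprinter` are hypothetical names, and the dummy backend is a placeholder, not a real fingerprint algorithm or a wrapper around any existing library.

```typescript
// Common contract for any fingerprinting backend (hypothetical).
interface Fingerprinter {
  readonly name: string;
  // Takes decoded mono PCM samples and returns an opaque fingerprint string.
  fingerprint(samples: Float32Array, sampleRate: number): Promise<string>;
}

// Keeps track of the available backends so the rest of the app
// only needs to know the name of the one it should use.
class FingerprinterRegistry {
  private readonly backends = new Map<string, Fingerprinter>();

  register(backend: Fingerprinter): void {
    this.backends.set(backend.name, backend);
  }

  get(name: string): Fingerprinter {
    const backend = this.backends.get(name);
    if (backend === undefined) {
      throw new Error(`Unknown fingerprinter: ${name}`);
    }
    return backend;
  }

  names(): string[] {
    return Array.from(this.backends.keys());
  }
}

// Placeholder backend for the sketch. It does NOT compute a real
// fingerprint; a real backend would call Chromaprint, Echoprint, etc.
class DummyFingerprinter implements Fingerprinter {
  readonly name = "dummy";

  async fingerprint(samples: Float32Array, sampleRate: number): Promise<string> {
    return `${sampleRate}:${samples.length}`;
  }
}
```

With something like this, evaluating a second algorithm would mean adding one more class and switching the name passed to `get`.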

EDIT: Echoprint seems preferable since it performs better; we'll evaluate what works.