jonathanstowe / Audio-Fingerprint-Chromaprint

Get audio fingerprint using the chromaprint / AcoustID library
Artistic License 2.0

Please add chromaprint_encode_fingerprint and chromaprint_decode_fingerprint #3

Open avuserow opened 10 months ago

avuserow commented 10 months ago

These two functions would let me store compressed fingerprints in the database and then decompress them for comparison. The compression is very effective, especially if you can store the raw (non-base64) version: something like a 4:1 compression ratio.

I would also be interested in the comparison functions, but I read that the chromaprint comparison API is not fully implemented and not exposed in the C headers. I didn't have good luck when trying to use it either.

I do have a simple comparison algorithm that I could contribute. It's written in C and based on algorithms that I've seen in a few other projects (https://codeberg.org/derat/soundalike for one), so maybe it's better off as a secondary module so that this one does not require a C compiler :shrug:
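For readers unfamiliar with this kind of comparison, here is a minimal sketch of one common approach: XOR the raw 32-bit fingerprint items pairwise, count the differing bits, and try a range of alignments to find the best one. It is written in Python for brevity and is not the C implementation mentioned above, nor soundalike's code. The names `bit_error_rate` and `best_match` and the parameters `max_offset` and `min_overlap` are made up for illustration.

```python
def bit_error_rate(a, b):
    """Fraction of differing bits over the overlapping items of two fingerprints."""
    n = min(len(a), len(b))
    if n == 0:
        return 1.0  # nothing to compare: treat as completely different
    # zip() stops at the shorter list, so exactly n pairs are compared.
    diff = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return diff / (32 * n)


def best_match(a, b, max_offset=80, min_overlap=1):
    """Slide b against a and return (lowest bit error rate, offset into a)."""
    best = (1.0, 0)
    for offset in range(-max_offset, max_offset + 1):
        if offset >= 0:
            x, y = a[offset:], b   # b starts `offset` items into a
        else:
            x, y = a, b[-offset:]  # a starts `-offset` items into b
        if min(len(x), len(y)) < min_overlap:
            continue
        rate = bit_error_rate(x, y)
        if rate < best[0]:
            best = (rate, offset)
    return best
```

A rate near 0.0 suggests the same audio, and unrelated audio tends towards 0.5. In practice `min_overlap` would need to be fairly large, because a very short overlap can produce a low rate by chance.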

jonathanstowe commented 8 months ago

Thanks, I'll take a look. I think that's entirely doable.

jonathanstowe commented 8 months ago

Having looked at this, I just realised that in fact the result of .fingerprint is the encoded fingerprint, returned as a string. So whilst the encode and decode methods might be useful, the ability to get the raw fingerprint directly as 32-bit ints may be more useful for the soundalike thing. Also, if you are doing bulk comparisons, the hash function may be useful to discard completely dissimilar files before doing anything more intensive.
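As a rough illustration of that pre-filtering idea, the sketch below reduces a raw fingerprint to a single 32-bit value by per-bit majority vote (a SimHash-style hash) and only treats two files as worth a full comparison when the hashes are close in Hamming distance. This shows the general technique and is not necessarily the exact hash libchromaprint computes. The names `simhash32`, `hamming32` and `maybe_similar` and the `max_distance` threshold of 15 are arbitrary assumptions.

```python
def simhash32(fingerprint):
    """Collapse a list of 32-bit ints into one 32-bit value by per-bit majority vote."""
    votes = [0] * 32
    for item in fingerprint:
        for bit in range(32):
            votes[bit] += 1 if (item >> bit) & 1 else -1
    result = 0
    for bit in range(32):
        if votes[bit] > 0:  # bit is set in more than half of the items
            result |= 1 << bit
    return result


def hamming32(x, y):
    """Number of bit positions in which two 32-bit values differ."""
    return bin((x ^ y) & 0xFFFFFFFF).count("1")


def maybe_similar(fp_a, fp_b, max_distance=15):
    """Cheap pre-filter: True if the two hashes are within max_distance bits."""
    return hamming32(simhash32(fp_a), simhash32(fp_b)) <= max_distance
```

Pairs that pass the filter would still need a proper comparison, and a threshold chosen without measuring it on real data can discard genuine matches.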

avuserow commented 8 months ago

I haven't worked with this code in a bit, but I remember not finding hash very useful for comparison. I did write a compare algorithm in C based on a few other projects, and it does alright. I can give you a C implementation if that's something you want to try out.

I did find a project that implemented a way to do a quick comparison across a very large number of files. It's written in golang and uses a hash table in memory, and I adapted this approach to use a SQLite database. If that's of any interest, I can provide a link to that project. (That seems a bit out of scope for your module but maybe it will be interesting to you in some other area.)
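The details of that project and of the SQLite adaptation are not given here, so the following is only a guess at the general shape: an inverted index from fingerprint item values to file ids, queried to find files that share several exact item values with a new fingerprint. The table layout, the function names (`open_index`, `add_file`, `candidates`) and the `min_shared` cutoff are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def open_index(path=":memory:"):
    """Open (or create) a hypothetical index database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(value INTEGER NOT NULL, file_id INTEGER NOT NULL)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS items_value ON items (value)")
    return db


def add_file(db, file_id, fingerprint):
    """Record each distinct 32-bit item value of a fingerprint against file_id."""
    db.executemany(
        "INSERT INTO items (value, file_id) VALUES (?, ?)",
        [(value, file_id) for value in set(fingerprint)],
    )
    db.commit()


def candidates(db, fingerprint, min_shared=2):
    """Return (file_id, shared_count) pairs, most shared item values first."""
    counts = Counter()
    for value in set(fingerprint):
        query = "SELECT file_id FROM items WHERE value = ?"
        for (file_id,) in db.execute(query, (value,)):
            counts[file_id] += 1
    return [(fid, n) for fid, n in counts.most_common() if n >= min_shared]
```

Exact-value lookup only finds items that match bit for bit, so the files it returns are candidates for a slower bitwise comparison rather than confirmed matches.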

jonathanstowe commented 8 months ago

I've added encode-fingerprint and decode-fingerprint methods, and enabled access to the raw calculated fingerprint.

I think it's probably best to add any additional features not provided by libchromaprint as separate modules.