worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License

[SQLBackend] Strange number of fingerprints saved to DB #23

Closed pguridi closed 10 years ago

pguridi commented 10 years ago

When comparing the results of the ORMBackend and the SQLBackend, I found something strange:

For the same file (tests/test1.mp3 in my fork), this is the number of rows saved in the fingerprint table by each backend:

ORM: 6457, SQL: 4851

The strange thing is that if I print the number of hashes just before the insert, the total is 6457 (3406 for the first channel and 3051 for the second).

In "database_sql.py.insert_hashes", I can confirm that it is trying to insert 6457 hashes. But when I check the MySQL database, I get only 4851!

(This can be easily reproduced by running the unit tests.)

Wessie commented 10 years ago

We use a UNIQUE constraint in the SQLBackend, as can be seen in the create table query, and we ignore any constraint errors when inserting hashes, as can be seen in the INSERT_FINGERPRINT query.
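To illustrate the effect described here, below is a minimal sketch using Python's built-in `sqlite3` module rather than MySQL. The table layout, the column names and the sample rows are assumptions made for illustration, not dejavu's actual schema, and SQLite's `INSERT OR IGNORE` stands in for the ignore-on-conflict behaviour of the real query. It shows how the number of rows stored can be lower than the number of hashes passed to the insert.

```python
import sqlite3


def count_stored(rows):
    """Insert (hash, sid, offset) rows into a throwaway table and return the row count.

    Hypothetical schema: a UNIQUE constraint over (hash, offset, sid),
    with conflicting rows silently skipped on insert.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fingerprints ("
        ' hash TEXT NOT NULL, sid INTEGER NOT NULL, "offset" INTEGER NOT NULL,'
        ' UNIQUE (hash, "offset", sid))'
    )
    # INSERT OR IGNORE drops any row that would violate the UNIQUE constraint
    # instead of raising an error, so duplicates vanish without a trace.
    conn.executemany(
        'INSERT OR IGNORE INTO fingerprints (hash, sid, "offset") VALUES (?, ?, ?)',
        rows,
    )
    (stored,) = conn.execute("SELECT COUNT(*) FROM fingerprints").fetchone()
    conn.close()
    return stored


# Made-up sample data: two channels of the same song (sid 1).
channel_1 = [("aaa", 1, 10), ("bbb", 1, 20), ("ccc", 1, 30)]
channel_2 = [("aaa", 1, 10), ("bbb", 1, 25)]  # first row duplicates channel_1

attempted = len(channel_1) + len(channel_2)
stored = count_stored(channel_1 + channel_2)
print(attempted, stored)
```

In this sketch five rows are handed to the insert but only four are stored, because one (hash, offset, sid) triple appears in both channels.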

pguridi commented 10 years ago

Ahh, good point. Now with a unique constraint in the ORM backend I get only 3,765 hashes :/ I'll debug it further and update the tests later.

Wessie commented 10 years ago

You might've overlooked the UNIQUE constraint being on (hash, offset, sid) and not just hash.
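A small sketch of why the choice of key matters, using made-up rows in (hash, offset, sid) order. The helper `distinct_count` is hypothetical and is not part of dejavu. Deduplicating on the hash alone discards rows that the composite key would keep, which is one possible explanation for a lower count.

```python
def distinct_count(rows, key):
    """Return how many rows survive deduplication under the given key function."""
    return len({key(row) for row in rows})


# Made-up rows as (hash, offset, sid); the same hash can recur at other offsets.
rows = [
    ("aaa", 10, 1),
    ("aaa", 20, 1),  # same hash, different offset
    ("aaa", 10, 1),  # exact duplicate of the first row
    ("bbb", 10, 1),
]

by_hash = distinct_count(rows, key=lambda row: row[0])  # unique on hash only
by_triple = distinct_count(rows, key=lambda row: row)   # unique on (hash, offset, sid)
print(by_hash, by_triple)
```

Here the hash-only key keeps two rows, while the (hash, offset, sid) key keeps three, dropping only the exact duplicate.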

worldveil commented 10 years ago

As the ORM is not yet a part of master (and the hash count shouldn't be a problem if the (hash, offset, sid) constraint is kept), I'm going to mark this closed.