worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License

Distributed database #98

Open denkomanceski opened 8 years ago

denkomanceski commented 8 years ago

Do I need a distributed database if I have 200 million fingerprints, or is MySQL good enough to handle this? I have a good server (2x Xeon, 32 GB RAM). Also, if MySQL isn't able to handle 200 million fingerprints, what is the best approach? I usually receive 1 request per second.

If a distributed database is needed, then which approach is best?

Thank you

JPery commented 6 years ago

Do you mean 200 million fingerprints or 200 million songs? If my calculations are right, 200 million songs would come to about 600 TB (~3 MB per song * 200M songs).
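The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), and the function name `estimated_storage_tb` is made up for illustration.

```python
# Back-of-the-envelope storage estimate, assuming decimal units.
BYTES_PER_MB = 10**6
BYTES_PER_TB = 10**12


def estimated_storage_tb(num_songs: int, mb_per_song: float) -> float:
    """Return the total size in TB for num_songs items of mb_per_song MB each."""
    return num_songs * mb_per_song * BYTES_PER_MB / BYTES_PER_TB


# Figures from the comment: ~3 MB per song, 200 million songs.
total_tb = estimated_storage_tb(num_songs=200_000_000, mb_per_song=3)
print(f"{total_tb:.0f} TB")  # prints "600 TB"
```

Changing `mb_per_song` shows how sensitive the total is to the per-song size assumption.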

Which distributed database to use depends on your use case: whether you are going to keep inserting new data, and the volume of requests you are getting.

Also, you would have to implement your own way to insert the data with dejavu and to query the database for matching fingerprints.
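As a rough illustration of what "your own way to insert and query" could look like, here is a minimal sketch of hash-based sharding. It is not dejavu's API: the class `ShardedFingerprintStore`, its methods, and the `(song_id, offset)` layout are all hypothetical, and in-memory dicts stand in for the real database nodes.

```python
import hashlib
from collections import defaultdict


class ShardedFingerprintStore:
    """Hypothetical store; each dict stands in for one database shard."""

    def __init__(self, num_shards: int) -> None:
        self.num_shards = num_shards
        # One mapping per shard: fingerprint hash -> list of (song_id, offset).
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def _shard_for(self, fp_hash: str) -> int:
        # A stable digest keeps routing consistent across processes.
        digest = hashlib.sha1(fp_hash.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards

    def insert(self, fp_hash: str, song_id: int, offset: int) -> None:
        """Route the fingerprint to its shard and store it there."""
        self.shards[self._shard_for(fp_hash)][fp_hash].append((song_id, offset))

    def find_matches(self, fp_hashes):
        """Look up each hash on the shard that owns it and collect the hits."""
        matches = []
        for fp_hash in fp_hashes:
            shard = self.shards[self._shard_for(fp_hash)]
            if fp_hash in shard:  # membership test avoids creating empty entries
                matches.extend(shard[fp_hash])
        return matches


store = ShardedFingerprintStore(num_shards=4)
store.insert("a1b2c3", song_id=1, offset=10)
store.insert("a1b2c3", song_id=2, offset=55)
store.insert("ffee00", song_id=1, offset=20)
print(store.find_matches(["a1b2c3", "deadbeef"]))  # [(1, 10), (2, 55)]
```

Because a given hash always maps to the same shard, a lookup touches only the shards that own the queried hashes. Note that changing `num_shards` later remaps existing keys, so a real deployment would need a migration or a consistent-hashing scheme.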