Hello, I am new to this project and am trying to learn it while solving my problem.
My problem is recognizing the songs in 1-hour-long live FM recordings. My database contains over 2,500 mp3 files, fingerprinted with fingerprint_limit set to 30 seconds.
As a test, I randomly selected 20 files, combined them, saved the first hour as a test file, and tried to recognize which songs are in it.

    #!/usr/bin/python
    # Python 2 script
    import sys
    import json
    import warnings

    from pydub import AudioSegment

    from dejavu import Dejavu
    from dejavu.recognize import FileRecognizer

    reload(sys)
    sys.setdefaultencoding('utf8')
    warnings.filterwarnings("ignore")

    # the config sets "fingerprint_limit": 30
    # database: ~2500 mp3 files, ~37+ million fingerprints
    with open("dejavu.cnf.SAMPLE") as f:
        config = json.load(f)

    if __name__ == '__main__':
        djv = Dejavu(config)
        recognizer = FileRecognizer(djv)

        # ~20 songs randomly selected from the ~2500 mp3 files and combined,
        # then cut to the first hour
        filename = "1.mp3"
        input_song = AudioSegment.from_mp3(filename)
        duration = len(input_song) / 1000  # pydub lengths are in milliseconds
        start = 0
        chunk_length = 20  # seconds; I tried 20-30

        while start < duration:
            # export the next chunk to a temporary file with pydub
            chunk = input_song[start * 1000:(start + chunk_length) * 1000]
            chunk.export("tmp.mp3", format="mp3")
            res_song = recognizer.recognize_file("tmp.mp3")
            if res_song:
                # prints unicode song names correctly
                # match_time is ~10 to 180+ seconds!
                print "Result: %s" % repr(res_song).decode("unicode-escape")
                # song_duration is a new column I added to the songs table
                # (with related changes to fingerprint_file, insert_song, ...).
                # I fill it from audioread.audio_open(filepath).duration,
                # falling back to 0 if that raises.
                start = start + res_song['song_duration'] - 5
            else:
                start = start + 25  # because my fingerprint_limit is 30

This is the code I used; I hope it is understandable.
Total processing time is about 15 to 30 minutes (it depends on chunk_length).
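
To see which stage of the loop accounts for that total, each step can be timed separately. The following is only a minimal sketch using the standard library; `timed`, `report` and `stage_totals` are hypothetical helper names, and the `time.sleep` calls merely stand in for `chunk.export(...)` and `recognizer.recognize_file(...)`.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
stage_totals = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the with-block to stage_totals[stage]."""
    began = time.time()
    try:
        yield
    finally:
        stage_totals[stage] += time.time() - began


def report(totals):
    """Return (stage, seconds) pairs, slowest stage first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stand-ins for the real work done once per chunk in the loop.
    with timed("export"):
        time.sleep(0.01)      # would be chunk.export("tmp.mp3", format="mp3")
    with timed("recognize"):
        time.sleep(0.05)      # would be recognizer.recognize_file("tmp.mp3")
    for stage, seconds in report(stage_totals):
        print("%-10s %.3f s" % (stage, seconds))
```

Wrapping the export and recognition calls inside the while loop this way would show whether the time goes into decoding and fingerprinting the chunk or into the database lookup.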
The problems:
1. Slow: fingerprint.fingerprint(samples, Fs=Fs) takes almost 1 minute on one of these ~20-second chunk files. Is that expected?
2. False matches: I think similar songs are to be expected in my database. In my case the first 30 seconds of some songs are similar, so false results are returned. Of course, if the whole song were matched, the correct result should be returned.
3. Offset: I want to know how to fingerprint from an offset, that is, from a chosen starting position. Right now fingerprinting starts from the beginning of the file; I want it to start from a different point. For example, I want to fingerprint the part of an mp3 file from 00:00:45 to 00:01:15, or its last 30 seconds. If that is possible, I could store a unique part of each song in the database to avoid Problem 2.
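
What Problem 3 asks for comes down to cutting a time window out of the decoded samples before they are fingerprinted. The following is only a sketch of that arithmetic, assuming `samples` is one channel of decoded audio (a list or array) and `fs` is its sample rate; `parse_timestamp`, `slice_by_time` and `last_seconds` are hypothetical helpers, not part of dejavu.

```python
def parse_timestamp(text):
    """Convert 'HH:MM:SS' (or 'MM:SS') into a whole number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def slice_by_time(samples, fs, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds) at sample rate fs."""
    first = max(0, int(start_s * fs))
    last = min(len(samples), int(end_s * fs))
    return samples[first:last]


def last_seconds(samples, fs, seconds):
    """Return the final `seconds` of audio, or everything if the audio is shorter."""
    count = int(seconds * fs)
    return samples[-count:] if count > 0 else samples[:0]


if __name__ == "__main__":
    fs = 10                              # tiny sample rate, for illustration only
    samples = list(range(100 * fs))      # 100 seconds of fake audio
    window = slice_by_time(samples, fs,
                           parse_timestamp("00:00:45"),
                           parse_timestamp("00:01:15"))
    print(len(window) / fs)              # length of the window in seconds
    print(len(last_seconds(samples, fs, 30)) / fs)
```

The sliced array could then be passed on in place of the full one, for example to fingerprint.fingerprint(samples, Fs=Fs). Any offsets computed from it would presumably be relative to the start of the slice rather than the start of the file. The pydub slicing already used in the loop above (`input_song[a_ms:b_ms]`) does the same cut at the file level.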