kmarekspartz closed 10 years ago
It depends what you mean. If you have labeled recordings of a speaker and want to recognize those exact recordings being played, then yes.
If, however, you want to recognize (1:N) or verify (1:1) a person by the particular idiosyncrasies of their speech, then no.
dejavu works off of a fingerprinting (read: hashing) system. Like any good hashing scheme, a small perturbation of the input (in dejavu's case, timing and frequency) will produce very different fingerprints.
While dejavu is very robust to noise, it won't work for recognizing a voice, which does not reliably have the same timing or frequency each time. dejavu is meant for recognizing exact duplicates of previously recorded audio.
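To make the point about perturbations concrete, here is a toy sketch of peak-pair hashing. It is not dejavu's actual code: the `fingerprint` function, the `fan_out` parameter, and the hand-picked (time, frequency) peaks are all assumptions made up for illustration. It only shows the general idea that a hash built from two frequencies and the time gap between them survives an exact replay but not small shifts.

```python
import hashlib


def fingerprint(peaks, fan_out=3):
    """Hash nearby pairs of (time, freq) peaks into a set of short digests.

    Hypothetical helper for illustration, not dejavu's API.
    """
    peaks = sorted(peaks)
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        # Pair each peak with the next few peaks that follow it in time.
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            key = f"{f1}|{f2}|{t2 - t1}".encode()
            hashes.add(hashlib.sha1(key).hexdigest()[:20])
    return hashes


# Made-up spectrogram peaks as (time bin, frequency bin).
original = [(0, 40), (3, 55), (7, 62), (12, 48), (15, 70)]

# The same recording played back later: every time is offset by a constant,
# so the time gaps and frequencies are unchanged.
replayed = [(t + 100, f) for t, f in original]

# The same words spoken again: timing and pitch drift by one bin here and there.
respoken = [(0, 41), (4, 56), (7, 63), (13, 47), (15, 71)]

print(len(fingerprint(original) & fingerprint(replayed)))  # 9, all hashes match
print(len(fingerprint(original) & fingerprint(respoken)))  # 0, no hashes match
```

The replay matches on every hash because only the absolute offset changed, while the respoken version shares none, even though each peak moved by at most one bin.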
I was thinking about having a long recording of an individual repeating their name many times, and then as input having them say their name once. The fingerprinting approach may be useful there. I or a friend will try it out when we get a chance and get back to you.
Thanks!
Could this system be used for speaker recognition?