Open TTTJJJWWW opened 5 years ago
Hi, VGGVox doesn't use MFCC, only the FFT spectrum. The signal processing code is in `sigproc.py`.
@linhdvu14 Hi, thank you for your reply. I have doubts about "VGGVox doesn't use MFCC", because the VGGVox source code contains an MFCC function (from the MFCC folder) and uses it: function [ SPEC ] = mfccspec( speech, fs, Tw, Ts, alpha, window, R, M, N, L ) % MFCC Mel frequency cepstral coefficient feature extraction. ...
Yes, but if you look at the code of `mfccspec`, the return value `SPEC` is only the FFT.
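To illustrate the distinction being made here, the following is a minimal NumPy sketch of an FFT magnitude spectrogram. It is not the code from `sigproc.py` or `mfccspec`; the function name `magnitude_spectrogram` and the parameter values (400-sample frames, 160-sample hop, 512-point FFT, Hamming window) are assumptions chosen for illustration. The point is that the pipeline stops after the FFT: turning it into MFCCs would require a further mel filterbank, log, and DCT step.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Return an (n_fft // 2 + 1, n_frames) array of FFT magnitudes.

    Hypothetical helper: the defaults are illustrative assumptions,
    not values taken from the VGGVox code.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames, one frame per row.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # One-sided FFT magnitude of each frame. No mel filterbank, log,
    # or DCT is applied, so these are spectrum values, not MFCCs.
    spec = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    return spec.T


# Usage: a one-second 1 kHz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = magnitude_spectrogram(tone)
print(spec.shape)
```

The result is a 2-D array (frequency bins by time frames), which is why it can be treated like a single-channel image by a convolutional network.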
Oh, I see. So you mean that the features of the wav file are fed into the model as an image (a grey-scale image)? And the system essentially calculates the similarity (distance) between images?
@linhdvu14 Hi, does "weights.h5" store both the architecture and the weights, or just the weights? I want to convert it to a TensorFlow model (.pb). Can I just use the "keras_to_tensorflow" tool to do it? Looking forward to your reply.
It's just weights. You'd probably want to export both weights and architecture before trying `keras_to_tensorflow`. Or replicate the model architecture in tf and restore weights from a dict.
@linhdvu14 Hi, thanks for your code. I know you are using the model with weights from VGGVox, but where is the MFCC processing? Or do you use different features?