Closed Siboooo closed 1 year ago
The non-deterministic MFCC extraction comes from the "dithering" step in Kaldi, which adds a small amount of random Gaussian noise to the input waveform before the features are computed. See this thread and Dan's comments: https://groups.google.com/g/kaldi-help/c/LOD4A7Z9hYY/m/66ZL00fUAAAJ.
Does it affect the accuracy of the PPG and the model? If not, why?
It should not, since the acoustic model was already trained to tolerate the "dithering". The accent conversion model was also trained with dropout, so it can tolerate small fluctuations in the PPG signal.
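To make the dithering effect concrete, here is a minimal sketch in plain Python. It is not Kaldi's code: the `dither` and `frame_energy` helpers and the toy waveform are hypothetical, and the frame energy only stands in for a real MFCC computation. It shows that per-sample Gaussian noise makes two runs on the same input differ, and that setting the dither amount to zero or fixing the random seed makes the result repeatable.

```python
import random


def dither(samples, dither_value=1.0, rng=None):
    """Add zero-mean Gaussian noise, scaled by dither_value, to each sample.

    Illustrative stand-in for a dithering step, not Kaldi's implementation.
    """
    if rng is None:
        rng = random  # module-level generator, seeded differently on each run
    if dither_value == 0.0:
        return list(samples)  # dithering disabled: samples pass through unchanged
    return [s + dither_value * rng.gauss(0.0, 1.0) for s in samples]


def frame_energy(samples):
    """Toy 'feature': sum of squared samples (stands in for MFCC extraction)."""
    return sum(s * s for s in samples)


# Hypothetical toy waveform (real input would be 16-bit PCM samples).
waveform = [100.0, -50.0, 25.0, 0.0, 75.0]

# Same input, two runs with dithering on: the features differ slightly.
run_a = frame_energy(dither(waveform))
run_b = frame_energy(dither(waveform))
differs = run_a != run_b

# Dithering off: the output is exactly reproducible.
no_dither_a = frame_energy(dither(waveform, dither_value=0.0))
no_dither_b = frame_energy(dither(waveform, dither_value=0.0))

# Dithering on, but with a fixed seed: also reproducible.
seeded_a = frame_energy(dither(waveform, rng=random.Random(0)))
seeded_b = frame_energy(dither(waveform, rng=random.Random(0)))

print("dithered runs differ:", differs)
print("no-dither runs match:", no_dither_a == no_dither_b)
print("seeded runs match:", seeded_a == seeded_b)
```

If you need bit-identical features, Kaldi's frame extraction options include a dither setting where 0.0 means no dithering. Whether this repo's `mfcc.compute_features` wrapper exposes that option is an assumption you would need to check, and turning it off changes the input slightly from what the acoustic model saw in training.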
Hey, bro. Thank you for sharing your great work. I was trying to extract PPG features from my own audio, but the resulting features are different even when the input audio is the same. I figured out that this is caused by the Kaldi function "mfcc.compute_features" returning different MFCCs for the same input. Does it affect the accuracy of the PPG and the model? If not, why?