Closed Hassan1175 closed 5 years ago
But padding adds zeros to the data. So why do we aim to equalize the length of the audio signals with zeros?
We add zeros, and sometimes slice the data, to make sure that all samples produce feature vectors of the same size.
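To make the padding and slicing concrete, here is a minimal sketch using NumPy. The function name `fix_length` and the example values are hypothetical and not taken from the project; it only illustrates the general idea of forcing every signal to one common length.

```python
import numpy as np

def fix_length(signal, target_len):
    """Zero-pad or truncate a 1-D signal to exactly target_len samples.

    Hypothetical helper for illustration; not the project's own code.
    """
    signal = np.asarray(signal, dtype=np.float32)
    if len(signal) >= target_len:
        # Too long (or already the right size): keep the first target_len samples.
        return signal[:target_len]
    # Too short: append zeros (silence) at the end.
    return np.pad(signal, (0, target_len - len(signal)), mode="constant")

short = fix_length([0.1, -0.2, 0.3], 5)                 # padded with two zeros
long_ = fix_length([0.1, -0.2, 0.3, 0.4, 0.5, 0.6], 5)  # last sample dropped
```

Both results have 5 samples, so any per-sample feature extraction run on them would yield outputs of the same size.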
So why do we aim to equalize the length of the audio signals with zeros?
Adding zeros adds silence to the voice data, so ideally it shouldn't change anything in the data.
Makes sense. Why not take the longest length in our data as the standard length, instead of the mean length? That way we would not need to slice the data, so we would not lose any data. Isn't it?
The rationale is that the longest signal in the data often contains a lot of noise, and padding everything to the longest length involves a lot of computation. You are free to change the parameter and try the maximum length on your dataset. If that gives you better results, you can proceed with that.
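The trade-off between the mean and the maximum length can be sketched roughly as follows. The `target_length` helper, the `strategy` argument, and the sample lengths are all made up for illustration; the project's actual parameter may be named and computed differently.

```python
import numpy as np

def target_length(lengths, strategy="mean"):
    """Pick a common length from per-file sample counts (hypothetical helper)."""
    lengths = np.asarray(lengths)
    if strategy == "mean":
        return int(lengths.mean())
    if strategy == "max":
        return int(lengths.max())
    raise ValueError(f"unknown strategy: {strategy}")

# Made-up sample counts: three similar files and one long outlier.
lengths = [16000, 18000, 20000, 90000]

mean_len = target_length(lengths, "mean")
max_len = target_length(lengths, "max")

# Total samples to process after every file is forced to the common length.
total_mean = mean_len * len(lengths)
total_max = max_len * len(lengths)

# Samples cut off when truncating to the mean, and zeros added when padding to the max.
lost_with_mean = sum(max(n - mean_len, 0) for n in lengths)
zeros_with_max = sum(max_len - n for n in lengths)
```

With these numbers, a single outlier more than doubles the amount of data to process under the maximum-length choice, and most of the extra samples are padding zeros. Under the mean-length choice only the outlier is truncated.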
Got it. Thanks a lot for your reply.
I know the length of voice signals varies from file to file, or you could say that there may be some outliers in the dataset. But padding adds zeros to the data. So why do we aim to equalize the length of the audio signals with zeros? If we add zeros to the data, will it distort the original data? If yes, then why are we padding the data?
My second question is how voice data is normalized. Did you normalize the data in the current project?