First of all, thank you so much for this repository. I am doing some research in the speech domain, and this has been very helpful.
However, I have a couple of questions about it.
1) Why are the extracted windows slightly longer than 1 second rather than exactly 1 second?
2) Can the 1-second window length be increased to several seconds? How would this affect training?
Thanks in advance.