xiaochunxin closed this issue 4 years ago
That line was actually only for monitoring the number of frames and samples of audio in the dataset.
However, the synthesized samples differ somewhat from the speech samples published by the author. Could the feature extraction be the cause?
Could you explain more about the differences? If you look at the individual values, there might be slight differences due to floating-point precision.
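One way to check whether the differences are only at the level of floating-point precision is to compare the two feature sequences element-wise against a tolerance. The following is a minimal sketch, assuming the features are available as lists of per-frame float vectors. The function names and the tolerance values are hypothetical and are not taken from the repository:

```python
import math


def max_abs_diff(feats_a, feats_b):
    """Return the largest absolute element-wise difference between two feature sequences."""
    if len(feats_a) != len(feats_b):
        raise ValueError("frame counts differ: %d vs %d" % (len(feats_a), len(feats_b)))
    worst = 0.0
    for frame_a, frame_b in zip(feats_a, feats_b):
        if len(frame_a) != len(frame_b):
            raise ValueError("feature dimensions differ")
        for x, y in zip(frame_a, frame_b):
            worst = max(worst, abs(x - y))
    return worst


def within_precision(feats_a, feats_b, rel_tol=1e-5, abs_tol=1e-8):
    """Return True if every pair of values agrees within the given tolerances.

    The default tolerances are assumed values for illustration only.
    """
    if len(feats_a) != len(feats_b):
        return False
    for frame_a, frame_b in zip(feats_a, feats_b):
        if len(frame_a) != len(frame_b):
            return False
        for x, y in zip(frame_a, frame_b):
            if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True
```

If `within_precision` returns False, `max_abs_diff` gives a rough idea of how large the gap is, which helps decide whether feature extraction is a plausible cause.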
Traceback (most recent call last):
  File "../../src/bin/feature_extract.py", line 400, in <module>
    main()
  File "../../src/bin/feature_extract.py", line 395, in main
    logging.info(str(arr[0])+" "+str(arr[1])+" "+str(arr[1]/arr[0])+" "+str(arr[2])+" "+str(arr[2]/arr[0]))
ZeroDivisionError: float division by zero
If this line of code is deleted, the features can still be extracted and training still runs. However, the synthesized samples differ somewhat from the speech samples published by the author. Could the feature extraction be the cause?
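The traceback shows that `arr[0]` is zero when the statistics line is logged. Instead of deleting the line, the division could be guarded. The following is a minimal sketch; the helper name `format_stats` is hypothetical, and the layout of `arr` is assumed from the log statement in the traceback rather than confirmed by the source:

```python
import logging


def format_stats(arr):
    """Format arr[0], arr[1], arr[2] and the two ratios, mirroring the log line in the traceback.

    The ratios are reported as nan when arr[0] is zero, so no exception is raised.
    """
    count = arr[0]
    if count == 0:
        # Assumed handling: report nan rather than dividing by zero.
        ratio1 = ratio2 = float("nan")
    else:
        ratio1 = arr[1] / count
        ratio2 = arr[2] / count
    return " ".join(str(v) for v in (arr[0], arr[1], ratio1, arr[2], ratio2))


logging.basicConfig(level=logging.INFO)
logging.info(format_stats([0.0, 0.0, 0.0]))        # zero denominator, no exception
logging.info(format_stats([2.0, 200.0, 32000.0]))  # regular case
```

A zero in `arr[0]` may also be worth investigating on its own, since it could mean that nothing was counted for that entry.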