f90 / Seq-U-Net

Official implementation of the Seq-U-Net for efficient sequence modelling
MIT License

About the application #1

Open Lerry123 opened 4 years ago

Lerry123 commented 4 years ago

Hello, I have used the Seq-U-Net with residual connections for speech enhancement. When I train and test on the large dataset, the results are very bad. Could you give me some suggestions about this issue?

f90 commented 4 years ago

Hey, can you clarify more about how you applied the Seq-U-Net? I need some more info about the setting so I can give you some clues as to what the problem might be!

Lerry123 commented 4 years ago

I split the speech with a sliding window of about 1 s, which is 16,384 samples per window. These windows were fed into the Seq-U-Net configuration used for speech waveform generation in the original paper, with the number of channels changed to 1. I trained on the VCTK database: the training set has 11,578 samples and the test set has 874. When I trained on all the samples, the enhanced speech was distorted. But when I trained with 50 samples and tested with 30, there was no distortion in the enhanced speech.
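For reference, the windowing described above could be sketched roughly as follows. This is only an illustration, not the code actually used in this thread or in the repository: the function name `frame_waveform`, the `hop` parameter, the zero-padding of the last window, and the `(windows, channels, samples)` layout are all assumptions.

```python
import numpy as np


def frame_waveform(waveform, window=16384, hop=16384):
    """Split a mono waveform into fixed-length windows (hypothetical helper).

    Returns an array of shape (num_windows, 1, window); the channel axis of
    size 1 is an assumed layout for a single-channel model input.
    """
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim != 1:
        raise ValueError("expected a 1-D mono waveform")

    n = len(waveform)
    # Number of windows needed to cover every sample at least once.
    num_windows = 1 + int(np.ceil((n - window) / hop)) if n > window else 1

    # Zero-pad the tail so the final window is full length (an assumption;
    # dropping the incomplete window would be another option).
    padded_len = (num_windows - 1) * hop + window
    padded = np.pad(waveform, (0, padded_len - n))

    frames = np.stack(
        [padded[i * hop : i * hop + window] for i in range(num_windows)]
    )
    return frames[:, np.newaxis, :]


# Example: 40,000 samples are covered by three non-overlapping windows.
frames = frame_waveform(np.arange(40000))
print(frames.shape)
```

Setting `hop` smaller than `window` gives overlapping windows; whether overlap was used here is not stated.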

Lerry123 commented 4 years ago

I used raw_audio from the original code and set both the input size and the output size to 16384.