facebookresearch / svoice

We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
Other
1.25k stars 179 forks

the bee---------- sound? #71

Open cchoi1022 opened 2 years ago

cchoi1022 commented 2 years ago

There's a high-pitched sound (something like "bi-----", "phi---", or "fi----") in the separated audio when I train on custom data. I'm not sure why, but it's quite disruptive, and I think it's affecting the evaluation results. How do I fix this?

joeoct93 commented 2 years ago

I also have this problem. Is there a solution to this?

qalabeabbas49 commented 1 year ago

I don't know how long you trained the model, but in my experience, if your data is good, the bee---- sound disappears later in training.
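
One way to check whether the tone is actually fading as training goes on is to measure how strongly a single frequency stands out in the separated output from each checkpoint. The snippet below is only a rough sketch and is not part of svoice: `tonal_peak` is a hypothetical helper, the 1 kHz lower bound and the 8 kHz sample rate are assumptions, and the demo uses synthetic noise plus a 3 kHz sine in place of real separated audio.

```python
import numpy as np


def tonal_peak(signal, sample_rate, min_freq=1000.0):
    """Hypothetical helper: find the strongest frequency above min_freq.

    Returns (peak_freq_hz, peak_to_median_ratio). A large ratio suggests a
    narrowband tone (a "beep") rather than broadband speech or noise.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # A Hann window reduces spectral leakage so a steady tone stays narrow.
    window = np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Only look at the band where the high-pitched artifact is expected.
    band = freqs >= min_freq
    band_spec = spectrum[band]
    peak_idx = int(np.argmax(band_spec))
    ratio = band_spec[peak_idx] / (np.median(band_spec) + 1e-12)
    return float(freqs[band][peak_idx]), float(ratio)


# Synthetic stand-ins for separated audio (assumed 8 kHz sample rate).
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
noise = 0.1 * rng.standard_normal(sr)                 # output without a beep
beep = noise + 0.05 * np.sin(2 * np.pi * 3000.0 * t)  # output with a 3 kHz beep

clean_freq, clean_ratio = tonal_peak(noise, sr)
beep_freq, beep_ratio = tonal_peak(beep, sr)
print(f"no beep: peak at {clean_freq:.0f} Hz, ratio {clean_ratio:.1f}")
print(f"beep:    peak at {beep_freq:.0f} Hz, ratio {beep_ratio:.1f}")
```

Running the same measurement on outputs from successive checkpoints (loaded as mono float arrays) should show the ratio dropping if the tone really goes away with more training. If it stays high, that would point toward the data rather than the training length.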