maum-ai / voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system
http://swpark.me/voicefilter

Real-time inference #3

Closed. kyungjin-lee closed this issue 5 years ago

kyungjin-lee commented 5 years ago

Hi, I'd like to use this voice filter in real time. Would it be possible to modify the inference code so that the model runs in real time on PCM audio data?

seungwonpark commented 5 years ago

Hi @kyungjin-lee, yes, but it depends on what you mean by “real-time”.

On a V100 GPU, I confirmed that inference takes less time than the duration of the input audio. However, if you mean streaming, you will have to replace the BiLSTM with a unidirectional LSTM and change a few other things as well. The model would then have to be retrained, of course.
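To illustrate why the unidirectional LSTM matters for streaming, here is a minimal sketch, not code from this repository. It shows that a unidirectional `nn.LSTM` gives the same output whether it sees the whole utterance at once or receives it chunk by chunk with its hidden state carried over. The sizes `FEAT_DIM`, `HIDDEN_DIM` and `CHUNK_FRAMES` and the helper `stream` are hypothetical names chosen for the example, and the LSTM here is randomly initialized rather than a trained VoiceFilter layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes for illustration only; not this repository's configuration.
FEAT_DIM, HIDDEN_DIM, CHUNK_FRAMES = 40, 64, 10

# Unidirectional LSTM: each output frame depends only on current and past frames.
lstm = nn.LSTM(FEAT_DIM, HIDDEN_DIM, batch_first=True, bidirectional=False)
lstm.eval()


def stream(model, frames, chunk_frames):
    """Run the LSTM chunk by chunk, carrying (h, c) from one chunk to the next."""
    state = None  # None means a zero initial state
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(frames, chunk_frames, dim=1):
            out, state = model(chunk, state)
            outputs.append(out)
    return torch.cat(outputs, dim=1)


# Random stand-in for a sequence of spectrogram frames: (batch, time, features).
frames = torch.randn(1, 95, FEAT_DIM)

with torch.no_grad():
    offline, _ = lstm(frames)  # whole utterance in one call

streamed = stream(lstm, frames, CHUNK_FRAMES)  # same frames, 10 at a time

# Largest absolute difference between the two outputs.
max_diff = (offline - streamed).abs().max().item()
print(f"max difference between offline and streamed output: {max_diff:.2e}")
```

A bidirectional LSTM cannot be run this way, because its backward direction reads the sequence from the last frame, which is not yet available while audio is still arriving.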

kyungjin-lee commented 5 years ago

Yes, I did mean streaming. I'll give it a shot. Thanks!

xiaozhuo12138 commented 4 years ago

If the CNN takes only one frame of data as input at a time, will the results be much worse?

johannesmols commented 3 years ago

kyungjin-lee wrote: "Yes, I did mean streaming. I'll give it a shot. Thanks!"

Did you have any luck with this? I am also interested in a real-time application of this.