Questions about the training process

Audio-WestlakeU / FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

https://fullsubnet.readthedocs.io/en/latest/

MIT License


Questions about the training process #5

Closed ahikaml closed 3 years ago

ahikaml commented 3 years ago

Very interesting project. Thank you for sharing.

I have a question: what are the text files noise.txt, rir.txt and clean_0.6.txt? Are they part of the original dataset, or dedicated files that you created for the training?

Another question: is it possible to run it on Windows without the "dist" feature, using a single GPU? (I mean after commenting out all the parts related to 'dist'.)

vvasily commented 3 years ago

I think you should create them manually for the DNS-Challenge/datasets/clean and DNS-Challenge/datasets/noise dirs. noise.txt and clean_0.6.txt each contain the list of files (with absolute paths) from these dirs. rir.txt is the file with RIRs (Room Impulse Responses), which you can download separately somewhere. The dataset wav is: clean + noise + RIR. If you don't want to use RIRs, just leave rir.txt empty and set reverb_proportion to 0.0 in the *.toml file.
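In case it helps, here is a minimal sketch of how such list files could be generated. It assumes the lists are plain text with one absolute path per line, which is how I read the description above. The helper name write_file_list is made up for illustration and is not part of FullSubNet, and the script simply lists every .wav file it finds; whatever selection the "0.6" in clean_0.6.txt refers to is not reproduced here.

```python
from pathlib import Path


def write_file_list(audio_dir, list_path, suffix=".wav"):
    """Write the absolute path of every `suffix` file under audio_dir, one per line.

    Hypothetical helper for illustration; not part of the FullSubNet code base.
    Returns the number of paths written.
    """
    root = Path(audio_dir).expanduser().resolve()
    # Search recursively, since the dataset dirs may contain sub-folders.
    paths = sorted(
        str(p) for p in root.rglob("*") if p.is_file() and p.suffix.lower() == suffix
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return len(paths)
```

For example, write_file_list("DNS-Challenge/datasets/noise", "noise.txt") would produce the noise list, and the same call on the clean dir would produce the clean list. If you skip RIRs, an empty rir.txt can be created with Path("rir.txt").touch().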

haoxiangsnr commented 3 years ago

Sorry, I have not had time to deal with this issue recently. You could try following @vvasily's explanation.

I think this issue might be solved. If you have any questions, please feel free to open a new issue.