higher quality enhancement?

santi-pdp / segan

Speech Enhancement Generative Adversarial Network in TensorFlow

MIT License


higher quality enhancement? #9

Open ghost opened 7 years ago

ghost commented 7 years ago

Hi, I am very impressed with the results. However, for my applications, I am interested in 44.1kHz sampling rate speech enhancement. Is there a way to modify the code to train and test on 44.1kHz wav files? That would be awesome if possible. Thanks a lot for your hard work on this project!

santi-pdp commented 7 years ago

Hello @dankorg ,

Yes, you can modify the data generation code (make_tfrecords.py, where the TFRecords are created), as there is a check there that rejects any sampling rate other than 16kHz. Once your dataset is converted to TFRecords you can train the model, but the canvas size will remain 2 ** 14 samples (about 1 second at 16kHz, but only about 0.37 seconds at 44.1kHz), so you won't hear much speech in the samples we extract from G during GAN training to judge whether it is going well or badly... but it could be interesting to check the results. Note that there is a pre-emphasis stage in the v1.1 model that got rid of high freq artifacts at 16kHz, but might have to get tuned for 44.1kHz.
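
To make the two caveats concrete, here is a minimal sketch (plain Python, not the repository's code). It shows how long one 2 ** 14 canvas lasts at each sampling rate, and a generic first-order pre-emphasis filter together with its gain at a given frequency. The function names and the coefficient 0.95 are assumptions for illustration only; the value used in the v1.1 model is not stated in this thread.

```python
import cmath
import math

CANVAS_SIZE = 2 ** 14  # samples per training window, as mentioned above


def canvas_seconds(sample_rate_hz, canvas_size=CANVAS_SIZE):
    """Duration in seconds covered by one canvas at the given sampling rate."""
    return canvas_size / sample_rate_hz


def pre_emphasis(x, coef=0.95):
    """First-order high-pass: y[n] = x[n] - coef * x[n-1], with y[0] = x[0].

    coef=0.95 is an assumed, illustrative value.
    """
    if not x:
        return []
    return [x[0]] + [x[n] - coef * x[n - 1] for n in range(1, len(x))]


def de_emphasis(y, coef=0.95):
    """Inverse of pre_emphasis: x[n] = y[n] + coef * x[n-1]."""
    x = []
    prev = 0.0
    for value in y:
        prev = value + coef * prev
        x.append(prev)
    return x


def pre_emphasis_gain(freq_hz, sample_rate_hz, coef=0.95):
    """Magnitude of H(f) = 1 - coef * exp(-j * 2 * pi * f / fs)."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    return abs(1.0 - coef * cmath.exp(-1j * omega))


if __name__ == "__main__":
    for rate in (16000, 44100):
        print(rate, "Hz ->", round(canvas_seconds(rate), 3), "s per canvas")
    # Same coefficient, same frequency in Hz, different gain per sampling rate.
    for rate in (16000, 44100):
        print(rate, "Hz -> gain at 4 kHz:", round(pre_emphasis_gain(4000, rate), 3))
```

With a fixed coefficient, the filter's response is defined relative to the sampling rate, so the boost applied at a given frequency in Hz changes when moving from 16kHz to 44.1kHz. That is one reason the coefficient may need retuning. Increasing the canvas size to cover a similar duration at 44.1kHz would be a separate change to the model and is not covered by this sketch.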

ghost commented 7 years ago

OK, I will try it soon and let you know if it works. Thanks!

naba89 commented 7 years ago

Hi,

I am just curious what kind of high frequency (HF) noise you faced which prompted you to use the pre-emphasis filter.

> Note that there is a pre-emphasis stage in the v1.1 model that got rid of high freq artifacts at 16kHz, but might have to get tuned for 44.1kHz.

Would be great if you could share some samples of audio with the HF artefacts.

Regards

santi-pdp commented 7 years ago

Hi @naba89 ,

Here you can hear samples from v1.0 (with the HF artifacts) and v1.1 (the latest version; the current weights and the sample pointers in the README refer to v1.1 of SEGAN).

v1.1: https://drive.google.com/open?id=0B6xY-R8JAa8rMHZoQ3Y4dTJIUlk

v1.0: https://drive.google.com/open?id=0B6xY-R8JAa8rYlVwbksyOVg4RTQ

Regards :) Santi

santi-pdp commented 7 years ago

Btw, it's more noticeable with headphones ofc

naba89 commented 7 years ago

Thanks for the samples. I understand. Nice work btw! :)