Open ghost opened 7 years ago
Hello @dankorg ,
yes, you can modify the data-generation code (make_tfrecords.py, where the TFRecords are created), because it contains a check that rejects any sampling rate other than 16kHz. Once your dataset is converted to TFRecords you can train the model, but the canvas size will remain 2 ** 14 samples (about 1 second at 16kHz), so you won't hear much speech in each chunk to judge whether training is going well or badly (during GAN training we extract samples to check how G is doing)... but it could be interesting to check the results. Note that there is a pre-emphasis stage in the v1.1 model that got rid of high-frequency artifacts at 16kHz, but it might have to be retuned for 44.1kHz.
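To make the two points above concrete, here is a minimal sketch in plain Python. It shows how much audio a 2 ** 14 sample canvas covers at each sampling rate, and what a first-order pre-emphasis filter looks like. The function names, the filter form y[n] = x[n] - coeff * x[n-1], and the coefficient 0.95 are assumptions for illustration (a common textbook choice), not values taken from this thread or confirmed against the segan code.

```python
CANVAS_SIZE = 2 ** 14  # samples per training chunk, as stated in the comment above


def canvas_duration(sample_rate_hz, canvas_size=CANVAS_SIZE):
    """Return the length in seconds of one canvas at the given sampling rate."""
    return canvas_size / sample_rate_hz


def pre_emphasis(signal, coeff=0.95):
    """Apply y[n] = x[n] - coeff * x[n-1] (coeff=0.95 is an assumed default)."""
    if not signal:
        return []
    out = [signal[0]]  # the first sample has no predecessor, so it passes through
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


def de_emphasis(signal, coeff=0.95):
    """Invert pre_emphasis: x[n] = y[n] + coeff * x[n-1]."""
    out = []
    prev = 0.0
    for y in signal:
        prev = y + coeff * prev
        out.append(prev)
    return out


if __name__ == "__main__":
    for rate in (16000, 44100):
        print(f"{rate} Hz: {canvas_duration(rate):.3f} s per canvas")
```

At 16kHz one canvas holds about one second of audio, while at 44.1kHz the same canvas holds well under half a second, which is why the extracted samples would be hard to judge by ear. A filter with a fixed coefficient also has a different frequency response relative to the Nyquist limit when the sampling rate changes, so the coefficient would likely need to be chosen again by listening.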
OK, I will try it soon and let you know if it works. Thanks!
Hi,
I am just curious what kind of high-frequency (HF) noise you ran into that prompted you to use the pre-emphasis filter.
"Note that there is a pre-emphasis stage in the v1.1 model that got rid of high freq artifacts at 16kHz, but might have to get tuned for 44.1kHz."
Would be great if you could share some samples of audio with the HF artefacts.
Regards
Hi @naba89 ,
here you can hear samples from v1.0 (with HF artifacts) and v1.1 (the latest version; the current weights and sample links in the README point to v1.1 of segan). v1.1: https://drive.google.com/open?id=0B6xY-R8JAa8rMHZoQ3Y4dTJIUlk v1.0: https://drive.google.com/open?id=0B6xY-R8JAa8rYlVwbksyOVg4RTQ
Regards :) Santi
Btw, it's more noticeable with headphones ofc
Thanks for the samples. I understand. Nice work btw! :)
Hi, I am very impressed with the results. However, for my applications, I am interested in 44.1kHz sampling rate speech enhancement. Is there a way to modify the code to train and test on 44.1kHz wav files? That would be awesome if possible. Thanks a lot for your hard work on this project!