haoheliu / voicefixer

General Speech Restoration
https://haoheliu.github.io/demopage-voicefixer/
MIT License
1.02k stars, 132 forks

Lots of noise is added to the unspoken parts and overall quality is not worse - files provided #36

Open FurkanGozukara opened 1 year ago

FurkanGozukara commented 1 year ago

My audio is from my lecture video: https://www.youtube.com/watch?v=2zY1dQDGl3o

I want to improve the overall quality to make it easier to understand.

Here is my raw audio: https://drive.google.com/file/d/1gGxH1J3Z_I8NNjqBvbrVB5MA0gh4qCD7/view?usp=share_link

mode 0 output : https://drive.google.com/file/d/1MRFQecxx9Ikevnsyk9Ivx6Ofr_dqdwFi/view?usp=share_link

mode 1 output : https://drive.google.com/file/d/1sva-o7Py6beEIWbcA4f0LS1-ikGmvlUC/view?usp=share_link

mode 2 output : https://drive.google.com/file/d/1sva-o7Py6beEIWbcA4f0LS1-ikGmvlUC/view?usp=share_link

For example, go to 1:00:40 and you will hear noise.

Also, the improvement is not very good in parts of the video where I am not talking much.

Check the later parts of the sound files in particular, and you will hear that the result is actually worse in mode 1 and mode 2.

For example, check 1:02:40 in mode 1 for noise and bad sound quality.

For example, check 1:32:55 in mode 2 for bad quality and noise glitches.

I don't know, maybe you can test and experiment with my speech to improve the model even further.

Thank you very much, keep up the good work.

haoheliu commented 1 year ago

Thank you for your interest in our model. My apologies if it does not work as expected.

Generally, my suggestion is to use mode 0 only. At the same time, the open-sourced model is not good enough for production use and may run into many bad cases. I wish I could improve the model, but I'm currently working on a different research topic for my PhD. When I have a chance to work on this topic again, I'll try my best to release a better open-sourced version.
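For readers who want to follow the "mode 0 only" suggestion, here is a minimal sketch. It assumes the `voicefixer` package is installed and that `VoiceFixer().restore(input=..., output=..., cuda=..., mode=...)` behaves as shown in the project README. The helper names, the file names, and the assumption that only modes 0, 1 and 2 are valid are illustrative and not part of the library.

```python
def build_restore_kwargs(in_path, out_path, mode=0, cuda=False):
    """Validate arguments and return keyword arguments for VoiceFixer.restore.

    Hypothetical helper: the allowed modes (0, 1, 2) are an assumption based
    on the modes discussed in this thread.
    """
    if mode not in (0, 1, 2):
        raise ValueError(f"mode must be 0, 1 or 2, got {mode!r}")
    return {"input": in_path, "output": out_path, "cuda": cuda, "mode": mode}


def restore_mode0(in_path, out_path, cuda=False):
    """Restore one file with mode 0 (hypothetical wrapper, not part of voicefixer)."""
    # Imported lazily so build_restore_kwargs works without the package installed.
    from voicefixer import VoiceFixer

    fixer = VoiceFixer()  # loads the pretrained model
    fixer.restore(**build_restore_kwargs(in_path, out_path, mode=0, cuda=cuda))


# Example call with hypothetical file names:
# restore_mode0("lecture_raw.wav", "lecture_mode0.wav")
```

Wrapping the call this way makes it harder to run mode 1 or mode 2 by accident when batch-processing long recordings.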

cweaver-logitech commented 1 year ago

I'd like to add my own experiences using voicefixer and comparing it against other methods (such as DeepFilterNet2). voicefixer can do some brilliant work and is very impressive on tough problems. For example, it removed the non-stationary interference, both outside of speech and overlapping with it, with little effect on the quality of the speech.

Original image

voicefixer image

At other times it adds lots of stationary noise to a seemingly clean audio file.

I realise this is just a side project but, in broad terms, what would it take to improve the performance? Is it a question of a larger training dataset, or is it something more complex?

MohammedMehdiTBER commented 1 year ago

Upon further investigation, I found three issues: clipping at the tail, audible noise between silences, and distortion of background vocals. Descript's Studio Sound appears to be a suitable alternative, but note that it can also erase background vocals.
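The noise in silent passages reported in this thread can be located automatically rather than by ear. The sketch below is a rough diagnostic, not part of voicefixer. It assumes the raw and restored signals are already loaded as lists of floats in [-1, 1], time-aligned, and at the same sample rate (the restored file may need resampling first). The function names, frame length, and thresholds are arbitrary, hypothetical choices.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each full, non-overlapping frame."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def noisy_silence_frames(raw, restored, frame_len=1024, silence_rms=0.01, gain=2.0):
    """Return indices of frames that are silent in `raw` but loud in `restored`.

    A frame is flagged when its raw RMS is below `silence_rms` and its
    restored RMS exceeds `gain * silence_rms`.
    """
    raw_levels = frame_rms(raw, frame_len)
    restored_levels = frame_rms(restored, frame_len)
    return [
        idx
        for idx, (r, o) in enumerate(zip(raw_levels, restored_levels))
        if r < silence_rms and o > gain * silence_rms
    ]
```

A flagged frame index converts to a time offset in seconds as `idx * frame_len / sample_rate`, which makes it easy to list timestamps like the ones quoted in the original report.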