resemble-ai / resemble-enhance

AI powered speech denoising and enhancement
https://huggingface.co/spaces/ResembleAI/resemble-enhance
MIT License
1.09k stars 103 forks

non english speech transformed to weird language #6

Open BahzBeih opened 6 months ago

BahzBeih commented 6 months ago

Peace. Non-English speech gets transformed into a weird-sounding language. I think it only works with English speech right now.

karen-pal commented 6 months ago

Experienced the same with Spanish audio. It sounds kind of German after denoising.

BahzBeih commented 6 months ago

I ran into the problem with Arabic, and I have the same problem with Adobe's online audio enhancement tool.

enhuiz commented 6 months ago

The current model is mainly trained on English datasets and may not work as well with other languages. We hope to expand its language support in the future, and contributions are always welcome.

peili commented 6 months ago

@enhuiz Are those English datasets available anywhere?

wolfgang-wp commented 6 months ago

@enhuiz I'd like to help by contributing models for other languages as well. Can you point to the datasets you used, as a reference?

anrice commented 6 months ago

Hello @enhuiz and @ZohaibAhmed,

I've been following the discussion on the challenges faced with non-English audio processing using the resemble-enhance tool. I attempted to train a model using German language samples. However, without adequate reference datasets or examples, the training process did not yield a reasonable model (.pt).

The model's performance with German language samples was suboptimal, leading to outcomes that were not practically usable. This experience aligns with what others have reported regarding Spanish and Arabic audio processing. It seems evident that the current model's training and optimization are heavily skewed towards English datasets.

I am keen on contributing to the enhancement of the tool for better performance with non-English languages, particularly German. Any guidance on accessing suitable datasets or reference models that have been effectively trained on non-English languages would be highly beneficial. The availability of such resources would greatly aid in developing more robust and language-inclusive models.

Thank you for your efforts in creating this tool, and I look forward to any possibility of collaboration or contribution towards its improvement in handling diverse languages.

xylphe commented 6 months ago

Hello! Same problem here with French. Are you familiar with Mozilla's Common Voice initiative? You could use it to train the model on other languages :)

4lvrz commented 4 months ago

Nice solution, it could do the trick! However, Common Voice is poorly supervised, and using degraded samples to train the enhancement stage might be a problem. Does anyone know whether high audio quality is essential for training the enhancement system?
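
On the concern about degraded samples: one option would be to screen clips before using them as clean training targets. The sketch below is only a rough heuristic, not anything the resemble-enhance authors are known to use. The function names and the thresholds (`min_snr_db`, `max_clipping`, `frame_len`) are hypothetical and would need tuning on real data. It assumes each clip is already decoded to a list of float samples in [-1, 1].

```python
import math


def frame_levels_db(samples, frame_len=400):
    """Return the RMS level in dB of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Floor the RMS so digital silence does not produce log10(0).
        levels.append(20.0 * math.log10(max(rms, 1e-10)))
    return levels


def crude_snr_db(samples, frame_len=400):
    """Rough SNR proxy: level of loud frames minus level of quiet frames.

    Assumes the quietest frames are mostly background noise and the
    loudest frames are mostly speech (a heuristic, not a true SNR).
    """
    levels = sorted(frame_levels_db(samples, frame_len))
    if len(levels) < 2:
        return 0.0
    quiet = levels[int(0.1 * (len(levels) - 1))]  # ~10th percentile
    loud = levels[int(0.9 * (len(levels) - 1))]   # ~90th percentile
    return loud - quiet


def clipping_ratio(samples, threshold=0.999):
    """Fraction of samples at or beyond full scale."""
    if not samples:
        return 0.0
    return sum(1 for x in samples if abs(x) >= threshold) / len(samples)


def keep_clip(samples, min_snr_db=20.0, max_clipping=0.001):
    """Hypothetical filter: keep clips that look clean and unclipped."""
    return (crude_snr_db(samples) >= min_snr_db
            and clipping_ratio(samples) <= max_clipping)
```

A filter like this only removes obviously noisy or clipped recordings. It says nothing about reverberation, codec artifacts, or bandwidth, so it does not settle whether the remaining clips are clean enough for the enhancement stage.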

skirdey commented 3 months ago

I really like this tool for denoising, but enhancement doesn't really work on most of my samples. I found that another open source project, https://github.com/ruizhecao96/CMGAN, does both enhancement and denoising better, and it also works very well on non-English languages.

rbozan commented 2 months ago

The demos in the repo sound great. But do you know if there's an easier way to use it? For example, a CLI tool where I can just input an MP3 and get an enhanced MP3 back?

kanjieater commented 1 week ago

I was hoping to use this for Japanese, but it seems like I'll need to hold out.