bshall / UniversalVocoding

A PyTorch implementation of "Robust Universal Neural Vocoding"
https://bshall.github.io/UniversalVocoding/
MIT License

Result with other datasets #6

Open · tarepan opened this issue 5 years ago

tarepan commented 5 years ago

Summary

I'd like to share my results with the Universal Vocoder on another dataset.

Thanks for your great library and impressive result/demo.
Since you seem to be interested in other datasets (#2), I'll share my results. (If not, please feel free to ignore this!)

I forked this repository and trained the model on another dataset, JSUT (about 10 hours of utterances from a single female Japanese speaker).
Although the model was trained on a single female speaker, it works very well even on out-of-domain test speakers (other female speakers, a male speaker, and even an English speaker).
Here is the result/demo:
https://tarepan.github.io/UniversalVocoding
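For anyone wanting to try a similar experiment on their own dataset: the vocoder is conditioned on log-mel spectrograms, so the main preprocessing step is extracting those from the training audio. Below is a minimal NumPy sketch of that step; the parameter values (16 kHz sample rate, 1024-point FFT, hop of 200, 80 mel bands) are illustrative assumptions, not necessarily this repository's exact preprocessing config.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale from 0 to sr/2.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for k in range(l, c):          # rising slope
            if c > l:
                fb[i, k] = (k - l) / (c - l)
        for k in range(c, r):          # falling slope
            if r > c:
                fb[i, k] = (r - k) / (r - c)
    return fb

def log_mel(wav, sr=16000, n_fft=1024, hop=200, n_mels=80):
    # Frame the waveform, apply a Hann window, take the magnitude STFT.
    n_frames = 1 + (len(wav) - n_fft) // hop
    frames = np.stack([wav[i * hop : i * hop + n_fft] for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    # Project onto the mel filters and log-compress (floored to avoid log(0)).
    return np.log(np.maximum(spec @ mel_filterbank(sr, n_fft, n_mels).T, 1e-5))
```

The resulting `(n_frames, n_mels)` matrix is the kind of conditioning input a universal vocoder consumes; because it discards speaker-specific phase and fine spectral detail, it is plausible that the model generalises across speakers and languages, as observed above.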

My impression is that RNN_MS (the Universal Vocoder) learns the characteristics of human speech production (the mouth/vocal tract), which are independent of language. Very interesting.

I would be glad if my results are useful for your further experiments.
Again, thanks for your great library.

bshall commented 5 years ago

Hi @tarepan,

That's a great result, thanks for sharing.

It's really interesting that the out-of-domain English speaker is noticeably noisier than the out-of-domain Japanese speakers. I think it would be great to train a model on a dataset with multiple languages (as they do in the paper) and compare against that.