auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
https://arxiv.org/abs/1905.05879
MIT License

Follow-up work available for viewing #44

Open auspicious3000 opened 4 years ago

auspicious3000 commented 4 years ago

We have further improved AutoVC in 2 subsequent works.

The 1st work improves the audio quality by removing any pitch artifacts.

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder https://arxiv.org/abs/2004.07370

The 2nd work can convert rhythm, pitch, and/or timbre at the same time.

Unsupervised Speech Decomposition via Triple Information Bottleneck https://arxiv.org/abs/2004.11284

ibraheem-nt commented 4 years ago

That's very impressive!

Are there plans to release the models for these new works?

auspicious3000 commented 4 years ago

Most likely, but it will probably take a long time. All of these works are the intellectual property of companies, and most companies are very strict about releasing code.

ibraheem-nt commented 4 years ago

Understandable. Again, great work and a real milestone in the field!

auspicious3000 commented 4 years ago

Thanks! Nevertheless, these works can all be reproduced based on this repository.

Immortalin commented 4 years ago

The second work is of particular interest; adding emotions to synthesized speech is still rather hit-and-miss.

qq547276542 commented 4 years ago

Your work has been amazing!

auspicious3000 commented 4 years ago

@qq547276542 Thanks! More follow-up works will be released. Stay tuned.