bshall / knn-vc

Voice Conversion With Just Nearest Neighbors
https://bshall.github.io/knn-vc/

WavLM Base+ over Large? #10

Pathos0925 closed 1 year ago

Pathos0925 commented 1 year ago

First, thanks for the paper and the code; this is very interesting! Did you happen to test other versions of WavLM, such as Base or Base+? I was wondering whether the model could be made lighter without hurting quality too much.

bshall commented 1 year ago

Hi @Pathos0925, thanks for the feedback!

Initially we used HuBERT-Base as the feature extractor. It also works well: maybe slightly worse than WavLM, but not by much. I'm not sure whether that's due to model size or to the training procedure, though. We haven't experimented with WavLM-Base or Base+, but I'm confident that they'll work as well. If you do experiment with a smaller model, we'd be interested, so please share your results!
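
For anyone trying a smaller extractor, here is a rough, dependency-free sketch of a nearest-neighbour matching step like the one the project title refers to. It is not the repository's code: the function names `cosine_similarity` and `knn_match`, the default `k=4`, and the use of cosine similarity are all assumptions chosen for illustration. The point it tries to make is that the matching step itself does not care about the feature size, as long as the query and reference frames come from the same extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_match(query_feats, ref_feats, k=4):
    """Replace each query frame with the mean of its k most similar reference frames.

    query_feats, ref_feats: lists of frames, each frame a list of floats.
    k=4 is an illustrative default (an assumption, not taken from the thread).
    """
    if not query_feats or not ref_feats:
        raise ValueError("query_feats and ref_feats must be non-empty")
    dim = len(ref_feats[0])
    if any(len(frame) != dim for frame in query_feats + ref_feats):
        # Features from different extractors (different sizes) cannot be mixed.
        raise ValueError("all frames must share the same feature dimension")

    k = min(k, len(ref_feats))
    converted = []
    for q in query_feats:
        # Rank reference frames from most to least similar to this query frame.
        ranked = sorted(ref_feats, key=lambda r: cosine_similarity(q, r), reverse=True)
        nearest = ranked[:k]
        # Average the k nearest frames, one feature dimension at a time.
        converted.append([sum(column) / k for column in zip(*nearest)])
    return converted


if __name__ == "__main__":
    # Toy 2-dimensional "features"; real extractor outputs are much larger.
    source = [[1.0, 0.0]]
    reference = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]
    print(knn_match(source, reference, k=2))
```

Under these assumptions, swapping the extractor only changes the length of each frame passed in; whether conversion quality holds up with a smaller model is exactly the open question in this thread and would have to be checked empirically.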