k2kobayashi / crank

A toolkit for non-parallel voice conversion based on a vector-quantized variational autoencoder
MIT License

Objective evaluations #10

Closed. unilight closed this issue 4 years ago

unilight commented 4 years ago

Metrics

unilight commented 4 years ago

Some notes:

  1. I gave up on ASR evaluation with ESPnet, since ESPnet does not support modular use (i.e., I cannot simply import espnet, etc.).
  2. I used this wrapper of MOSnet: https://github.com/aliutkus/speechmetrics. Its installation is also not modularized, so we have to install it with pip install git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[gpu], which is hard to put inside requirements.txt. So I added an extra line to the Makefile (see the Makefile sketch after this list).
  3. I modified trainer_vqvae._save_decoded_world so that the converted WORLD features are also stored. This way, we can load them for the MCD calculation (see the MCD sketch after this list).
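
For the record, that extra Makefile line could look like the sketch below; the target name is an assumption, and only the pip command itself comes from the note above:

```makefile
# hypothetical target name; only the pip command is taken from the note above
speechmetrics:
	pip install "git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[gpu]"
```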
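
And a minimal sketch of the MCD computation over the stored converted/target mel-cepstra. The function name, the exclusion of the 0th (power) coefficient, and the assumption that the two feature sequences are already time-aligned (e.g., by DTW) are mine, not necessarily what the evaluation script does:

```python
import numpy as np

def mel_cepstral_distortion(mcep_conv, mcep_tgt):
    """Frame-averaged mel-cepstral distortion in dB.

    Both inputs are (n_frames, n_dims) arrays of mel-cepstral
    coefficients; the 0th (power) coefficient is excluded and the
    frames are assumed to be already time-aligned.
    """
    diff = mcep_conv[:, 1:] - mcep_tgt[:, 1:]
    dist_per_frame = (10.0 / np.log(10)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(dist_per_frame))
```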
unilight commented 4 years ago

Do you mean this: https://github.com/psf/black? No, I have never used it. Should I apply it before I push?

k2kobayashi commented 4 years ago

Yes. Before I released crank, I applied black to all Python scripts. It would be better to add CI that checks formatting with black automatically (see the sketch below). In the meantime, could you apply it yourself?
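
A minimal sketch of such a CI check, assuming GitHub Actions; the file path, workflow name, and action versions are assumptions and not part of crank:

```yaml
# .github/workflows/black.yml (hypothetical path and workflow name)
name: black
on: [push, pull_request]
jobs:
  format-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - run: pip install black
      # fails the job if any file would be reformatted
      - run: black --check .
```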

unilight commented 4 years ago

I addressed your comments. Please review again.