quickvc / QuickVC-VoiceConversion

QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
MIT License

AMP usage #11

Closed tarepan closed 1 year ago

tarepan commented 1 year ago

Summary

QuickVC has an AMP config option.
When I enable it, an error is raised and training cannot run.
Did you use AMP for training?

Current Status

QuickVC has an AMP (automatic mixed-precision) training flag.
https://github.com/quickvc/QuickVC-VoiceConversion/blob/277118de9c81d1689e16be8a43408eda4223553d/configs/quickvc.json#L11

AMP can speed up training on some hardware, including the RTX 3090 that was used for the paper.

Our models are trained on a single NVIDIA 3090 GPU

When I tried AMP with "fp16_run": true in the config, the error below was raised.

RuntimeError: cuFFT only supports dimensions whose sizes are powers of two when computing in half precision, but got a signal size of[1280]
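
For reference, here is a minimal sketch of the constraint this error describes: in half precision, cuFFT accepts only signal sizes that are powers of two, and 1280 is not one. The helper names below are hypothetical (they are not part of QuickVC or PyTorch), and the value 1280 is simply the size reported in the message.

```python
from typing import Tuple


def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def neighbouring_powers_of_two(n: int) -> Tuple[int, int]:
    """Return (largest power of two <= n, smallest power of two >= n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    lower = 1 << (n.bit_length() - 1)
    upper = lower if lower == n else lower << 1
    return lower, upper


# Hypothetical check using the size reported in the error message.
signal_size = 1280
print(is_power_of_two(signal_size))             # False
print(neighbouring_powers_of_two(signal_size))  # (1024, 2048)
```

So a half-precision FFT would need a size such as 1024 or 2048 instead of 1280. Whether changing the size is acceptable depends on the model configuration.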

Question

If you didn't use AMP, I have an additional question about future modifications of QuickVC:

guoyingying432 commented 1 year ago

Sorry, I didn't use AMP training for the paper. And I think 'changing n_fft' doesn't have a big effect on the result.

tarepan commented 1 year ago

Thanks for the answer!

skol101 commented 1 year ago

@tarepan I've checked your fork to see how AMP quickens learning. Have you tried this https://github.com/ELS-RD/kernl to accelerate even more?

tarepan commented 1 year ago

@skol101 I have not tried kernl; it seems to be an accelerator for Transformers (?).
QuickVC mostly consists of convolutions, so I imagine the speedup would be small.