Hi,
I trained RAVE on 60 hours of audio effect data and tried the default v2, v3, and discrete configurations. However, the VAE output is too smooth, and the reconstruction of the input audio is poor. Are there any parameters I should tune?
The discrete model training loss:
The v2 model training loss:
Here is the original audio:
The discrete VAE output audio, using the ONNX export:
The v2 VAE output audio, using the ONNX export:
Do you find that the discrete VAE is more accurate? Also, how did you get the ONNX export working? My ONNX models just produce thin, crackly noise, while the libtorch ones work fine.
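One way to put a number on the ONNX vs. libtorch mismatch is to render the same input through both exports and compare the two waveforms directly. The following is only a rough sketch using NumPy: `compare_waveforms` is a hypothetical helper, not part of RAVE, and the arrays in the demo are synthetic stand-ins for the two rendered outputs. If the model has any random component, the outputs will not match exactly even when the export is fine, so treat the numbers as a relative indicator.

```python
import numpy as np


def compare_waveforms(reference, candidate):
    """Return (max_abs_diff, snr_db) between two mono waveforms.

    Hypothetical helper: `reference` could be the libtorch output and
    `candidate` the ONNX output for the same input audio.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)

    # The two exports may pad differently, so compare the common prefix only.
    n = min(len(reference), len(candidate))
    reference, candidate = reference[:n], candidate[:n]

    error = reference - candidate
    max_abs_diff = float(np.max(np.abs(error)))

    signal_power = float(np.mean(reference ** 2))
    noise_power = float(np.mean(error ** 2))
    if noise_power == 0.0:
        snr_db = float("inf")  # the two waveforms are identical
    else:
        snr_db = float(10.0 * np.log10(signal_power / noise_power))
    return max_abs_diff, snr_db


if __name__ == "__main__":
    # Synthetic stand-ins for the two rendered outputs (assumption: 48 kHz mono).
    t = np.linspace(0.0, 1.0, 48000, endpoint=False)
    libtorch_out = np.sin(2 * np.pi * 440.0 * t)
    rng = np.random.default_rng(0)
    onnx_out = libtorch_out + 0.01 * rng.standard_normal(t.shape)

    max_diff, snr = compare_waveforms(libtorch_out, onnx_out)
    print(f"max abs diff: {max_diff:.4f}, SNR: {snr:.1f} dB")
```

A very low SNR between the two exports would point at the export path rather than at training, whereas similar outputs that are both too smooth would point back at the model configuration.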