resemble-ai / Resemblyzer

A python package to analyze and compare voices with deep learning
Apache License 2.0
2.75k stars 424 forks

Fully trainable pytorch model #67

JuanFMontesinos closed this 2 years ago

JuanFMontesinos commented 2 years ago

This PR rewrites the model in resemblyzer/voice_encoder.py to depend only on pytorch, which makes it trainable. I don't understand why your model uses numpy, which breaks backpropagation. The rewrite computes the mel spectrogram with pytorch, so the model can be used end-to-end from a raw waveform.
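For readers who want to see what a mel spectrogram computed with pytorch might look like, here is a minimal sketch. It is not the code from this PR: the function names, the HTK-style mel formula, and the parameter values (16 kHz audio, 25 ms window, 10 ms hop, 40 mel channels) are assumptions chosen for illustration, and the output is not guaranteed to match the numpy/librosa-based preprocessing numerically.

```python
import math
import torch


def mel_filterbank(n_fft, n_mels, sample_rate):
    """Triangular mel filters (HTK formula), shape (n_mels, n_fft // 2 + 1)."""
    max_mel = 2595.0 * math.log10(1.0 + (sample_rate / 2) / 700.0)
    mel_pts = torch.linspace(0.0, max_mel, n_mels + 2)
    hz_pts = 700.0 * (torch.pow(10.0, mel_pts / 2595.0) - 1.0)
    fft_freqs = torch.linspace(0.0, sample_rate / 2, n_fft // 2 + 1)
    # Rising and falling slopes of each triangle, clipped at zero.
    lower = (fft_freqs[None, :] - hz_pts[:-2, None]) / (hz_pts[1:-1] - hz_pts[:-2])[:, None]
    upper = (hz_pts[2:, None] - fft_freqs[None, :]) / (hz_pts[2:] - hz_pts[1:-1])[:, None]
    return torch.clamp(torch.minimum(lower, upper), min=0.0)


def mel_spectrogram(wav, sample_rate=16000, n_fft=400, hop=160, n_mels=40):
    """wav: (batch, samples) float tensor -> (batch, frames, n_mels)."""
    window = torch.hann_window(n_fft, dtype=wav.dtype, device=wav.device)
    spec = torch.stft(wav, n_fft=n_fft, hop_length=hop, window=window,
                      return_complex=True)
    # Power spectrum via real/imag parts keeps the gradient well defined at 0.
    power = spec.real ** 2 + spec.imag ** 2          # (batch, freq, frames)
    fb = mel_filterbank(n_fft, n_mels, sample_rate).to(wav)
    return (fb @ power).transpose(1, 2)              # (batch, frames, n_mels)


# Every op above is a torch op, so gradients reach the waveform.
wav = torch.randn(2, 16000, requires_grad=True)
mel = mel_spectrogram(wav)
mel.sum().backward()
```

After the backward call, `wav.grad` holds a gradient for every input sample, which is what a numpy-based preprocessing step cannot provide.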

Changes:

CorentinJ commented 2 years ago

The code for training this model lies in https://github.com/CorentinJ/Real-Time-Voice-Cloning. This repo is for inference only.

Only the forward function is written in pytorch. The only use for autograd here is to backpropagate a loss through forward().
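To make that distinction concrete, here is a small sketch of what autograd can and cannot reach when the features are prepared in numpy. `TinyEncoder` is a hypothetical, much smaller stand-in for the real encoder, and the loss is a toy one chosen only so that there is something to backpropagate.

```python
import numpy as np
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in: LSTM -> linear -> ReLU -> L2-normalized embedding."""

    def __init__(self, n_mels=40, hidden=32, embed=16):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, batch_first=True)
        self.linear = nn.Linear(hidden, embed)

    def forward(self, mels):
        _, (hidden, _) = self.lstm(mels)
        embeds = torch.relu(self.linear(hidden[-1]))
        return embeds / (embeds.norm(dim=1, keepdim=True) + 1e-8)


torch.manual_seed(0)
model = TinyEncoder()

# Features computed outside autograd, as with numpy-based preprocessing.
mels_np = np.random.rand(4, 50, 40).astype(np.float32)
mels = torch.from_numpy(mels_np)

embeds = model(mels)
loss = 1.0 - embeds[0] @ embeds[1]   # toy loss: pull two embeddings together
loss.backward()

# Parameters of forward() receive gradients; the numpy-derived input does not.
params_have_grad = all(p.grad is not None for p in model.parameters())
input_has_grad = mels.grad is not None
```

Under these assumptions the model weights are trainable from precomputed features, while nothing upstream of the feature tensor (such as the raw waveform) receives a gradient.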

JuanFMontesinos commented 2 years ago

Thanks, I thought the repo was abandoned. Anyway, keeping a backpropagable version could be useful for other applications.

Regards