seastar105 / pflow-encodec

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

how to finetune from pretrained model #3

Closed eschmidbauer closed 4 months ago

eschmidbauer commented 4 months ago

Thank you for sharing this project! Inference sounds great with multilingual_base_bs100x4.ckpt. I am able to start training on a new dataset, but I'm wondering if there is a way to fine-tune the pretrained model that has been released.

eschmidbauer commented 4 months ago

I did try to convert the model with this script, but I get the following error:

  File "venv/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/checkpoint_connector.py", line 364, in restore_optimizers_and_schedulers
    raise KeyError(
KeyError: 'Trying to restore optimizer state but checkpoint contains only the model. This is probably due to `ModelCheckpoint.save_weights_only` being set to `True`.'

seastar105 commented 4 months ago

The released checkpoint is weights-only (it has no optimizer state), so you can initialize the model weights from it using this field:

https://github.com/seastar105/pflow-encodec/blob/ddf74cec7ef29b14ccc9e77f4f2611c21aedaba7/configs/experiment/multilingual_base.yaml#L30

Set this field to the released checkpoint and it should work. Also check the mean and std values carefully.
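
For anyone hitting the same KeyError, here is a minimal sketch of how to tell a weights-only checkpoint from a full training checkpoint before trying to resume from it. It assumes the usual PyTorch Lightning checkpoint layout, with model weights under `state_dict` and optimizer state under `optimizer_states`. The helper names `is_weights_only` and `missing_training_state` are made up for this example, and the two dictionaries are toy stand-ins rather than real checkpoints.

```python
from typing import Any, List, Mapping

# Keys that a full Lightning training checkpoint is assumed to carry in
# addition to the model weights. "optimizer_states" is the one whose absence
# triggers the KeyError shown earlier in this thread.
TRAINING_STATE_KEYS = ("optimizer_states", "lr_schedulers")


def is_weights_only(checkpoint: Mapping[str, Any]) -> bool:
    """Return True if the checkpoint has model weights but no optimizer state."""
    has_weights = "state_dict" in checkpoint
    has_optimizer = "optimizer_states" in checkpoint
    return has_weights and not has_optimizer


def missing_training_state(checkpoint: Mapping[str, Any]) -> List[str]:
    """List the training-state keys that are absent from the checkpoint."""
    return [key for key in TRAINING_STATE_KEYS if key not in checkpoint]


# Toy stand-ins for loaded checkpoints (hypothetical contents).
weights_only_ckpt = {"state_dict": {"layer.weight": [0.1, 0.2]}}
full_ckpt = {
    "state_dict": {"layer.weight": [0.1, 0.2]},
    "optimizer_states": [{}],
    "lr_schedulers": [],
}

print(is_weights_only(weights_only_ckpt))
print(is_weights_only(full_ckpt))
print(missing_training_state(weights_only_ckpt))
```

With a real file you would pass in the dictionary returned by `torch.load(path, map_location="cpu")`. If `is_weights_only` returns True, resuming the trainer from that file will fail as above, and the weights have to be loaded through the config field instead.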

eschmidbauer commented 4 months ago

thanks, that worked!