SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
https://arxiv.org/abs/2410.06885
MIT License

Converting checkpoints #507

Closed: AlpacaManAlpha closed this 1 day ago

AlpacaManAlpha commented 4 days ago

Question details

  1. When I reduce my checkpoints, will f5 use these for further training? Does it need the .pt, or will .safetensors also do? Since I plan to train an assortment of speakers, storage will be an issue for me eventually.
  2. Is there a difference between model_last and the latest checkpoint? They are not created at the same time; the timestamps are minutes apart.
ZhikangNiu commented 4 days ago

For Q2, you can load the checkpoint and check the keys, especially train_step
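
For illustration, a minimal sketch of what that inspection looks like; the checkpoint path and the exact step-counter key name ("step", "update", or "train_step") are assumptions, so check against your own files:

```python
# Hedged sketch: load a checkpoint on CPU and list its top-level keys.
# Path and key names are assumptions; adjust to your own run.
import torch

ckpt = torch.load("ckpts/my_run/model_last.pt", map_location="cpu")
print(list(ckpt.keys()))  # e.g. model/EMA state dicts, optimizer state, step counter

for key in ("step", "update", "train_step"):
    if key in ckpt:
        print(key, "=", ckpt[key])
```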

SWivid commented 4 days ago

When I reduce my checkpoints, will f5 use these for further training?

Yes, but it will start from step 0, with all optimizer and scheduler states reset.
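
As an illustration of why, here is a rough sketch of what "reducing" a checkpoint amounts to: only the weights are kept, and the optimizer/scheduler state plus the step counter are dropped. The key names below are assumptions and may differ from the actual F5-TTS checkpoint layout.

```python
# Hedged sketch: strip a checkpoint down to model weights only.
# Key names ("model_state_dict", "ema_model_state_dict") are assumptions.
import torch

src = torch.load("ckpts/my_run/model_200000.pt", map_location="cpu")
reduced = {k: v for k, v in src.items() if k in ("model_state_dict", "ema_model_state_dict")}
torch.save(reduced, "ckpts/my_run/model_200000_reduced.pt")
# Without optimizer/scheduler state and the step counter, resumed training
# necessarily restarts from step 0.
```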

Does it need the .pt or will .safetensors also do?

just .pt is fine
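
If you do end up with weights stored as .safetensors, converting back to a .pt is straightforward; a sketch, assuming the trainer expects the weights wrapped under a state-dict key (the "ema_model_state_dict" name is an assumption):

```python
# Hedged sketch: convert a .safetensors weight file back into a .pt.
# The "ema_model_state_dict" wrapper key is an assumption.
import torch
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")  # flat {tensor_name: tensor} mapping
torch.save({"ema_model_state_dict": state_dict}, "model_converted.pt")
```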

Since I plan to train an assortment of speakers, storage will be an issue for me eventually.

Are you using a mixed dataset, or training separate models? Setting a larger saving interval is fine.
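
If disk space still becomes tight after raising the saving interval, pruning old numbered checkpoints by hand is another option. A hedged housekeeping sketch, assuming checkpoints follow the model_<step>.pt naming; the keep-every-4th policy is arbitrary:

```python
# Hedged sketch: delete intermediate numbered checkpoints, keeping every 4th
# plus the newest one (model_last.pt is never matched and stays untouched).
import re
from pathlib import Path

ckpt_dir = Path("ckpts/my_run")  # assumed layout
numbered = sorted(
    (p for p in ckpt_dir.glob("model_*.pt") if re.fullmatch(r"model_\d+\.pt", p.name)),
    key=lambda p: int(p.stem.split("_")[1]),
)
for i, p in enumerate(numbered[:-1]):  # always keep the newest numbered checkpoint
    if i % 4 != 0:
        p.unlink()
        print("deleted", p.name)
```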

Is there a difference between model_last and the latest checkpoint?

model_last is the latest, while a checkpoint like model_200000.pt may not be. It is quite straightforward: model_last is literally the last checkpoint saved, and model_200000.pt is literally the checkpoint at step 200k.

ZhikangNiu commented 1 day ago

Will close this issue, feel free to ask questions.