Questions about base models' training time, loss, etc.

ruby11dog commented 2 weeks ago

Checks

[X] This template is only for question, not feature requests or bug reports.
[X] I have thoroughly reviewed the project documentation and read the related paper(s).
[X] I have searched for existing issues, including closed ones, no similar questions.
[X] I confirm that I am using English to submit this report in order to facilitate communication.

Question details

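Hello, great work! I made some changes to the model architecture of the base model and am retraining it from scratch on the Emilia dataset. At roughly what loss value does the model start to produce normal speech? When you trained the open-source base model, how many machines and how much time did it take to reach normal speech? And how long did it take to converge?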

SWivid commented 2 weeks ago

Hi @ruby11dog, please use English for this report to facilitate communication (see check 4).

The loss value is not a meaningful indicator of how training is going, because the predicted and ground-truth boundaries are mismatched. See #9.

We have posted all results and details of training and evaluation in our paper. The base model took over one week on 8×A100 80G to reach 1.2M updates; you can hear some aligned (say, intelligible) speech after 200k~400k updates.
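To put the figures above on a wall-clock scale, here is a rough back-of-the-envelope sketch. It assumes constant training throughput and treats "over one week" as exactly 7 days, so the results are lower bounds and not measured values; the function name `days_to_reach` is purely illustrative.

```python
# Rough estimate only. Assumptions (not from the paper):
#   - throughput is constant over the whole run
#   - "over one week" is taken as exactly 7 days, so results are lower bounds
TOTAL_UPDATES = 1_200_000   # updates reached by the base model
TOTAL_DAYS = 7.0            # assumed duration of the full run
NUM_GPUS = 8                # 8 x A100 80G


def days_to_reach(updates, total_updates=TOTAL_UPDATES, total_days=TOTAL_DAYS):
    """Scale the full-run duration linearly to a given update count."""
    return updates / total_updates * total_days


low = days_to_reach(200_000)    # earliest point where speech becomes intelligible
high = days_to_reach(400_000)   # latest point in the quoted range

print(f"Intelligible speech after roughly {low:.1f} to {high:.1f} days on {NUM_GPUS} GPUs")
print(f"That is about {low * NUM_GPUS:.0f} to {high * NUM_GPUS:.0f} GPU-days")
```

With different hardware, batch size, or a modified architecture, the update counts are likely a better reference point than the day estimates.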

SWivid commented 2 weeks ago

using English to submit this report in order to facilitate communication.

SWivid / F5-TTS

Questions about base models' training time, loss, etc. #406