SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
https://arxiv.org/abs/2410.06885
MIT License
7.48k stars, 924 forks

Why is Gradient Checkpointing Not Implemented in Training? #399

Open kostum123 opened 3 weeks ago

kostum123 commented 3 weeks ago

Question details

It appears that gradient checkpointing is not implemented in the current training pipeline. Gradient checkpointing can significantly reduce memory usage by recomputing intermediate activations during the backward pass instead of storing them, trading extra computation for lower memory. That makes it valuable for large models and resource-limited environments. This raises the question:

Is there a specific reason for not implementing gradient checkpointing? If possible, could it be integrated in future updates, or are there known limitations that prevent its integration? If there is no compatibility issue, I would be open to exploring the possibility of adding it via a PR.

ZhikangNiu commented 3 weeks ago

Yeah, I think you can explore gradient checkpointing in F5 and add it via a PR.
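
For anyone picking this up, here is a minimal sketch of what such a change could look like in PyTorch, using `torch.utils.checkpoint.checkpoint`. It is not taken from the F5-TTS codebase: `Block`, `Backbone`, and the `use_checkpoint` flag are hypothetical stand-ins for the model's transformer blocks and a possible config option, and the real integration point may differ.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical stand-in for one transformer block (not F5-TTS code)."""

    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.ff(self.norm(x))


class Backbone(nn.Module):
    """Stack of blocks with an assumed `use_checkpoint` toggle."""

    def __init__(self, dim: int = 64, depth: int = 4, use_checkpoint: bool = False):
        super().__init__()
        self.blocks = nn.ModuleList([Block(dim) for _ in range(depth)])
        self.use_checkpoint = use_checkpoint

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            if self.use_checkpoint and self.training:
                # Do not keep this block's intermediate activations;
                # recompute them when backward reaches this block.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


def grads(use_checkpoint: bool) -> list:
    """Run one forward/backward pass and return all parameter gradients."""
    torch.manual_seed(0)  # same weights and input for both settings
    model = Backbone(use_checkpoint=use_checkpoint).train()
    x = torch.randn(2, 10, 64)
    model(x).sum().backward()
    return [p.grad.clone() for p in model.parameters()]
```

Calling `grads(False)` and `grads(True)` should give matching gradients, since checkpointing changes when activations are computed, not the result. Checkpointing per block is only one possible granularity; coarser or finer segments trade memory against recomputation differently.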