Thanks for your great work! During training, I found that all parameters are trainable and cast to fp32 precision, since the ChameleonXLLMXForConditionalGeneration class doesn't define a get_trainable_params method. Do all parameters need to be trained during the 3-stage FP-SFT, and is fp32 precision necessary for all of them? The relevant code can be found at https://github.com/Alpha-VLLM/Lumina-mGPT/blob/104abe453ec1acca5863698629c4db2111b0b3fc/xllmx/solvers/finetune/finetune.py#L286-L294
For a parameter, as long as it is trainable (requires_grad=True), an fp32 copy is necessary because parameter updates have to be performed in full precision. If the parameter is frozen, we can simply keep its 16-bit version.
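As a minimal sketch of that rule (not the repo's actual code), the dtype could be chosen per parameter based on requires_grad; the two-layer model and the choice of which layer to freeze are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8))

# Hypothetical split: freeze the first layer, train the second.
for p in model[0].parameters():
    p.requires_grad = False

for name, param in model.named_parameters():
    if param.requires_grad:
        # Trainable parameters keep an fp32 master copy so optimizer
        # updates happen in full precision.
        param.data = param.data.to(torch.float32)
    else:
        # Frozen parameters are never updated, so a 16-bit copy is enough.
        param.data = param.data.to(torch.bfloat16)

for name, param in model.named_parameters():
    print(name, param.dtype, param.requires_grad)
```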
In our experiments we keep all parameters trainable throughout the SFT process, but you may try different settings by adding a "get_trainable_params" method.
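If you want to experiment with partial freezing, below is a self-contained toy sketch of the idea. The method name get_trainable_params comes from this thread; how the finetune script actually consumes its return value is an assumption here, so check the linked finetune.py for the expected format before adapting it.

```python
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.lm_head = nn.Linear(16, 32)

    def get_trainable_params(self):
        # Example policy: only tune the head; everything else stays frozen.
        return {name for name, _ in self.named_parameters() if name.startswith("lm_head")}

model = ToyModel()

# If the model defines get_trainable_params, freeze everything not listed;
# otherwise fall back to training all parameters (the default described above).
trainable = model.get_trainable_params() if hasattr(model, "get_trainable_params") else None
for name, param in model.named_parameters():
    param.requires_grad = trainable is None or name in trainable

print([n for n, p in model.named_parameters() if p.requires_grad])
# -> ['lm_head.weight', 'lm_head.bias']
```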