ShihaoZhaoZSH / LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
MIT License

train process problem #6

Open CS123n opened 6 months ago

CS123n commented 6 months ago

Hi, I used your code to train SD+T5 on my own. However, the results deteriorated rapidly after only 500 steps:

[image: validation_500_d70a7c28fc6425ca27f8 (1)]

Here's what the training loss looks like:

[image: Screenshot 2024-03-26 175456]

Do you have any advice? I tried changing the learning rate to 1e-5, but it didn't solve the problem.

ShihaoZhaoZSH commented 6 months ago

Thank you for your interest in our LaVi-Bridge! We haven't encountered such a situation in our experiment, and the released training and inference code has undergone thorough testing to ensure its correctness. We suggest checking the following points:

1. Adjust the learning rate appropriately.
2. Train using full precision.
3. Double-check the inference process to ensure the correct loading of LoRA and proper input of (un)conditional text embeddings into the adapter.
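
To make the third point concrete, here is a minimal pure-Python sketch of the embedding path at inference time. It assumes sampling uses standard classifier-free guidance, which the thread does not confirm. The names `adapt_both` and `classifier_free_guidance` and the toy adapter are hypothetical illustrations, not functions from the LaVi-Bridge code. The idea shown is that the conditional (prompt) and unconditional (empty prompt) embeddings should both pass through the same adapter before the two noise predictions are combined.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def adapt_both(
    cond_emb: Vector,
    uncond_emb: Vector,
    adapter: Callable[[Vector], Vector],
) -> Tuple[Vector, Vector]:
    """Pass BOTH text embeddings through the same adapter.

    Hypothetical helper: feeding only the conditional embedding through the
    adapter (and the raw unconditional one to the vision model) would mix
    two different embedding spaces at sampling time.
    """
    return adapter(cond_emb), adapter(uncond_emb)


def classifier_free_guidance(
    eps_uncond: Vector, eps_cond: Vector, scale: float
) -> Vector:
    """Standard classifier-free guidance combination:
    eps = eps_uncond + scale * (eps_cond - eps_uncond)
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("noise predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy stand-in for the trained adapter (assumption: the real one is a network).
def toy_adapter(emb: Vector) -> Vector:
    return [2.0 * x for x in emb]


cond, uncond = adapt_both([0.5, 1.5], [0.0, 0.5], toy_adapter)
guided = classifier_free_guidance(uncond, cond, scale=7.5)
```

With a guidance scale of 1.0 the result equals the conditional prediction, which gives a quick sanity check that the two branches are wired in the right order.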