Vision-CAIR / MiniGPT4-video

Official code for MiniGPT4-video
https://vision-cair.github.io/MiniGPT4-video/
BSD 3-Clause "New" or "Revised" License

How to conduct full training and switch to 13B for training? #6

Open Mryangkaitong opened 2 months ago

Mryangkaitong commented 2 months ago

Excellent work!!! If I want to conduct full-parameter training (non-LoRA) on Llama 2 13B, where should I modify the code in stages 1-3 to achieve the following two things:

(1) Change the base model to 13B

(2) Full-parameter training

Thanks

KerolosAtef commented 2 months ago

For Llama 2 13B: change the llama model path in the training config file for each stage; a hedged sketch of that change follows.
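As a rough illustration (the file path and key names here are assumptions based on typical MiniGPT-4-style stage configs; check the actual config files in this repo), the change would look something like:

```yaml
# train_configs/<stage_config>.yaml  (hypothetical path)
model:
  arch: mini_gpt4_llama_v2
  # point this at the 13B weights instead of the 7B ones
  llama_model: "meta-llama/Llama-2-13b-chat-hf"
```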

To turn off LoRA and use full-parameter training, comment out the LoRA setup in minigpt4/models/mini_gpt4_llama_v2.py:

```python
loraconfig = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,
    lora_dropout=lora_dropout,
    bias="none",
    task_type="CAUSAL_LM",
)
self.llama_model = get_peft_model(self.llama_model, loraconfig)
self.llama_model.print_trainable_parameters()
```
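For reference, a minimal sketch of what the non-LoRA path might look like once that block is commented out. This assumes the same `self.llama_model` context as the snippet above and is not the repo's exact code:

```python
# LoRA setup from above, commented out for full-parameter training:
# loraconfig = LoraConfig(r=lora_r, lora_alpha=lora_alpha, ...)
# self.llama_model = get_peft_model(self.llama_model, loraconfig)

# With no PEFT wrapper, unfreeze everything explicitly (surrounding code
# may freeze the LLM elsewhere, so this is a defensive step).
for param in self.llama_model.parameters():
    param.requires_grad = True

# Rough count of what full-parameter training now optimizes (~13B params).
trainable = sum(p.numel() for p in self.llama_model.parameters() if p.requires_grad)
print(f"trainable params: {trainable:,}")
```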
To save CUDA memory while training on an A100 (maximum batch_size=4), the model is prepared for int8 training:

```python
self.llama_model = prepare_model_for_int8_training(self.llama_model)
```

You can also comment out this line if needed, but take care about the CUDA memory: full-precision training is not working for me even with batch_size=1.
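If you do comment out that line to train in full precision, one common memory mitigation is gradient checkpointing on the Hugging Face model. This is an assumption about what may help, not something the repo prescribes:

```python
# Full-precision path: skip the int8 preparation entirely.
# self.llama_model = prepare_model_for_int8_training(self.llama_model)

# Assumption (not from this repo): trade compute for memory by
# enabling gradient checkpointing on the underlying HF model.
self.llama_model.gradient_checkpointing_enable()
self.llama_model.config.use_cache = False  # KV cache is incompatible with checkpointing
```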