Vision-CAIR / MiniGPT4-video

Official code for MiniGPT4-video
https://vision-cair.github.io/MiniGPT4-video/
BSD 3-Clause "New" or "Revised" License

How to conduct full training and switch to 13B for training? #6

Open Mryangkaitong opened 2 months ago

Mryangkaitong commented 2 months ago

Excellent work!!! If I want to conduct full-parameter training (non-LoRA) on Llama 2 13B, where should I modify the code in stages 1-3 to achieve the following two things:

(1) Change the base model to 13B

(2) Full-parameter training

Thanks

KerolosAtef commented 2 months ago

For Llama 2 13B: change the llama model path in the training config file for each stage; a hedged sketch of that change follows.
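As a rough illustration (the file path and key names here are assumptions based on typical MiniGPT-4-style stage configs; check the actual config files in this repo), the change would look something like:

```yaml
# train_configs/<stage_config>.yaml  (hypothetical path)
model:
  arch: mini_gpt4_llama_v2
  # point this at the 13B weights instead of the 7B ones
  llama_model: "meta-llama/Llama-2-13b-chat-hf"
```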

To turn off LoRA and use full-parameter training, comment out the LoRA setup in minigpt4/models/mini_gpt4_llama_v2.py:

```python
loraconfig = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,
    lora_dropout=lora_dropout,
    bias="none",
    task_type="CAUSAL_LM",
)
self.llama_model = get_peft_model(self.llama_model, loraconfig)
self.llama_model.print_trainable_parameters()
```
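For reference, a minimal sketch of what the non-LoRA path might look like once that block is commented out. This assumes the same `self.llama_model` context as the snippet above and is not the repo's exact code:

```python
# LoRA setup from above, commented out for full-parameter training:
# loraconfig = LoraConfig(r=lora_r, lora_alpha=lora_alpha, ...)
# self.llama_model = get_peft_model(self.llama_model, loraconfig)

# With no PEFT wrapper, unfreeze everything explicitly (surrounding code
# may freeze the LLM elsewhere, so this is a defensive step).
for param in self.llama_model.parameters():
    param.requires_grad = True

# Rough count of what full-parameter training now optimizes (~13B params).
trainable = sum(p.numel() for p in self.llama_model.parameters() if p.requires_grad)
print(f"trainable params: {trainable:,}")
```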
To save CUDA memory while training on an A100 (maximum batch_size=4), the model is prepared for int8 training:

```python
self.llama_model = prepare_model_for_int8_training(self.llama_model)
```

You can also comment out this line if needed, but take care about the CUDA memory: full-precision training is not working for me even with batch_size=1.
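If you do comment out that line to train in full precision, one common memory mitigation is gradient checkpointing on the Hugging Face model. This is an assumption about what may help, not something the repo prescribes:

```python
# Full-precision path: skip the int8 preparation entirely.
# self.llama_model = prepare_model_for_int8_training(self.llama_model)

# Assumption (not from this repo): trade compute for memory by
# enabling gradient checkpointing on the underlying HF model.
self.llama_model.gradient_checkpointing_enable()
self.llama_model.config.use_cache = False  # KV cache is incompatible with checkpointing
```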