🚀 The feature, motivation and pitch
Note that in the original paper "The Llama 3 Herd of Models", section 7.5.2 on vision model SFT states that only the vision encoder and image adapter weights should be updated, while the LLM weights remain frozen.
However, in the fine-tuning recipe for vision models provided in this repo, it appears that all LLM weights are being tuned as well. Is this an oversight, or is an update to the training script planned so that only the vision encoder and image adapter are tuned?
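For reference, a minimal sketch of the freezing scheme described in section 7.5.2, assuming a vision-language model with submodules named `vision_model`, `multi_modal_projector`, and `language_model` (these names follow the Hugging Face Mllama layout; a tiny stand-in model is used here for illustration):

```python
import torch.nn as nn

class TinyVLM(nn.Module):
    # Stand-in for the real vision-language model; the real one would be
    # e.g. MllamaForConditionalGeneration with the same top-level names.
    def __init__(self):
        super().__init__()
        self.vision_model = nn.Linear(8, 8)
        self.multi_modal_projector = nn.Linear(8, 8)
        self.language_model = nn.Linear(8, 8)

def freeze_llm(model: nn.Module) -> None:
    """Per section 7.5.2: keep vision encoder + adapter trainable,
    freeze all LLM (language_model) weights."""
    for name, param in model.named_parameters():
        param.requires_grad = not name.startswith("language_model")

model = TinyVLM()
freeze_llm(model)
# Only vision encoder and adapter parameters remain trainable.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

The optimizer would then be built only over the parameters with `requires_grad=True`, so LLM weights receive no gradient updates.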
Alternatives
No response
Additional context
No response