I've tried to download and test the new Llama3.2-11B-Vision model. I downloaded it from llama.com, and after the download it told me to get the sample code from GitHub: https://github.com/meta-llama/llama-recipes/tree/main/recipes/quickstart/inference/local_inference. But when I run the code [multi_modal_infer.py] directly, it tries to download the model from Hugging Face via MllamaForConditionalGeneration, and, not surprisingly, due to network issues it fails with "couldn't connect to 'https://huggingface.co'" and pops up several errors. Also, the YouTube video (https://www.youtube.com/watch?v=a_HHryXoDjM&t=13s) only covers text completion, so I have no idea how to use the vision model, which is really frustrating. I don't know how to use the pre-downloaded model.
To be honest, I don't see a clear path or sample for new users to get started with the model. It seems you didn't consider or test for beginners, or maybe they aren't your target audience. That's fine; I'll just wait for ollama to get the new models.
@Pancat007 The video is a little old and was aimed at the 3B models. Please see the example script here for 3.2 inference, and let me know if you have any questions!