[Feature] Support Llama 3.2 family of models

vikrantrathore commented 1 month ago

Motivation for Implementing Llama 3.2 lmdeploy Inference Engine Support

Llama 3.2's release presents a strong case for expanding lmdeploy with dedicated support:

Multi-modal Capabilities: Llama 3.2 introduces vision LLMs (11B & 90B) that support image understanding and reasoning alongside text. Integration with lmdeploy would unlock applications such as document analysis, image captioning, and visual grounding.
Edge & Mobile Deployment: The lightweight 1B & 3B models are tailored for on-device use cases (summarization, instruction following, rewriting). lmdeploy can facilitate faster, private, and offline AI on resource-constrained hardware.
Enhanced Performance: Pruning and distillation techniques in Llama 3.2 result in improved efficiency. lmdeploy users would benefit from faster inference and reduced resource consumption with these models.
Open Innovation: Supporting Llama 3.2 aligns with lmdeploy's open-source ethos, fostering a wider ecosystem and accelerating AI application development.

lvhan028 commented 1 month ago

Sure. @AllentDan is working on it.

lijiawei320 commented 1 month ago

@AllentDan Is there any progress? Thanks ♪(･ω･)ﾉ

lvhan028 commented 2 weeks ago

Please upgrade to v0.6.2