FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
https://groma-mllm.github.io/
Apache License 2.0
483 stars 55 forks

System requirements for running the model? #1

Closed learnermaxRL closed 2 months ago

learnermaxRL commented 2 months ago

Can you please detail the system requirements? Can this run on a Mac M2 Air?

machuofan commented 2 months ago

Hi, Groma-7b takes 30-40 GB of memory for inference on a single GPU. We have not tested it on CPU. I guess you would need to quantize the model, as in LLaVA, to make it run on a Mac.
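A rough back-of-the-envelope calculation shows why quantization helps here. The numbers below are illustrative estimates for a generic 7B-parameter model, not measured Groma figures; the actual 30-40 GB footprint also includes activations, the vision tower, and KV cache on top of the weights.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

params = 7e9  # ~7 billion parameters (Groma-7b scale)

fp16_gb = weight_memory_gb(params, 2.0)  # fp16: 2 bytes/param -> 14.0 GB
int4_gb = weight_memory_gb(params, 0.5)  # 4-bit quantized: 0.5 bytes/param -> 3.5 GB

print(f"fp16 weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
```

So quantizing to 4 bits cuts the weight memory by roughly 4x, which is what makes running a 7B model plausible on a machine without a large GPU, as the LLaVA quantization setups do.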