infly-ai / INF-MLLM


[Question] How much GPU memory is required for inference #2

Open caramel678 opened 9 months ago

caramel678 commented 9 months ago

**Question:** I'm currently trying to use your project for some inference tasks, and I'm wondering about the GPU memory requirements. Could you please provide some guidance on how much GPU memory is typically needed to run inference with this model?

I'd also appreciate any tips on optimizing memory usage, or methods to run the model on a GPU with less memory.

mightyzau commented 8 months ago

A 24 GB GPU is sufficient to run the InfMLLM-7B-Chat demo. To reduce memory further, you can limit the length of the conversation history.
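To illustrate the suggestion above: a shorter history means a shorter prompt, which bounds the activation and KV-cache memory used at inference time. A minimal sketch of history truncation, assuming a generic list of (question, answer) turns rather than InfMLLM's actual chat API (the function name and structure here are hypothetical):

```python
# Hypothetical helper -- InfMLLM's real chat interface may differ.
# Keeping only the most recent turns caps the prompt length, which
# caps the memory needed for attention and the KV cache.

def truncate_history(history, max_turns=4):
    """Keep only the last `max_turns` (question, answer) pairs."""
    return history[-max_turns:]

# Example: a 10-turn conversation trimmed to its 4 most recent turns.
history = [(f"q{i}", f"a{i}") for i in range(10)]
short = truncate_history(history, max_turns=4)
print(len(short))        # 4
print(short[0])          # ('q6', 'a6')
```

A per-token budget (counting tokens with the model's tokenizer and dropping oldest turns until the prompt fits) would give tighter control than a fixed turn count, at the cost of a tokenizer pass per request.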