Open HarryHsing opened 2 months ago
Thanks for your great work!
I wonder whether inference quantization will be supported in the future for memory efficiency?
Thanks!
Hello @HarryHsing, for inference we already apply quantization by loading the language model in 8-bit. You can enable it by setting low_resource: True in the llama2 test config or the mistral test config:
low_resource: True
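As a sketch of where that flag lives, the eval config entry might look like the following (the surrounding key names are assumed, not taken from the repo — only `low_resource: True` comes from this thread):

```yaml
model:
  # Assumed layout: load the language model weights in 8-bit to reduce GPU memory use
  low_resource: True
```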