VinAIResearch / PhoGPT

PhoGPT: Generative Pre-training for Vietnamese (2023)
Apache License 2.0

Hardware specification for inference #2

Closed leviethung2103 closed 10 months ago

leviethung2103 commented 10 months ago

Hello,

The model was trained on A100 GPUs. However, I am wondering about the GPU memory cost during inference.

Currently, I have a 3060 GPU with 12 GB of VRAM. Can it be used for running inference?

Thank you

datquocnguyen commented 10 months ago

You can just load the model in 8-bit, which takes about 7.5 GB of GPU memory, by passing load_in_8bit=True: https://huggingface.co/docs/transformers/main/en/main_classes/quantization
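A minimal sketch of what this looks like. The model id `vinai/PhoGPT-7B5-Instruct` and the parameter count are assumptions here (check the repo README for the exact id), and the back-of-the-envelope memory estimate just illustrates why int8 fits in 12 GB of VRAM:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Rough memory estimate: int8 stores 1 byte per weight, so a ~7.5B-parameter
# model needs roughly 7.5 GB for weights alone (vs ~15 GB in fp16).
n_params = 7.5e9
weight_gb_int8 = n_params * 1 / 1e9   # ≈ 7.5 GB, fits a 12 GB RTX 3060
weight_gb_fp16 = n_params * 2 / 1e9   # ≈ 15 GB, does not fit

def load_phogpt_8bit(model_id: str = "vinai/PhoGPT-7B5-Instruct"):
    """Load PhoGPT with 8-bit weights (requires bitsandbytes + accelerate)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        load_in_8bit=True,       # quantize weights to int8 at load time
        device_map="auto",       # place layers on the available GPU
        trust_remote_code=True,  # the model repo ships custom modeling code
    )
    return tokenizer, model
```

Actual usage will be somewhat above the weight footprint because of activations and the KV cache, but well within 12 GB for short sequences.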