LoveCatc / supervised-llm-uncertainty-estimation

This repo contains code for paper: "Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach".

Unable to load Gemma 7b model #4

Closed learningfromscratch21 closed 2 weeks ago

learningfromscratch21 commented 1 month ago

Hi,

Thanks for the amazing work. It's an interesting approach so I wanted to try it on my system.

I am initially trying just the Gemma 7B model. I can see the tensors load (split into four shards), but right after loading, the process shuts down and the VS Code interpreter crashes. I am using a GPU with 24 GB of VRAM.

Could you please let me know the hardware configuration you used for the experiments?

Thanks in advance!

LoveCatc commented 1 month ago

Hi, thank you for your support. We used a GPU server with 4 x 48 GB NVIDIA A6000 GPUs and successfully ran Gemma-7B on two of the cards. As a minimum, I would suggest at least ~30 GB of VRAM: as I remember, just loading the model takes at least ~24 GB, and inference takes even more. The official system requirements also state that 24 GB+ of GPU memory is required for the 7B checkpoint.
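For a rough intuition of why 24 GB is not enough, you can estimate the memory needed just to hold the weights at different precisions. This is a back-of-the-envelope sketch, not exact figures for Gemma-7B (the real checkpoint has somewhat more than 7B parameters, and inference adds activation and KV-cache overhead on top):

```python
# Rough weight-memory estimate for a ~7B-parameter model.
# Parameter count and dtypes here are assumptions for illustration,
# not exact Gemma-7B figures.

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """GiB needed just to hold the model weights."""
    return num_params * bytes_per_param / 2**30

params = 7_000_000_000  # nominal "7B"

fp32 = weight_memory_gib(params, 4)  # default torch.float32
bf16 = weight_memory_gib(params, 2)  # torch.bfloat16 / float16

print(f"fp32 weights: ~{fp32:.1f} GiB")  # ~26.1 GiB -> already exceeds 24 GB VRAM
print(f"bf16 weights: ~{bf16:.1f} GiB")  # ~13.0 GiB -> weights fit, but inference needs more
```

So loading in the default float32 precision alone overflows a 24 GB card. If you have only one GPU, loading with `torch_dtype=torch.bfloat16` (and, with multiple GPUs, `device_map="auto"` to shard the model) in `transformers` may help, though inference overhead can still push you past 24 GB.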

learningfromscratch21 commented 1 month ago

Thank you for the quick reply