potsawee / selfcheckgpt

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

Which version of the LLaMA model is used? #20

Closed: Moximixi closed this issue 11 months ago

Moximixi commented 11 months ago

[image: figure attached by the author]

As far as I know, the LLaMA model comes in four sizes: 7B, 13B, 33B, and 65B. Which size does the figure refer to? Another question: what type of GPU is needed to run llama_logrob_inference.py?

potsawee commented 11 months ago

Hi @Moximixi

I used LLaMA-1 (decapoda-research/llama-30b-hf) -- I believe it's the 30B variant.

If you run it at 16-bit precision you will need around 60GB of memory, or 120GB at full precision. Running inference just to get the token probabilities (rather than autoregressive decoding) only requires a single forward pass, so it can be fast; I actually ran this experiment on CPU (it should finish in 10-30 minutes depending on your machine).
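For anyone following along, here is a minimal sketch of that probability-only inference using Hugging Face transformers; it is not the repository's actual script. The checkpoint ID follows this thread, the example sentence is made up, and note that the decapoda checkpoints have been reported to need tokenizer workarounds on some transformers versions.

```python
# Minimal sketch (an assumption, not the repo's script) of computing
# per-token log-probabilities with a causal LM via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "decapoda-research/llama-30b-hf"  # checkpoint named in this thread
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~60GB at 16-bit; use torch.float32 (~120GB) on CPU
)
model.eval()

text = "Paris is the capital of France."  # made-up example sentence
inputs = tokenizer(text, return_tensors="pt")

# A single forward pass is enough -- no autoregressive decoding needed.
with torch.no_grad():
    logits = model(**inputs).logits  # shape: [1, seq_len, vocab_size]

# Log-prob of each token given its preceding context: shift by one position.
log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
token_ids = inputs["input_ids"][0, 1:]
token_log_probs = log_probs[torch.arange(token_ids.size(0)), token_ids]

for tok, lp in zip(tokenizer.convert_ids_to_tokens(token_ids.tolist()),
                   token_log_probs):
    print(f"{tok}\t{lp.item():.4f}")
```

Because there is no sampling loop, the whole computation is one batched forward pass over the sentence, which is why CPU inference stays in the tens-of-minutes range even for a 30B model.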