EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

How to set model precision when evaluating and what's the default precision? Such as "fp16" or "load_in_8bit" #263

Open transcend-0 opened 1 month ago

transcend-0 commented 1 month ago

How do I set the model precision when evaluating, such as "fp16" or "load_in_8bit"? And what is the default precision? It seems to be "fp16", because around 14 GB of GPU memory is occupied when LLaVA-1.5-7B is loaded during evaluation.

[screenshot: GPU memory usage]

abzb1 commented 1 month ago

The default dtype is auto (this may vary depending on the specific model; you can check the model code).

You can pass the dtype argument in model_args when executing evaluation (float32 or float16; it is forwarded as torch_dtype), as in the launch sketch below.
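For example, a minimal launch sketch (the checkpoint, task, and flags are illustrative and may differ across lmms-eval versions; dtype is forwarded to the model's loader):

accelerate launch -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b",dtype="float16" \
    --tasks mme \
    --batch_size 1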

Alternatively, you can directly modify the code in the model's file (lmms_eval/models/model_name.py) to use load_in_8bit or load_in_4bit, similar to this or this.

Simply add load_in_8bit=True:

self.model = AutoModelForCausalLM.from_pretrained(..., load_in_8bit=True, ...)
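Filled out, that call might look like the standalone sketch below (the checkpoint id and the other keyword arguments are assumptions for illustration; load_in_8bit requires the bitsandbytes package, and inside lmms-eval the result is assigned to self.model in the model class's __init__):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/model-name",   # hypothetical checkpoint id, illustration only
    device_map="auto",  # let accelerate place the int8 weights across devices
    load_in_8bit=True,  # quantize linear layers to int8 at load time
)

With int8 loading, the weights take roughly half the memory of fp16 (about 7 GB instead of 14 GB for a 7B model), at some cost in inference speed.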

Note that models already quantized on the Hugging Face Hub don't need extra arguments like load_in_4bit, since the quantization config ships with the checkpoint; see the sketch below.
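For instance, a minimal sketch of loading such a checkpoint (the repo id is hypothetical; checkpoints that ship a quantization_config are loaded in their quantized form automatically, though some methods need an extra backend package such as auto-gptq):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/model-name-GPTQ",  # hypothetical pre-quantized checkpoint
    device_map="auto",      # no load_in_8bit/load_in_4bit needed here
)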