huggingface / huggingface-llama-recipes


Question on loading llama 405B FP8 using HF transformer API #76

Open Neo9061 opened 4 weeks ago

Neo9061 commented 4 weeks ago

Based on this notebook: https://github.com/huggingface/huggingface-llama-recipes/blob/main/local_inference/fp8-405B.ipynb

Since we are loading FP8 weights, does it matter that we specify torch_dtype as torch.bfloat16?

CC @ianporada, who made recent edits to this notebook.

ianporada commented 4 weeks ago

I believe it doesn't matter. Even if you didn't specify torch_dtype, it would default to the value in config.json, which is also "torch_dtype": "bfloat16".

Keep in mind that not all weights are FP8; some weights of the quantized model are still in BF16.
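For reference, a minimal sketch of the load call being discussed, assuming the FP8 checkpoint id from the notebook (adjust the model id to the repo you actually use). The torch_dtype argument only controls the dtype of the non-quantized tensors; the FP8 weights keep the dtype defined by the checkpoint's quantization config.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model id for illustration; use the FP8 repo referenced in the notebook.
model_id = "meta-llama/Llama-3.1-405B-Instruct-FP8"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# torch_dtype sets the dtype of the non-quantized weights (still BF16 here).
# Omitting it would fall back to config.json's "torch_dtype": "bfloat16",
# so passing torch.bfloat16 explicitly is effectively a no-op.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```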