epfLLM / meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).
https://huggingface.co/epfl-llm
Apache License 2.0

Mismatch in vocab_size between .bin files and .safetensors files #43

Open noahboegli opened 1 month ago

noahboegli commented 1 month ago

Hey!

I'm sorry if this isn't actually an issue and it's just me misunderstanding something; I'm not an expert in this field, rather a novice.

I'm trying to deploy the project according to your deployment guide. However, since I don't have enough memory for the 70B version of the model, I want to use the --load-8bit parameter to enable model compression. (I should mention that I run the model on the CPU, with the --device cpu flag.)

When I use this, I get the following error:

ValueError: Trying to set a tensor of shape torch.Size([32000, 8192]) in "weight" (which has shape torch.Size([32017, 8192])), this look incorrect
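
For what it's worth, the two shapes in the error can be traced back to the files on disk. Below is a rough sketch (the path is illustrative; it assumes a local download of the model in ./meditron-70b with the standard sharded-checkpoint layout and index filenames) that prints the vocab_size declared in config.json next to the shape of the input embedding stored in the .bin shards:

```python
import json
from pathlib import Path

import torch

model_dir = Path("meditron-70b")   # illustrative local path to the downloaded model
key = "model.embed_tokens.weight"  # input embedding key in LLaMA-style checkpoints

# vocab_size declared by the configuration (presumably the 32017 side of the error)
config = json.loads((model_dir / "config.json").read_text())
print("config.json vocab_size:", config["vocab_size"])

# shape of the embedding actually stored in the .bin shards
# (note: this loads one full shard into RAM)
index = json.loads((model_dir / "pytorch_model.bin.index.json").read_text())
shard = torch.load(model_dir / index["weight_map"][key], map_location="cpu")
print(".bin embed_tokens shape:", tuple(shard[key].shape))
```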

If I look at the HF upload history, I see that there were two main uploads of the model:

My understanding is that the .bin files are needed to enable model compression, but they no longer match the model configuration.
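
One way to check whether the two uploads really diverge is to read the embedding shape straight from the .safetensors shards and compare it with the .bin shape printed by the sketch above (same assumed local layout; model.safetensors.index.json is the standard index filename):

```python
import json
from pathlib import Path

from safetensors import safe_open

model_dir = Path("meditron-70b")   # same illustrative snapshot directory as above
key = "model.embed_tokens.weight"

# embedding shape as stored in the .safetensors shards
st_index = json.loads((model_dir / "model.safetensors.index.json").read_text())
shard_file = model_dir / st_index["weight_map"][key]
with safe_open(str(shard_file), framework="pt", device="cpu") as f:
    print(".safetensors embed_tokens shape:", tuple(f.get_tensor(key).shape))
```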

This is supported by the fact that manually editing the config.json file to set vocab_size back to 32000 allows the model to load properly with --load-8bit.
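
For reference, that manual edit can be scripted as well; a minimal sketch, assuming the model lives in ./meditron-70b (illustrative path) and keeping a backup of the original file so the change is easy to revert:

```python
import json
import shutil
from pathlib import Path

model_dir = Path("meditron-70b")   # illustrative local path to the downloaded model
config_path = model_dir / "config.json"

# keep a copy of the original configuration before touching it
shutil.copy(config_path, model_dir / "config.json.bak")

config = json.loads(config_path.read_text())
print("vocab_size before:", config["vocab_size"])
config["vocab_size"] = 32000  # match the 32000-row embedding reported in the error
config_path.write_text(json.dumps(config, indent=2) + "\n")
print("vocab_size after: ", config["vocab_size"])
```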