This error occurs because your inference code is trying to run the model in fp16 on the CPU, which transformers/PyTorch do not support. If you have a GPU on the machine you're running this on, I'd recommend using it; you can do that by changing model_kwargs like so:
model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": False, "device_map": "auto"},
If you don't have a GPU, then you might be able to run the model in either fp32 or bf16 precision. Try like so:
model_kwargs={"torch_dtype": torch.float32, "load_in_4bit": False}, # or torch.bfloat16
Hello, I am trying to use mergekit to make a merged model using Llama-2 models that I have trained. I am using the dare_ties algorithm; this is the config.yaml file I am using:
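(The actual file is not preserved in this thread; the block below is only a representative dare_ties config, with placeholder model names and parameter values.)

```yaml
models:
  - model: your-org/llama2-finetune-a  # placeholder fine-tuned model
    parameters:
      density: 0.5  # fraction of delta parameters kept by DARE
      weight: 0.5
  - model: your-org/llama2-finetune-b  # placeholder fine-tuned model
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: meta-llama/Llama-2-7b-hf
dtype: float16
```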
Please help me in this regard. I run the merge using the following command:
mergekit-yaml config.yaml merge --copy-tokenizer --cuda --trust-remote-code
I run inference using the following code:
This is the full error: