unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Mistral inputs_embeds without causal mask raises AttributeError #374

Open · namednil opened this issue 2 months ago

namednil commented 2 months ago

To reproduce:

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,        # auto-detect dtype
    load_in_4bit = True,
)

# Embed a few arbitrary token ids, then call forward with inputs_embeds only
# (no input_ids):
embedded_inputs = model.get_input_embeddings()(torch.tensor([15, 22, 30], device=0).unsqueeze(0))
model(**{"inputs_embeds": embedded_inputs})

Passing inputs_embeds instead of input_ids to forward leads to the following error:

    165 def MistralForCausalLM_fast_forward(
    166     self,
    167     input_ids: torch.LongTensor = None,
   (...)
    178     *args, **kwargs,
    179 ) -> Union[Tuple, CausalLMOutputWithPast]:
    181     if causal_mask is None and past_key_values is None:
--> 182         bsz, q_len = input_ids.shape
    183         sliding_window = getattr(self.config, "sliding_window", None)
    184         if sliding_window is None or sliding_window == "null" or sliding_window <= 0:

AttributeError: 'NoneType' object has no attribute 'shape'
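
The traceback suggests the shape lookup assumes input_ids is always present. As a point of reference, here is a minimal sketch of one way the check could fall back to inputs_embeds when input_ids is None; the helper name is hypothetical and this is not the actual unsloth code, just the shape logic that appears to be missing:

import torch
from typing import Optional, Tuple

def infer_batch_and_seq_len(input_ids: Optional[torch.Tensor],
                            inputs_embeds: Optional[torch.Tensor]) -> Tuple[int, int]:
    # Prefer input_ids, shaped (bsz, q_len); otherwise read the first two dims
    # of inputs_embeds, which carries an extra trailing hidden dimension.
    if input_ids is not None:
        bsz, q_len = input_ids.shape
    elif inputs_embeds is not None:
        bsz, q_len, _ = inputs_embeds.shape
    else:
        raise ValueError("Either input_ids or inputs_embeds must be provided")
    return bsz, q_len

# e.g. infer_batch_and_seq_len(None, torch.zeros(1, 3, 4096)) returns (1, 3)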

It also looks like inputs_embeds would be ignored further down; I'm not sure whether that's intentional:

https://github.com/unslothai/unsloth/blob/ec19e61c854dcf9104386fa63fc6c4f2944d4f35/unsloth/models/mistral.py#L205
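
For comparison, the reference Hugging Face MistralForCausalLM accepts inputs_embeds and computes logits from them when input_ids is omitted. The tiny randomly initialised model below is only an illustration of that signature; the sizes are arbitrary assumptions, not the real 7B configuration:

import torch
from transformers import MistralConfig, MistralForCausalLM

# Small random model, just to exercise the forward signature cheaply.
config = MistralConfig(hidden_size=64, intermediate_size=128, num_hidden_layers=2,
                       num_attention_heads=4, num_key_value_heads=2, vocab_size=1000)
ref_model = MistralForCausalLM(config)

ids = torch.tensor([[15, 22, 30]])
embeds = ref_model.get_input_embeddings()(ids)
out = ref_model(inputs_embeds=embeds)   # accepted without input_ids
print(out.logits.shape)                 # torch.Size([1, 3, 1000])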

danielhanchen commented 2 months ago

@namednil Oh nice catch on the bug - will solve this