McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

Error in _update_causal_mask while running the example code #82

Closed: aldrinjenson closed this issue 4 months ago

aldrinjenson commented 4 months ago

Hi, thanks for this project! I tried running the example code on my NVIDIA GPU and, separately, in a Google Colab notebook.

Each time, when running the encode call q_reps = l2v.encode(queries), I get the following error: TypeError: LlamaBiModel._update_causal_mask() takes from 4 to 5 positional arguments but 6 were given

Note that I am running the same code as in the README example.
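
For context, here is a minimal sketch of the setup I am running (the model IDs and loading arguments below are placeholders and may differ slightly from the current README example):

```python
import torch
from llm2vec import LLM2Vec

# Placeholder checkpoint names; substitute the IDs from the README example
# (an MNTP base model paired with a supervised PEFT adapter).
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)

instruction = "Given a web search query, retrieve relevant passages that answer the query:"
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]

# This is the call that raises the TypeError above.
q_reps = l2v.encode(queries)
```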

I remember testing this same code a week ago, and at that time it worked perfectly. Please let me know whether this is due to a recent change in one of the dependency libraries on PyPI, or whether there is another way I can get this running.

Thank You!

vaibhavad commented 4 months ago

Hi @aldrinjenson,

This issue occurs with transformers versions > 4.40 and llm2vec versions < 0.1.5. Here is the link to the discussion. Please make sure you are using the latest version of llm2vec.

If the issue still persists, please let me know the versions of llm2vec and transformers you are using, so that I can try to replicate the issue.
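
If it helps, the installed versions can be printed with something like the following (a minimal sketch using only the standard library; it assumes both packages were installed with pip):

```python
from importlib.metadata import version

# Print the installed versions of the relevant packages to include in a report.
for pkg in ("llm2vec", "transformers"):
    print(pkg, version(pkg))
```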

ChristopherLR commented 4 months ago

@vaibhavad it still seems to be failing on main (transformers == 4.41.0):

TypeError                                 Traceback (most recent call last)
Cell In[3], line 8
      1 instruction = (
      2     "Given a web search query, retrieve relevant passages that answer the query:"
      3 )
      4 queries = [
      5     [instruction, "how much protein should a female eat"],
      6     [instruction, "summit define"],
      7 ]
----> 8 q_reps = l2v.encode(queries)
     10 # Encoding documents. Instruction are not required for documents
     11 documents = [
     12     "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
     13     "Definition of summit for English Language Learners. : 1  the highest point of a mountain : the top of a mountain. : 2  the highest level. : 3  a meeting or series of meetings between the leaders of two or more governments.",
     14 ]

File /code/libs/llm2vec/llm2vec/llm2vec.py:337, in LLM2Vec.encode(self, sentences, batch_size, show_progress_bar, convert_to_numpy, convert_to_tensor, device)
    327     for start_index in trange(
    328         0,
    329         len(sentences),
   (...)
    332         disable=not show_progress_bar,
    333     ):
    334         sentences_batch = sentences_sorted[
    335             start_index : start_index + batch_size
    336         ]
--> 337         embeddings = self._encode(
    338             sentences_batch, device=device, convert_to_numpy=convert_to_numpy
    339         )
    340         all_embeddings.append(embeddings)
    341 else:

File /code/libs/llm2vec/llm2vec/llm2vec.py:407, in LLM2Vec._encode(self, sentences_batch, device, convert_to_numpy, multiprocessing)
    404 features = batch_to_device(features, device)
    406 with torch.no_grad():
--> 407     embeddings = self.forward(features)
    408     embeddings = embeddings.detach()
    409     embeddings = embeddings.cpu()

File /code/libs/llm2vec/llm2vec/llm2vec.py:207, in LLM2Vec.forward(self, sentence_feature)
    205 if "embed_mask" in sentence_feature:
    206     embed_mask = sentence_feature.pop("embed_mask")
--> 207 reps = self.model(**sentence_feature)
    208 sentence_feature["embed_mask"] = embed_mask
    210 return self.get_pooling(sentence_feature, reps.last_hidden_state)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /opt/conda/lib/python3.10/site-packages/peft/peft_model.py:642, in PeftModel.forward(self, *args, **kwargs)
    640 with self._enable_peft_forward_hooks(*args, **kwargs):
    641     kwargs = {k: v for k, v in kwargs.items() if k not in self.special_peft_forward_args}
--> 642     return self.get_base_model()(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:940, in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict, cache_position)
    937 if position_ids is None:
    938     position_ids = cache_position.unsqueeze(0)
--> 940 causal_mask = self._update_causal_mask(
    941     attention_mask, inputs_embeds, cache_position, past_key_values, output_attentions
    942 )
    944 # embed positions
    945 hidden_states = inputs_embeds

TypeError: LlamaBiModel._update_causal_mask() takes from 4 to 5 positional arguments but 6 were given
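
From the traceback, the mismatch appears to be that LlamaModel.forward in transformers 4.41.0 now passes output_attentions as a fifth positional argument to _update_causal_mask, while the overridden LlamaBiModel._update_causal_mask still has the older, shorter signature. Here is a simplified, self-contained sketch of that kind of mismatch (stand-in classes, not the real transformers or llm2vec code):

```python
class LlamaModelLike:
    """Stand-in for transformers 4.41.0 LlamaModel."""

    def _update_causal_mask(self, attention_mask, input_tensor, cache_position,
                            past_key_values, output_attentions):
        return None

    def forward(self, attention_mask, inputs_embeds, cache_position, past_key_values):
        # New in 4.41.0: output_attentions is forwarded as an extra positional argument.
        return self._update_causal_mask(
            attention_mask, inputs_embeds, cache_position, past_key_values, False
        )


class LlamaBiModelLike(LlamaModelLike):
    """Stand-in for an override written against the older (<= 4.40.x) signature."""

    def _update_causal_mask(self, attention_mask, input_tensor, cache_position,
                            past_key_values=None):
        return None


# Reproduces the same kind of error:
# TypeError: ..._update_causal_mask() takes from 4 to 5 positional arguments but 6 were given
LlamaBiModelLike().forward(None, None, None, None)
```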

vaibhavad commented 4 months ago

@ChristopherLR,

My apologies, I missed this version as it came out recently (5 days ago). I have fixed the llm2vec code to support it. If you run pip install llm2vec==0.1.8, or build from source, it will support transformers 4.41.0 and this error should no longer appear.

Let me know if you have any more questions.

aldrinjenson commented 4 months ago

pip install llm2vec==0.1.8

This worked. Thanks @vaibhavad!!