McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

AttributeError: Can't pickle local object 'add_hook_to_module.<locals>.new_forward' #91

Closed laughinghugs closed 3 months ago

laughinghugs commented 3 months ago

As you mention in your paper: "Our empirical results so far as well as the analysis above share an intriguing observation: enabling bidirectional attention works well for Mistral-7B, even without any training." Based on this, I was trying to load Mistral-7B-Instruct-v0.2 directly.

l2v = LLM2Vec.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, attn_implementation="flash_attention_2")
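
(For reference, the line above assumes the usual imports, which are omitted here:)

import torch
from llm2vec import LLM2Vec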

Now, when I try to run the following code to encode a text:

d_reps = l2v.encode(["My name is Saikat"])

I get the following error:

AttributeError: Can't pickle local object 'add_hook_to_module.<locals>.new_forward'

Detailed error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 d_reps = l2v.encode(["My name is Saikat"])

File /data/anaconda3/envs/saikatenv/lib/python3.11/site-packages/llm2vec/llm2vec.py:350, in LLM2Vec.encode(self, sentences, batch_size, show_progress_bar, convert_to_numpy, convert_to_tensor, device)
    345     with cuda_compatible_multiprocess.Pool(num_proc) as p:
    346         sentences_batches = [
    347             sentences_sorted[start_index : start_index + batch_size]
    348             for start_index in trange(0, len(sentences), batch_size)
    349         ]
--> 350         for result in p.map(
    351             partial(
    352                 self._encode,
    353                 # This branch only supports CUDA devices, so we ignore the value of device
    354                 # and let _encode determine it based on rank.
    355                 device=None,
    356                 convert_to_numpy=convert_to_numpy,
    357                 multiprocessing=True,
    358             ),
    359             sentences_batches,
    360         ):
    361             all_embeddings.append(result)
    363 all_embeddings = torch.cat(all_embeddings, dim=0)

File /data/anaconda3/envs/saikatenv/lib/python3.11/multiprocessing/pool.py:367, in Pool.map(self, func, iterable, chunksize)
    362 def map(self, func, iterable, chunksize=None):
    363     '''
    364     Apply `func` to each element in `iterable`, collecting the results
    365     in a list that is returned.
    366     '''
--> 367     return self._map_async(func, iterable, mapstar, chunksize).get()

File /data/anaconda3/envs/saikatenv/lib/python3.11/multiprocessing/pool.py:774, in ApplyResult.get(self, timeout)
    772     return self._value
    773 else:
--> 774     raise self._value

File /data/anaconda3/envs/saikatenv/lib/python3.11/multiprocessing/pool.py:540, in Pool._handle_tasks(taskqueue, put, outqueue, pool, cache)
    538     break
    539 try:
--> 540     put(task)
    541 except Exception as e:
    542     job, idx = task[:2]

File /data/anaconda3/envs/saikatenv/lib/python3.11/multiprocessing/connection.py:206, in _ConnectionBase.send(self, obj)
    204 self._check_closed()
    205 self._check_writable()
--> 206 self._send_bytes(_ForkingPickler.dumps(obj))

File /data/anaconda3/envs/saikatenv/lib/python3.11/multiprocessing/reduction.py:51, in ForkingPickler.dumps(cls, obj, protocol)
     48 @classmethod
     49 def dumps(cls, obj, protocol=None):
     50     buf = io.BytesIO()
---> 51     cls(buf, protocol).dump(obj)
     52     return buf.getbuffer()

AttributeError: Can't pickle local object 'add_hook_to_module.<locals>.new_forward'

Could you please help?

vaibhavad commented 3 months ago

Hi @laughinghugs,

Are there multiple GPUs in your environment? The code should work on a single GPU.

For multiple GPUs, it is important to encapsulate the code in __main__; please see my detailed response here.

Regarding running multi-GPU with LLM2Vec, the code needs to be shielded with if __name__ == "__main__". Otherwise, CUDA runs into issues when spawning new processes. This is a requirement in Sentence Transformers' multi-GPU support as well.
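
For illustration, a minimal sketch of the guarded multi-GPU usage (model name and arguments follow the snippet in this issue; adapt them to your setup):

import torch
from llm2vec import LLM2Vec

def main():
    l2v = LLM2Vec.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2",
        device_map="auto",
        torch_dtype=torch.bfloat16,
    )
    # On multi-GPU setups, encode() uses a multiprocessing pool; the spawned
    # workers re-import this module, so everything except definitions must
    # live behind the __main__ guard.
    d_reps = l2v.encode(["My name is Saikat"])
    print(d_reps.shape)

if __name__ == "__main__":
    main()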

Let me know if you have any more questions.

laughinghugs commented 3 months ago

@vaibhavad - Thank you. Yes, I was using multiple GPUs. I have now changed device_map to 'cuda' and it works.
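
For anyone hitting the same error, the working single-device variant looks roughly like this (a sketch based on the comment above, with the other arguments as in the original snippet):

import torch
from llm2vec import LLM2Vec

# Pinning the model to a single device keeps encode() on the single-process
# path, so the hooked forward functions never need to be pickled.
l2v = LLM2Vec.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
d_reps = l2v.encode(["My name is Saikat"])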