SapienzaNLP / relik

Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)

GoldenRetriever example from README does not work #12

Open n28div opened 1 month ago

n28div commented 1 month ago

When running the snippet from the README:

from relik.retriever import GoldenRetriever

encoder_name_or_path = "sapienzanlp/relik-retriever-e5-base-v2-aida-blink-encoder"
index_name_or_path = "sapienzanlp/relik-retriever-e5-base-v2-aida-blink-wikipedia-index"

retriever = GoldenRetriever(question_encoder=encoder_name_or_path, document_index=index_name_or_path, device="cuda:0")
retriever.retrieve("Michael Jordan was one of the best players in the NBA.", top_k=100)

the code breaks with the following error:

TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "[...]/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "[...]/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "[...]/python3.12/site-packages/relik/retriever/pytorch_modules/model.py", line 381, in default_collate_fn
    _text = [sample[0] for sample in x]
             ~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable

Here's the full stack trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 retriever.retrieve("Michael Jordan was one of the best players in the NBA.", top_k=100)

File[...]/lib/python3.12/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File[...]/lib/python3.12/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File[...]/lib/python3.12/site-packages/relik/retriever/pytorch_modules/model.py:356, in GoldenRetriever.retrieve(self, text, text_pair, input_ids, attention_mask, token_type_ids, k, max_length, precision, collate_fn, batch_size, num_workers, progress_bar, **kwargs)
    354 try:
    355     with get_autocast_context(self.device, precision):
--> 356         for batch in dataloader:
    357             batch = batch.to(self.device)
    358             question_encodings = self.question_encoder(**batch).pooler_output

File[...]/lib/python3.12/site-packages/torch/utils/data/dataloader.py:631, in _BaseDataLoaderIter.__next__(self)
    628 if self._sampler_iter is None:
    629     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    630     self._reset()  # type: ignore[call-arg]
--> 631 data = self._next_data()
    632 self._num_yielded += 1
    633 if self._dataset_kind == _DatasetKind.Iterable and \
    634         self._IterableDataset_len_called is not None and \
    635         self._num_yielded > self._IterableDataset_len_called:

File[...]/lib/python3.12/site-packages/torch/utils/data/dataloader.py:1346, in _MultiProcessingDataLoaderIter._next_data(self)
   1344 else:
   1345     del self._task_info[idx]
-> 1346     return self._process_data(data)

File[...]/lib/python3.12/site-packages/torch/utils/data/dataloader.py:1372, in _MultiProcessingDataLoaderIter._process_data(self, data)
   1370 self._try_put_index()
   1371 if isinstance(data, ExceptionWrapper):
-> 1372     data.reraise()
   1373 return data

File[...]/lib/python3.12/site-packages/torch/_utils.py:705, in ExceptionWrapper.reraise(self)
    701 except TypeError:
    702     # If the exception takes multiple arguments, don't try to
    703     # instantiate since we don't know how to
    704     raise RuntimeError(msg) from None
--> 705 raise exception

TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "[...]/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "[...]/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "[...]/python3.12/site-packages/relik/retriever/pytorch_modules/model.py", line 381, in default_collate_fn
    _text = [sample[0] for sample in x]
             ~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable

I'm running everything on Python 3.10.12 with torch 2.1.2+cu121.
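For readers unfamiliar with this message: the last frame of the traceback indexes each sample with `sample[0]`, and that expression raises this exact TypeError whenever a sample is `None`. The following is a minimal sketch in plain Python, without torch or relik. `collate_texts` and `try_collate` are hypothetical names that only mirror the line shown in the traceback, and the sketch does not establish why the samples were `None` inside relik.

```python
def collate_texts(samples):
    """Hypothetical stand-in mirroring `[sample[0] for sample in x]` from the traceback."""
    return [sample[0] for sample in samples]


def try_collate(samples):
    """Return the collated texts, or the error message if indexing a sample fails."""
    try:
        return collate_texts(samples)
    except TypeError as err:
        return f"TypeError: {err}"


# A well-formed sample is a tuple whose first element is the text.
ok = try_collate([("Michael Jordan", None)])

# A sample that is None cannot be indexed, which reproduces the reported message.
failed = try_collate([None])

print(ok)
print(failed)
```

The point of the sketch is only that the failure happens before any model code runs: the collate step receives a sample that is `None` instead of a tuple.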

csaiedu commented 1 month ago

Getting the same error, related to this:

cannot import name 'GoldenRetriever'

from relik.retriever.pytorch_modules.model import GoldenRetriever
ImportError: cannot import name 'GoldenRetriever' from partially initialized module 'relik.retriever.pytorch_modules.model' (most likely due to a circular import) in retriever\pytorch_modules\model.py

It seems that the class is missing, while GoldenRetrieverModel is there.

MedSaidi11 commented 1 month ago

Getting the same error when trying to run this on my fine-tuned model:

retriever.retrieve("Michael Jordan was one of the best players in the NBA.", top_k=100)

Riccorl commented 1 month ago

We just updated ReLiK to 1.0.7 which contains a fix for the issue. Let us know if it works now!

@csaiedu I can't replicate this problem in a fresh local environment. Let me know if the problem persists.
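To confirm that an environment actually picked up the 1.0.7 release mentioned above, the installed version can be read with the standard library. This is a small sketch, not part of relik: `parse_release`, `has_fix`, and `installed_relik_version` are hypothetical helper names, and the comparison assumes version strings that start with plain dotted numbers such as `1.0.7`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(text):
    """Turn the leading dotted-number part of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def has_fix(installed, required="1.0.7"):
    """True if the installed release is at least the required one."""
    return parse_release(installed) >= parse_release(required)


def installed_relik_version():
    """Return the installed relik version string, or None if relik is not installed."""
    try:
        return version("relik")
    except PackageNotFoundError:
        return None


current = installed_relik_version()
if current is None:
    print("relik is not installed in this environment")
else:
    print(f"relik {current}: at least 1.0.7 = {has_fix(current)}")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating `1.0.10` as older than `1.0.7`.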

csaiedu commented 1 month ago

Thank you, Riccorl. Upgrading leads to the error "ValueError: source code string cannot contain null bytes" on Windows with a fresh environment.

Fixed by installing on a Linux machine in a new environment. It could be a corrupt conda env on Windows.
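If the corrupt-environment guess is right, one way to check it is to look for Python source files that contain null bytes, since Python raises this ValueError when it is asked to compile such a file. The sketch below assumes the error comes from a damaged `.py` file on disk, which this thread does not confirm. `find_null_byte_sources` is a hypothetical helper, not a relik or conda utility.

```python
from pathlib import Path


def find_null_byte_sources(root):
    """Return the .py files under root whose raw bytes include a null byte."""
    corrupted = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable files are skipped, not reported
        if b"\x00" in data:
            corrupted.append(path)
    return corrupted


# Example call (assumption: the suspect files live in the active site-packages):
#   import sysconfig
#   print(find_null_byte_sources(sysconfig.get_paths()["purelib"]))
```

Any file this reports would need its package reinstalled; an empty result suggests the cause lies elsewhere.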

n28div commented 1 month ago

Thank you @Riccorl, it works just fine now.

MedSaidi11 commented 1 month ago

Thank you @Riccorl, it's also working for my fine-tuned model!

Riccorl commented 1 month ago

@csaiedu Can you share the full error stack?