Then I tried to run the benchmark, but I encountered another error. The trace is below.
Trace:
2023-08-13 23:37:53.178162 >>> ArguAna
Traceback (most recent call last):
File "/instructor-embedding/evaluation/MTEB/mteb/evaluation/MTEB.py", line 240, in run
results = task.evaluate(model, split, kwargs)
File "/instructor-embedding/evaluation/MTEB/mteb/abstasks/AbsTaskRetrieval.py", line 660, in evaluate
results = retriever.retrieve(corpus, queries)
File "/beir/beir/retrieval/evaluation.py", line 20, in retrieve
return self.retriever.search(corpus, queries, self.top_k, self.score_function, kwargs)
File "/beir/beir/retrieval/search/dense/exact_search_multi_gpu.py", line 148, in search
cos_scores_top_k_values, cos_scores_top_k_idx = metric.compute()
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/evaluate/module.py", line 433, in compute
self._finalize()
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/evaluate/module.py", line 390, in _finalize
self.data = Dataset(**reader.read_files([{"filename": f} for f in file_paths]))
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/arrow_reader.py", line 265, in read_files
pa_table = self._read_files(files, in_memory=in_memory)
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/arrow_reader.py", line 200, in _read_files
pa_table: Table = self._get_table_from_filename(f_dict, in_memory=in_memory)
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/arrow_reader.py", line 336, in _get_table_from_filename
table = ArrowReader.read_table(filename, in_memory=in_memory)
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/arrow_reader.py", line 357, in read_table
return table_cls.from_file(filename)
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/table.py", line 1059, in from_file
table = _memory_mapped_arrow_table_from_file(filename)
File "/home/ashok/miniconda3/envs/instructor_working/lib/python3.7/site-packages/datasets/table.py", line 66, in _memory_mapped_arrow_table_from_file
pa_table = opened_stream.read_all()
File "pyarrow/ipc.pxi", line 750, in pyarrow.lib.RecordBatchReader.read_all
File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: Expected to be able to read 1226936 bytes for message body, got 1226928
The modified MTEB package installation seems to be broken.
I request @hongjin-su or @Harry-hash to check this out.
I'm trying to evaluate the instructor model. When I follow the README.md to install the modified MTEB package, an OSError is thrown.
Steps to reproduce: Install InstructorEmbedding as described in the installation section.
Then follow the steps under MTEB installation. This leads to OSError: [Errno 2] No such file or directory: '/tmp/tmpni4r6sx4/output.json', as mentioned in issue https://github.com/HKUNLP/instructor-embedding/issues/20
I suspect the cause is the setup.py under MTEB (https://github.com/HKUNLP/instructor-embedding/blob/main/evaluation/MTEB/setup.py#L42C1-L42C1), which just exits without calling setup():
from setuptools import find_packages, setup
print(find_packages())
exit(0)
So I removed the exit(0) statement and tried
pip install -e .
and it installed successfully.
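For anyone who wants to apply the same workaround without editing the file by hand, here is a minimal sketch of the exit(0) removal described above. It only operates on a string copy of the three statements quoted in this issue, not on the real setup.py. The helper name `strip_top_level_exit` is my own and not part of MTEB or setuptools, and it assumes the `exit(...)` call sits on its own line.

```python
import ast


def strip_top_level_exit(source: str) -> str:
    """Return `source` without top-level `exit(...)` statements.

    Hypothetical helper for illustration; assumes each exit call
    has its line(s) to itself.
    """
    tree = ast.parse(source)  # parses only, never imports setuptools
    drop = set()
    for node in tree.body:
        is_exit_call = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
            and node.value.func.id == "exit"
        )
        if is_exit_call:
            # end_lineno is missing on Python 3.7, so fall back to lineno
            last = getattr(node, "end_lineno", None) or node.lineno
            drop.update(range(node.lineno, last + 1))
    kept = [
        line
        for number, line in enumerate(source.splitlines(), start=1)
        if number not in drop
    ]
    return "\n".join(kept) + "\n"


# The three statements quoted in this issue, one per line.
snippet = (
    "from setuptools import find_packages, setup\n"
    "print(find_packages())\n"
    "exit(0)\n"
)
patched = strip_top_level_exit(snippet)
print(patched)
```

With the early exit gone, whatever follows in the real setup.py (presumably the setup() call) gets a chance to run, which matches what I saw when `pip install -e .` succeeded.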