facebookresearch / contriever

Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning

MemoryError: std::bad_alloc #20

Open Xnhyacinth opened 12 months ago

Xnhyacinth commented 12 months ago

Hi @likethesky, when I run

```
python passage_retrieval.py \
    --model_name_or_path facebook/contriever \
    --passages psgs_w100.tsv \
    --passages_embeddings "contriever_embeddings/*" \
    --data nq_dir/test.json \
    --output_dir contriever_nq
```

to retrieve the top-100 passages, I get the error `MemoryError: std::bad_alloc`. How can I solve it?

The full error is as follows:

```
│    27 │   │   embeddings = embeddings.astype('float32')
│    28 │   │   if not self.index.is_trained:
│    29 │   │   │   self.index.train(embeddings)
│ ❱  30 │   │   self.index.add(embeddings)
│    31 │
│    32 │   │   print(f'Total data indexed {len(self.index_id_to_db_id)}')
│    33 │
│
│ /usr/local/anaconda3/envs/py39/lib/python3.9/site-packages/faiss/class_wrappers.py:230 in replacement_add
│
│   227 │   │   n, d = x.shape
│   228 │   │   assert d == self.d
│   229 │   │   x = np.ascontiguousarray(x, dtype='float32')
│ ❱ 230 │   │   self.add_c(n, swig_ptr(x))
│   231 │
│   232 │   def replacement_add_with_ids(self, x, ids):
│   233 │   │   """Adds vectors with arbitrary ids to the index (not all index
│
│ /usr/local/anaconda3/envs/py39/lib/python3.9/site-packages/faiss/swigfaiss_avx2.py:2331 in add
│
│   2328 │
│   2329 │   def add(self, n, x):
│   2330 │   │   r""" default add uses sa_encode"""
│ ❱ 2331 │   │   return _swigfaiss_avx2.IndexFlatCodes_add(self, n, x)
│   2332 │
│   2333 │   def reset(self):
│   2334 │   │   return _swigfaiss_avx2.IndexFlatCodes_reset(self)
╰──────────────────────────────────────────────────────────────────────────────╯
MemoryError: std::bad_alloc
```

Thanks a lot!
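For context, the `std::bad_alloc` is raised while FAISS copies the passage vectors into an exact in-RAM index, so the allocation grows with the corpus size. The sketch below is illustrative only and rests on assumptions not stated in the issue (roughly 21M passages in `psgs_w100.tsv`, 768-dimensional float32 Contriever embeddings); it estimates the memory a flat index would need and shows one common mitigation, a compressed IVF-PQ index filled in batches, which is not necessarily an option exposed by `passage_retrieval.py`.

```python
import numpy as np
import faiss

# Back-of-the-envelope memory estimate for an exact (flat) index, which keeps
# every vector in RAM. The figures are assumptions for illustration:
# psgs_w100.tsv contains roughly 21M passages and Contriever embeddings are
# 768-dimensional float32 vectors.
n_passages, dim = 21_000_000, 768
print(f"Flat index needs ~{n_passages * dim * 4 / 1024**3:.0f} GiB of RAM")

# One possible workaround (illustrative, not the repo's documented method):
# trade exactness for memory with an IVF-PQ index that stores 64-byte codes
# instead of full 3 KiB vectors, using inner-product scoring as Contriever does.
quantizer = faiss.IndexFlatIP(dim)
index = faiss.IndexIVFPQ(quantizer, dim, 4096, 64, 8, faiss.METRIC_INNER_PRODUCT)

# Random placeholder embeddings standing in for the real passage embeddings.
embeddings = np.random.rand(200_000, dim).astype("float32")

# Train once on a sample, then add in chunks so FAISS never has to make one
# allocation covering the whole corpus.
index.train(embeddings)
for start in range(0, len(embeddings), 50_000):
    index.add(embeddings[start:start + 50_000])
print(f"Indexed {index.ntotal} vectors")
```

The batched `add` avoids a single giant contiguous allocation, and product quantization cuts per-vector storage from about 3 KiB to 64 bytes at some cost in retrieval accuracy.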