capreolus-ir / capreolus

A toolkit for end-to-end neural ad hoc retrieval
https://capreolus.ai
Apache License 2.0

unsafe memory access error #151

Closed Tooba-ts1700550 closed 3 years ago

Tooba-ts1700550 commented 3 years ago

I'm running the KNRM reranker on msmarcopsg. Can someone help with this error?

Exception in thread "pool-2-thread-2" java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code

Thank you.

andrewyates commented 3 years ago

This is happening somewhere inside Anserini. Is this on Colab? Can you verify that Java version 11 is installed?
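
One way to check the Java version from a notebook cell is sketched below. It is only an illustration: the function names `parse_java_major` and `installed_java_major` are hypothetical and not part of Capreolus or Anserini, and the check only covers the `java` executable found on the PATH.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> int:
    """Run `java -version` and return the major version of the default JVM."""
    # `java -version` writes its report to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr)
```

Calling `installed_java_major()` should return 11 when Java 11 is the default. If `JAVA_HOME` points somewhere else, the JVM loaded from inside a Python process may differ from the one this reports.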

Tooba-ts1700550 commented 3 years ago

Yes, this is in Colab, and I checked the Java version: it is 11.0.10. The program continued running in other threads after these errors. However, it stops at random during the search phase on msmarcopsg, which takes a long time to search. I am not sure why; maybe Colab reaches a limit. This is why I'm trying to use the TPU.

andrewyates commented 3 years ago

Yeah, I would guess that you're hitting some kind of limit.

The search phase takes a long time on that branch, which is one of the reasons it isn't merged yet. You could avoid this phase entirely by using an existing BM25 run file instead of doing the search yourself. See here for example: https://github.com/capreolus-ir/capreolus/blob/master/capreolus/searcher/anserini.py#L270

Tooba-ts1700550 commented 3 years ago

I am getting an error when I run this using the existing BM25 run file:

!capreolus rerank.traineval with \
  reranker.trainer.tpuname="COLAB" reranker.trainer.tpuzone="COLAB" reranker.trainer.storage="gs://capreolus-bucket/cap-results/" \
  rank.searcher.index.stemmer=porter benchmark.name=msmarcopsg \
  rank.searcher.name=bm25staticrob04yang19 \
  rank.optimize=recall_1000 reranker.name=TFKNRM reranker.trainer.niters=2 optimize=P_20

Error: profane.exceptions.InvalidConfigError: received unknown config key: index

I think it cannot be used on msmarcopsg, since the description says benchmark: name = robust04.yang19, but I get the same error with robust04 as well.

Thank you very much for your help.

andrewyates commented 3 years ago

In this case you would remove rank.searcher.index.stemmer=porter, since the static searchers use a results file directly and don't need an index.

However, you're right that rank.searcher.name=bm25staticrob04yang19 is only compatible with robust04. To do it this way with msmarco, you would need to create a new static searcher yourself with a file containing first-stage retrieval results from msmarco. (For example, BM25 results on msmarco train, dev, and test.)
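
A minimal sketch of producing such a results file follows, assuming the static searcher expects a TREC-format run file with one line per retrieved document (`qid Q0 docid rank score tag`). The function name `write_trec_run` and the `{qid: {docid: score}}` input layout are hypothetical, chosen for illustration; this is not Capreolus code.

```python
from typing import Dict


def write_trec_run(
    results: Dict[str, Dict[str, float]], path: str, tag: str = "bm25"
) -> None:
    """Write {qid: {docid: score}} to `path` as a TREC-format run file."""
    with open(path, "w", encoding="utf-8") as out:
        for qid in sorted(results):
            # Rank documents by descending score; ranks start at 1.
            ranked = sorted(results[qid].items(), key=lambda kv: kv[1], reverse=True)
            for rank, (docid, score) in enumerate(ranked, start=1):
                out.write(f"{qid} Q0 {docid} {rank} {score:.6f} {tag}\n")
```

For msmarco, the input dictionary would hold the first-stage scores for the train, dev, and test queries, whichever tool produced them.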

Tooba-ts1700550 commented 3 years ago

I removed the stemmer, but I still get this error:

2021-04-14 11:28:30,658 - INFO - capreolus.task.rerank.train - Time to rank.search: 0.1716005802154541
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/capreolus/run.py", line 108, in <module>
    task_entry_function()
  File "/usr/local/lib/python3.7/dist-packages/capreolus/task/rerank.py", line 37, in traineval
    self.train()
  File "/usr/local/lib/python3.7/dist-packages/capreolus/task/rerank.py", line 48, in train
    rank_results = self.rank.evaluate()
  File "/usr/local/lib/python3.7/dist-packages/capreolus/task/rank.py", line 55, in evaluate
    self.get_results_path(), self.benchmark, primary_metric=self.config["optimize"], metrics=metrics
  File "/usr/local/lib/python3.7/dist-packages/capreolus/evaluator.py", line 155, in search_best_run
    dev_qrels = {qid: benchmark.qrels[qid] for qid in benchmark.non_nn_dev[fold_name]}
  File "/usr/local/lib/python3.7/dist-packages/capreolus/evaluator.py", line 155, in <dictcomp>
    dev_qrels = {qid: benchmark.qrels[qid] for qid in benchmark.non_nn_dev[fold_name]}
KeyError: '672'

How can I create a new static searcher? Can it only be created using Capreolus, or can it be created with other code such as Anserini? And where can I find the file for bm25staticrob04yang19 (for my reference)?

andrewyates commented 3 years ago

Right, that error is because the bm25staticrob04yang19 file is only for robust04. To create a new one, you just need to edit anserini.py with the path to the new run file. The run file can be generated using Anserini or any other tool that creates a TREC-format run file.
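
Before pointing anserini.py at a new run file, it may help to confirm that the file's query IDs line up with the benchmark's. The sketch below is only an illustration: `read_trec_run` and `qids_missing_from` are hypothetical helpers that work on plain dictionaries, not on Capreolus objects, and the run file is assumed to use the six-column TREC format.

```python
from typing import Dict, Iterable, List


def read_trec_run(path: str) -> Dict[str, Dict[str, float]]:
    """Load a TREC-format run file into {qid: {docid: score}}."""
    run: Dict[str, Dict[str, float]] = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            qid, _q0, docid, _rank, score, _tag = line.split()
            run.setdefault(qid, {})[docid] = float(score)
    return run


def qids_missing_from(qids: Iterable[str], mapping: Dict[str, object]) -> List[str]:
    """Return the sorted query IDs in `qids` that are not keys of `mapping`."""
    return sorted(qid for qid in set(qids) if qid not in mapping)
```

For example, `qids_missing_from(run.keys(), qrels)` lists run-file queries that have no relevance judgments, and `qids_missing_from(dev_qids, run)` lists dev queries with no retrieved documents. A non-empty result from either call points to a benchmark and run-file mismatch of the kind that can surface as a `KeyError` on a query ID.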