castorini / ragnarok

Retrieval-Augmented Generation battle!
Apache License 2.0

No segments* file found in MMapDirectory #7

Closed allenlin316 closed 2 months ago

allenlin316 commented 3 months ago

I followed the instructions in rag24.md but got an error when running the command below:

export ANSERINI_JAR=anserini-0.36.1-fatjar.jar
export OUTPUT_DIR="runs"
TOPICS=(rag24.raggy-dev rag24.researchy-dev)
for t in "${TOPICS[@]}"; do 
    java -cp $ANSERINI_JAR io.anserini.search.SearchCollection \
        -index msmarco-v2.1-doc-segmented \
        -topics $t \
        -output $OUTPUT_DIR/run.msmarco-v2.1-doc-segmented.bm25.${t}.txt \
        -threads 16 \
        -bm25 \
        -hits 100 \
        -outputRerankerRequests $OUTPUT_DIR/retrieve_results_msmarco-v2.1-doc-segmented.bm25.${t}_top100.jsonl &
done

The error message is shown below:

no segments* file found in MMapDirectory@/root/.cache/pyserini/indexes/lucene-inverted.msmarco-v2.1-doc-segmented.20240418.4f9675.6ec4cd595c9fe1ad91b43eabb39a637c lockFactory=org.apache.lucene.store.NativeFSLockFactory@74287ea3: files: [_1f_Lucene99_0.doc, _1f_Lucene99_0.tmd, write.lock]

I'm using OpenJDK 21, and as far as I can tell all of the segment files exist in that directory, so I still don't know why this happens. I hope someone can help. Thanks in advance!

lintool commented 3 months ago

Hi @allenlin316 - look in the index directory /root/.cache/pyserini/indexes/lucene-inverted.msmarco-v2.1-doc-segmented.20240418.4f9675.6ec4cd595c9fe1ad91b43eabb39a637c - do an ls... what does it show?

I suspect the download did not succeed...
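A minimal sketch of that check, assuming a POSIX shell. Lucene records each commit in a file named segments_N, and the "no segments* file found" error means no such file is present in the directory. The function name check_index is made up for this example and is not part of Anserini or Pyserini.

```shell
# Hypothetical helper: list an index directory and report whether it
# contains a Lucene commit file (segments_N).
check_index() {
    dir="$1"
    # Show what is actually on disk, including file sizes.
    ls -la "$dir"
    # The glob stays literal when nothing matches, so ls fails in that case.
    if ls "$dir"/segments_* >/dev/null 2>&1; then
        echo "OK: found a segments_* commit file"
    else
        echo "MISSING: no segments_* file; the download or extraction is probably incomplete"
        return 1
    fi
}
```

To use it, pass the index directory from the error message as the only argument, for example check_index "$INDEX_DIR" with INDEX_DIR set to that path.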

jayavanth commented 2 months ago

Was anyone able to fix this? I downloaded the index from scratch twice and am still getting this error.

lintool commented 2 months ago

@jayavanth do an ls on the directory and share the output please.

jayavanth commented 2 months ago

OK, will do in a bit. Downloading again.

jayavanth commented 2 months ago

OK, I figured out the issue: the index is being stored on a different volume that has run out of space. Is there a way to change the storage location to something other than ~/.cache?
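For anyone hitting the same thing, here is a rough sketch of how to see how much space is left on the volume that holds the cache. It assumes a POSIX df and awk; free_kb is a helper name invented for this example.

```shell
# Hypothetical helper: print the free space, in 1024-byte blocks, on the
# filesystem that contains the given path.
free_kb() {
    # -P forces the POSIX layout (one line per filesystem) and -k reports
    # 1024-byte blocks; column 4 of that layout is "Available".
    # Assumes the filesystem name itself contains no spaces.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

Running free_kb "$HOME/.cache" and comparing the result with the size of the index being downloaded shows whether the download can fit. Plain df -h "$HOME/.cache" also prints which volume the directory lives on.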

lintool commented 2 months ago

There's an outstanding issue for exactly this feature request: https://github.com/castorini/anserini/issues/2322

Unfortunately, we haven't had time to work on it...

The solution I use is to symlink ~/.cache to another location.
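Roughly, that workaround could look like the sketch below. It assumes a POSIX shell and that nothing is writing to the cache while it is being moved. The function name relocate_cache and any destination path are placeholders for illustration, not something Anserini or Pyserini provides.

```shell
# Hypothetical helper: move a cache directory to another volume and leave
# a symlink at the old location.
#   $1: current cache directory, e.g. "$HOME/.cache"
#   $2: new location on a volume with enough space (use an absolute path,
#       since a relative symlink target would resolve from the link's directory)
relocate_cache() {
    src="$1"
    dest="$2"
    if [ -L "$src" ]; then
        echo "$src is already a symlink; nothing to do" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists; refusing to overwrite it" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")"
    if [ -e "$src" ]; then
        # mv copies and then deletes when src and dest are on different volumes.
        mv "$src" "$dest"
    else
        mkdir -p "$dest"
    fi
    ln -s "$dest" "$src"
}
```

It would be called as relocate_cache "$HOME/.cache" followed by an absolute destination path on the larger volume. After that, anything written under ~/.cache lands on the new volume.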

Closing unless there's any further follow-up.