castorini / ragnarok

Retrieval-Augmented Generation battle!
Apache License 2.0

Cannot run the scripts #4

Closed. LouisDo2108 closed this issue 1 month ago.

LouisDo2108 commented 2 months ago

Hi, I was playing around with the code and could not run the scripts. The output is:

(ragnarok) $ python src/ragnarok/scripts/run_ragnarok.py  --model_path=command-r-plus  --topk=20 \
  --dataset=rag24.researchy-dev  --retrieval_method=bm25 --prompt_mode=cohere  \
  --context_size=8192 --max_output_tokens=1024 
Calling reranker API...
Loading candidates from retrieve_results/BM25/retrieve_results_rag24.researchy-dev_top20.jsonl.

Failed to load JSON file: retrieve_results/BM25/retrieve_results_rag24.researchy-dev_top100.jsonl
Traceback (most recent call last):
  File "/home/thuy0050/code/ragnarok/src/ragnarok/scripts/run_ragnarok.py", line 145, in <module>
    main(args)
  File "/home/thuy0050/code/ragnarok/src/ragnarok/scripts/run_ragnarok.py", line 54, in main
    _ = retrieve_and_generate(
  File "/home/thuy0050/code/ragnarok/src/ragnarok/retrieve_and_generate.py", line 119, in retrieve_and_generate
    requests = Retriever.from_dataset_with_prebuilt_index(
  File "/home/thuy0050/code/ragnarok/src/ragnarok/retrieve_and_rerank/retriever.py", line 111, in from_dataset_with_prebuilt_index
    return retriever.retrieve(k=k, cache_input_format=cache_input_format)
  File "/home/thuy0050/code/ragnarok/src/ragnarok/retrieve_and_rerank/retriever.py", line 168, in retrieve
    return retrieved_results
UnboundLocalError: local variable 'retrieved_results' referenced before assignment

Do I need to download any extra JSON files to run this script, or is this simply a bug? Thank you.

Best regards, Louis.

ronakice commented 2 months ago

Hey @LouisDo2108 - take a look here: https://github.com/castorini/ragnarok/blob/main/docs/rag24.md

I think it should resolve this issue. More generally, it describes our workflow: Multi-Stage Ranking -> Ragnarok.
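
For readers who hit the same traceback: the log shows a "Failed to load JSON file" message immediately followed by an UnboundLocalError on `retrieved_results`. That pattern usually appears when a loader reports a failed read and then falls through to a `return` of a variable that was never assigned. The sketch below reproduces the pattern in isolation and shows one way to surface the missing file directly. It is not the code in ragnarok's retriever.py; the function names `load_cached_swallowing` and `load_cached_strict` and the file name are hypothetical and exist only for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_cached_swallowing(path):
    """Hypothetical loader that reports a failed read and then falls through."""
    try:
        with open(path) as f:
            retrieved_results = [json.loads(line) for line in f if line.strip()]
    except (OSError, json.JSONDecodeError):
        print(f"Failed to load JSON file: {path}")
    # If the try block failed, retrieved_results was never assigned,
    # so this return raises UnboundLocalError.
    return retrieved_results


def load_cached_strict(path):
    """Hypothetical variant that reports a missing cache file as such."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Cached retrieval results not found: {path}. "
            "Produce them with the retrieval stage before running generation."
        )
    with path.open() as f:
        return [json.loads(line) for line in f if line.strip()]


first_error = None
second_error = None

with tempfile.TemporaryDirectory() as tmp:
    # A path that is guaranteed not to exist (hypothetical file name).
    missing = Path(tmp) / "retrieve_results_example.jsonl"

    try:
        load_cached_swallowing(missing)
    except UnboundLocalError as exc:
        first_error = type(exc).__name__

    try:
        load_cached_strict(missing)
    except FileNotFoundError as exc:
        second_error = type(exc).__name__

    # With a file present, the strict loader returns one dict per JSONL line.
    present = Path(tmp) / "present.jsonl"
    present.write_text('{"query": "q1"}\n{"query": "q2"}\n')
    loaded = load_cached_strict(present)

print(first_error, second_error, len(loaded))
```

Under these assumptions, the answer to the original question would be "both": the cached retrieval file is missing, and the error handling hides that behind a less informative exception. The linked rag24.md describes the intended workflow.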