stanford-oval / storm

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
http://storm.genie.stanford.edu
MIT License
13.34k stars 1.22k forks

IndexError: list index out of range #82

Closed yaojianchao closed 3 months ago

yaojianchao commented 3 months ago

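I am using a local corpus (the axciv_data.csv you provided), and for the LLM I replaced the original OpenAI interface with vLLM's OpenAI-compatible interface. The run parameters are: [image]. But it keeps failing with: [image]. After some investigation, I found that at knowledge_curation.py line 172, searched_results: List[StormInformation] = self.retriever.retrieve(list(set(queries)), exclude_urls=[ground_truth_url]) finds nothing, so searched_results=[]. Digging deeper, I found that at rm.py line 396, related_docs = self.qdrant.similarity_search_with_score(query, k=self.k) returns an empty related_docs. What could be causing this?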

tstanek390 commented 3 months ago

Traceback:
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/Users/admin/AI/storm/frontend/demo_light/storm.py", line 64, in <module>
    main()
  File "/Users/admin/AI/storm/frontend/demo_light/storm.py", line 60, in main
    CreateNewArticle.create_new_article_page()
  File "/Users/admin/AI/storm/frontend/demo_light/pages_util/CreateNewArticle.py", line 78, in create_new_article_page
    st.session_state["runner"].run(topic=st.session_state["page3_topic"], do_research=False,
  File "/Users/admin/AI/storm/src/storm_wiki/engine.py", line 305, in run
    draft_article = self.run_article_generation_module(outline=outline,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/AI/storm/src/interface.py", line 376, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/AI/storm/src/storm_wiki/engine.py", line 196, in run_article_generation_module
    draft_article = self.storm_article_generation.generate_article(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/AI/storm/src/storm_wiki/modules/article_generation.py", line 56, in generate_article
    information_table.prepare_table_for_retrieval()
  File "/Users/admin/AI/storm/src/storm_wiki/modules/storm_dataclass.py", line 161, in prepare_table_for_retrieval
    self.encoded_snippets = self.encoder.encode(self.collected_snippets, show_progress_bar=False)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 565, in encode
    if all_embeddings[0].dtype == torch.bfloat16:

AMMAS1 commented 3 months ago

I'm wondering if what's causing the error is that you are initializing an empty "./vector_store" without filling it with the documents in the csv file. Please try adding "--update-vector-store" as one of the arguments. This will tell the model to add the documents in the csv file to the offline vector store.

tstanek390 commented 3 months ago

I'm getting this error both when running the GUI with Streamlit and in the CLI. I tried removing and reinstalling PyTorch, Sentence Transformers, etc., but nothing works :(

Yucheng-Jiang commented 3 months ago

@tstanek390 could you open another issue if it’s not the same one as this issue?

tstanek390 commented 3 months ago

It is the same error as this one, coming from Sentence Transformers. Now I'm unable to run it using any of the options (local, GPT, Claude).

Yucheng-Jiang commented 3 months ago

Did you check whether it's caused by rm.py line 396 returning an empty result? Also, please check @AMMAS1's response above. Does that solve your problem?

tstanek390 commented 3 months ago

I tried running it with the --update-vector-store argument, but it didn't help. How would I know if rm.py returns an empty result? EDIT: The only option that actually works for me is this command:

python examples/run_storm_wiki_gpt_with_VectorRM.py \
    --output-dir /Users/admin/AI/storm \
    --vector-db-mode offline \
    --offline-vector-db-dir /Users/admin/AI/storm \
    --update-vector-store \
    --csv-file-path /Users/admin/Downloads/polished_literature_articles_detailed.csv \
    --do-research \
    --do-generate-outline \
    --do-generate-article \
    --do-polish-article

AMMAS1 commented 3 months ago

Awesome! Glad to know it worked in the end. Would you mind sharing your first command (the one with "--update-vector-store") that didn't work so we can analyze where things went wrong?

As for your question: if you're using your own data (that is, the VectorRM class from rm.py), rm.py should not return an empty result as long as the vector store is not empty and the search_top_k parameter is not 0 (0 is not the default value).

In order to check if your vector store has data, you can call VectorRM.get_vector_count(), and it should return the number of chunks (your documents after chunking them) in the vector store. If that is zero, then rm.py would return an empty list.
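
A minimal sketch of that check, assuming only that the retriever object exposes the get_vector_count() method mentioned above and that it returns an integer. The helper name require_populated_store and the stand-in class FakeVectorRM are hypothetical; they exist only so the snippet runs without a real vector store. In a real setup you would pass your own VectorRM instance instead.

```python
class FakeVectorRM:
    """Hypothetical stand-in for a retriever; it only mimics get_vector_count()."""

    def __init__(self, count: int):
        self._count = count

    def get_vector_count(self) -> int:
        return self._count


def require_populated_store(rm) -> int:
    """Return the chunk count, or raise a readable error if the store is empty."""
    count = rm.get_vector_count()
    if count == 0:
        raise RuntimeError(
            "Vector store is empty; rerun with --update-vector-store "
            "and --csv-file-path so the documents are ingested first."
        )
    return count


if __name__ == "__main__":
    # A populated store passes the check and reports its chunk count.
    print(require_populated_store(FakeVectorRM(42)))
    # An empty store fails early, before any retrieval or encoding happens.
    try:
        require_populated_store(FakeVectorRM(0))
    except RuntimeError as err:
        print(err)
```

Failing early like this turns a confusing IndexError deep inside the encoder into a message that points at the missing ingestion step.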

Hope that helps!

tstanek390 commented 3 months ago

I would like to use the GUI, though, but I'm getting the same error there too:

(storm) admin@teodor--macbook-pro demo_light % streamlit run storm.py

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://172.20.10.4:8501

root : ERROR : Error occurs when searching query What is a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query How is sentinel node biopsy used in colon cancer?: 'hits'
root : ERROR : Error occurs when searching query How does sentinel node biopsy help in the treatment of colon cancer?: 'hits'
root : ERROR : Error occurs when searching query What is sentinel node biopsy in colon cancer?: 'hits'
root : ERROR : Error occurs when searching query What is a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query What is sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query How does a sentinel node biopsy work?: 'hits'
root : ERROR : Error occurs when searching query How is sentinel node biopsy used in colon cancer treatment?: 'hits'
root : ERROR : Error occurs when searching query What is the purpose of sentinel node biopsy in colon cancer treatment?: 'hits'
root : ERROR : Error occurs when searching query What is the purpose of a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query What is the significance of a sentinel node biopsy in cancer treatment?: 'hits'
root : ERROR : Error occurs when searching query Why is sentinel node biopsy important in colon cancer?: 'hits'
root : ERROR : Error occurs when searching query Sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query Definition of sentinel node biopsy: 'hits'
root : ERROR : Error occurs when searching query sentinel lymph node identification: 'hits'
root : ERROR : Error occurs when searching query What is a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query What is a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query Sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query sentinel node biopsy colon cancer: 'hits'
root : ERROR : Error occurs when searching query How is a sentinel node biopsy performed?: 'hits'
root : ERROR : Error occurs when searching query Purpose of sentinel node biopsy in medical practice: 'hits'
root : ERROR : Error occurs when searching query colon cancer surgery: 'hits'
root : ERROR : Error occurs when searching query What is sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query What is the purpose of a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query Explanation of sentinel node biopsy: 'hits'
root : ERROR : Error occurs when searching query Sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query What is a sentinel node biopsy and how is it used in colon cancer diagnosis?: 'hits'
root : ERROR : Error occurs when searching query Comparative studies of sentinel node biopsy in colon cancer and other cancers: 'hits'
root : ERROR : Error occurs when searching query Definition of sentinel node biopsy: 'hits'
root : ERROR : Error occurs when searching query Benefits of sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query What are the benefits and risks of a sentinel node biopsy compared to other types of biopsies for colon cancer?: 'hits'
root : ERROR : Error occurs when searching query Sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query What is a sentinel node biopsy?: 'hits'
root : ERROR : Error occurs when searching query Procedure for sentinel node biopsy in colon cancer: 'hits'
root : ERROR : Error occurs when searching query How is a sentinel node biopsy performed in patients with colon cancer?: 'hits'
root : ERROR : Error occurs when searching query Current evidence supporting use of sentinel node biopsy in colon cancer: 'hits'
knowledge_storm.interface : INFO : run_knowledge_curation_module executed in 11.3001 seconds
knowledge_storm.interface : INFO : run_outline_generation_module executed in 12.6983 seconds
sentence_transformers.SentenceTransformer : INFO : Use pytorch device_name: mps
sentence_transformers.SentenceTransformer : INFO : Load pretrained SentenceTransformer: paraphrase-MiniLM-L6-v2
2024-07-18 11:57:34.601 Uncaught app exception
Traceback (most recent call last):
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/Users/admin/AI/storm/frontend/demo_light/storm.py", line 60, in <module>
    main()
  File "/Users/admin/AI/storm/frontend/demo_light/storm.py", line 56, in main
    CreateNewArticle.create_new_article_page()
  File "/Users/admin/AI/storm/frontend/demo_light/pages_util/CreateNewArticle.py", line 78, in create_new_article_page
    st.session_state["runner"].run(topic=st.session_state["page3_topic"], do_research=False,
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/knowledge_storm/storm_wiki/engine.py", line 315, in run
    draft_article = self.run_article_generation_module(outline=outline,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/knowledge_storm/interface.py", line 376, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/knowledge_storm/storm_wiki/engine.py", line 197, in run_article_generation_module
    draft_article = self.storm_article_generation.generate_article(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/knowledge_storm/storm_wiki/modules/article_generation.py", line 57, in generate_article
    information_table.prepare_table_for_retrieval()
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/knowledge_storm/storm_wiki/modules/storm_dataclass.py", line 162, in prepare_table_for_retrieval
    self.encoded_snippets = self.encoder.encode(self.collected_snippets, show_progress_bar=False)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/admin/miniforge3/envs/storm/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 565, in encode
    if all_embeddings[0].dtype == torch.bfloat16:
IndexError: list index out of range

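A minimal sketch of how the log lines and the traceback in this comment could be connected. It rests on assumptions: the bare 'hits' at the end of each log line looks like a stringified KeyError (a search response without a "hits" key), and if every search fails that way, no snippets are collected, so the encoder receives an empty list and indexing element 0 of an empty result raises IndexError. The names search_once, collect_snippets and encode_or_skip, as well as the response payloads, are hypothetical; no STORM or sentence-transformers code is used here.

```python
import logging


def search_once(response: dict) -> list:
    """Read snippets from a search response; raises KeyError if 'hits' is absent."""
    return [hit["snippet"] for hit in response["hits"]]


def collect_snippets(queries, responses) -> list:
    """Mimic a retrieval loop that logs each failure and carries on."""
    snippets = []
    for query, response in zip(queries, responses):
        try:
            snippets.extend(search_once(response))
        except Exception as err:
            # str(KeyError("hits")) is "'hits'", the same shape as the log lines.
            logging.error("Error occurs when searching query %s: %s", query, err)
    return snippets


def encode_or_skip(snippets, encode):
    """Guard: skip encoding entirely when nothing was collected."""
    if not snippets:
        return []
    return encode(snippets)


if __name__ == "__main__":
    queries = ["What is a sentinel node biopsy?"]
    responses = [{"error": "hypothetical payload without a hits key"}]
    snippets = collect_snippets(queries, responses)
    print(snippets)  # [] because the only search failed
    # Without the guard, an encoder that indexes its first result would fail here.
    print(encode_or_skip(snippets, encode=lambda items: [len(s) for s in items]))
```

If this reading is right, the IndexError is a downstream symptom, and the thing to investigate is why the search responses carry no "hits" (for example, the search backend configuration used by the GUI).
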
yaojianchao commented 3 months ago

@AMMAS1 Thank you. As you said, adding "--update-vector-store" solved the problem.