Closed Justinius closed 1 week ago
I've modified one of the examples to use Ollama with the persona example, using an offline database as the vector store.
Could you provide an example for your failure case including your input and the output files you get? Based on the description, it's hard to tell whether it's because the LM you use is not following instructions successfully or because of some bug.
We just updated the bug report issue template. Please follow the template so we can assist with debugging more effectively. Appreciated!
template:

**Describe the bug** A clear and concise description of what the bug is.

**To Reproduce** Report the following things.

**Screenshots** If applicable, add screenshots to help explain your problem.

**Environment:**
**Describe the bug** Occasionally the process does not generate an article outline and throws an error. Given how STORM builds the article from the outline, this can greatly reduce the quality of the generated article; in this example it's not much of a problem given the toy nature of the data.
**To Reproduce** Attached is a zip folder containing the vector store used as well as the generated documents. I created a new example by merging the Ollama Wiki example (for the personas) and the VectorRM example (for using a local DB), so that STORM would use local PDFs and a locally hosted LLM. The LLM used was llama3.1:8b. I also wrote a separate script that just uses pyPDF to pull text from 6 PDFs about PSO that I downloaded from arXiv.
Input topic name: "Provide and overview of particle swarm optimization and possible improvements". There is a typo in the topic (I put "and" instead of "an"), but the LLM understood the request anyway.
Output files generated: psoTest.zip (also includes the vector DB).
**Environment**
- OS: Windows 10 (build 19045)
- RAM: 32 GB
- CPU: Intel Xeon W-2133, 6 cores @ 3.6 GHz (HT)
- GPU: P4000
I cloned the GitHub repo yesterday, so the source should be pretty up to date.
I should note that I replaced torch with a version compiled for my architecture and with CUDA support. I also increased the timeout in Ollama, since even with my GPU it was taking a while.
Command Prompt input:
`C:\pythonEnv\storm>python -m stormLocal --output-dir "C:\pythonEnv\storm\psoTest" --vector-db-mode "offline" --offline-vector-db-dir "c:\pythonEnv\storm\psoTest" --do-research --do-generate-outline --do-generate-article --do-polish-article`
stormLocal is my version of the examples where I basically merged the Ollama and VectorRM examples.
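For reference, the PDF-to-text step described above can be sketched roughly like this (a minimal sketch assuming the `pypdf` package; `extract_pdf_text` and `clean_text` are illustrative names, not part of STORM):

```python
import re


def clean_text(text: str) -> str:
    """Collapse the whitespace runs that PDF text extraction tends to produce."""
    return re.sub(r"\s+", " ", text).strip()


def extract_pdf_text(path: str) -> str:
    """Pull plain text from every page of a PDF."""
    # pypdf is imported lazily so clean_text above works even without it installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can return None for image-only pages; substitute "".
    return clean_text(" ".join((page.extract_text() or "") for page in reader.pages))
```

The cleaned text of each PDF can then be chunked and written to the vector store.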
Thanks for reporting! I looked into llm_call_history.jsonl and found that the generated outline does not align with the specified format, so no valid outline is displayed. Some pointers:

- You can search for `chatcmpl-da39a3ee5e6b4b0d3255bfef95601890afd80709` in `llm_call_history.jsonl` and see that the generated outline starts each section name with an integer followed by a dot, instead of using `#` as the valid outline section marker.
- Suggestion to solve this issue:
Hope it helps.
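Since a valid outline uses `#`-style markdown headings, one possible workaround (a post-processing sketch of my own, not something shipped with STORM) is to normalize a numbered outline before it is parsed:

```python
import re


def normalize_outline(raw: str) -> str:
    """Convert a numbered outline ("1.", "1.1.", ...) into '#'-style
    markdown section headings; nesting depth follows the numbering depth."""
    out = []
    for line in raw.splitlines():
        m = re.match(r"^\s*(\d+(?:\.\d+)*)\.?\s+(.+)$", line)
        if m:
            depth = m.group(1).count(".") + 1  # "1.1" -> depth 2 -> "##"
            out.append("#" * depth + " " + m.group(2).strip())
        elif line.strip():
            out.append(line)  # keep already-valid lines unchanged
    return "\n".join(out)
```

For example, `"1. Introduction\n1.1. History"` becomes `"# Introduction\n## History"`. A more robust fix is prompting a stronger instruction-following model, but this kind of normalization can rescue outlines from smaller local models.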
[Reminder] We will close this issue if there are no further questions by the end of this week.
From time to time I receive an error message: `root : ERROR : No outline for {topic}. Will directly search with the topic.`
I've modified one of the examples to use Ollama with the persona example, using an offline database as the vector store. I wrote an adapter to chunk other data into the Qdrant vector store directly, instead of having it read from the CSV first.
I rebased off the GitHub sources this morning and I'm having the same problem. Previously I thought it might be an issue with the file path including the topic, but I'm not sure that's the root cause, because I can delete the topic folder and rerun, and in some instances it will complete the outline.
The jsonl files for raw search and conversation are there, but it doesn't seem to completely generate the outline text file for some reason. Any help you could provide would be greatly appreciated.
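The chunking adapter mentioned above can be sketched roughly as follows (a minimal sketch: the chunk size, overlap, and the `content` payload field are assumptions for illustration, not STORM's required schema):

```python
from typing import Callable, List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> List[str]:
    """Split text into overlapping character windows."""
    chunks = []
    i = 0
    while i < len(text):
        chunks.append(text[i:i + chunk_size])
        if i + chunk_size >= len(text):
            break
        i += chunk_size - overlap
    return chunks


def upsert_chunks(db_path: str, collection: str, chunks: List[str],
                  embed: Callable[[str], List[float]], dim: int) -> None:
    """Write chunks into a local on-disk Qdrant collection.

    `embed` is any embedding function returning a vector of length `dim`.
    """
    # qdrant_client is imported lazily so chunk_text above runs without it.
    from qdrant_client import QdrantClient, models

    client = QdrantClient(path=db_path)  # local mode, no server needed
    if not client.collection_exists(collection):
        client.create_collection(
            collection_name=collection,
            vectors_config=models.VectorParams(size=dim,
                                               distance=models.Distance.COSINE),
        )
    client.upsert(
        collection_name=collection,
        points=[
            models.PointStruct(id=i, vector=embed(c), payload={"content": c})
            for i, c in enumerate(chunks)
        ],
    )
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.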