mmz-001 / knowledge_gpt

Accurate answers and instant citations for your documents.
https://knowledgegpt.streamlit.app/
MIT License
1.58k stars 729 forks

Error message #2

Closed abdcef closed 1 year ago

abdcef commented 1 year ago

I get this error message while trying to parse the pdf:

IndexError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).

mmz-001 commented 1 year ago

Can you confirm your PDF is parsable (i.e., the text can be copied)? KnowledgeGPT doesn't currently support scanned documents.

abdcef commented 1 year ago

Thanks, and yes they are parsable. I tried on a few of them and got the error…

mmz-001 commented 1 year ago

Here's the stack trace from the server logs:

Stack Trace

```bash
Traceback (most recent call last):
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 593, in get_or_create_cached_value
    return_value = _read_from_cache(
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 350, in _read_from_cache
    raise e
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 335, in _read_from_cache
    return _read_from_mem_cache(
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 252, in _read_from_mem_cache
    raise CacheKeyNotFoundError("Key not found in mem cache")
streamlit.runtime.legacy_caching.caching.CacheKeyNotFoundError: Key not found in mem cache

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/app/knowledge_gpt/knowledge_gpt/main.py", line 79, in <module>
    index = embed_docs(text)
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 627, in wrapped_func
    return get_or_create_cached_value()
  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 611, in get_or_create_cached_value
    return_value = non_optional_func(*args, **kwargs)
  File "/app/knowledge_gpt/knowledge_gpt/utils.py", line 97, in embed_docs
    index = FAISS.from_documents(docs, embeddings)
  File "/home/appuser/venv/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 62, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/home/appuser/venv/lib/python3.10/site-packages/langchain/vectorstores/faiss.py", line 192, in from_texts
    index = faiss.IndexFlatL2(len(embeddings[0]))
IndexError: list index out of range
```
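
The last frame fails on `embeddings[0]`, which raises `IndexError` when the list of embeddings is empty. That would most likely happen if no text chunks reached `FAISS.from_documents`, for example because nothing could be extracted from the upload. A minimal sketch of a guard that could run before indexing is below. It is only an illustration: `EmptyDocumentError`, `non_empty_chunks`, and `build_index_safely` are hypothetical names that do not exist in knowledge_gpt, and `build_index` stands in for whatever actually builds the index.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class EmptyDocumentError(ValueError):
    """Hypothetical error: raised when there is no text left to embed."""


def non_empty_chunks(chunks: Sequence[str]) -> List[str]:
    """Keep only chunks that contain at least one non-whitespace character."""
    return [chunk for chunk in chunks if chunk and chunk.strip()]


def build_index_safely(
    chunks: Sequence[str],
    build_index: Callable[[List[str]], T],
) -> T:
    """Call build_index only if there is something to embed.

    build_index is a placeholder for the real indexing call; it is an
    assumption of this sketch, not the project's API.
    """
    usable = non_empty_chunks(chunks)
    if not usable:
        # Fail early with a readable message instead of an IndexError
        # deep inside the vector store.
        raise EmptyDocumentError(
            "No extractable text found; the file may be scanned or empty."
        )
    return build_index(usable)
```

With a guard like this, an upload that yields no text would surface a readable message rather than the redacted `IndexError`.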

Can you link an example document so that I can repro this?

abdcef commented 1 year ago

It seems to work now 🤞 I uploaded a file and waited some minutes for the indexing to finish but it seems the file got dropped off. Will try again to see if the indexing completes.

mmz-001 commented 1 year ago

Closing this for now. If anything pops up, let me know.