Description
indexing step output
use_quick_index_mode False
reader_mode adobe
Using reader <kotaemon.loaders.adobe_loader.AdobeReader object at 0x15ff5dc30>
Got 0 page thumbnails
Adding documents to doc store
Getting embeddings for 66 nodes
Adding embeddings to vector store
indexing step took 6.741843223571777
chatting step output
User-id: 1, can see public conversations: True
Session reasoning type None
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x1616b26e0>, FSPath=PosixPath('/Users/zhangcheng/code/python/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x1616b2710>, get_extra_table=True, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x1722b8700>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x1722bbdf0>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x1722b8730>), mmr=True, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, userid=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset object at 0x1055aee90>, FSPath=<theflow.base.unset object at 0x1055aee90>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset object at 0x1055aee90>, VS=<theflow.base.unset_ object at 0x1055aee90>, file_ids=[], userid=<theflow.base.unset object at 0x1055aee90>)]
searching in doc_ids ['8e51f681-7544-4979-95c8-e423667a1107']
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters', 'mode', 'mmr_threshold'])
Got 0 from vectorstore
Got 0 from docstore
Cohere API key not found. Skipping rerankings.
Got raw 0 retrieved documents
thumbnail docs 0 non-thumbnail docs 0 raw-thumbnail docs 0
retrieval step took 1.1629290580749512
Got 0 retrieved documents
len (original) 0
Got 0 images
Trying LLM streaming
Got 0 cited docs
Reproduction steps
Screenshots
Logs
No response
Browsers
Chrome
OS
MacOS
Additional information
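With the same steps, a file indexed with PDFThumbnailReader can be searched correctly, but when the file is indexed with AdobeReader, no content can be retrieved. The logs above show that indexing embedded 66 nodes, yet retrieval for the same document returned 0 results from both the vector store and the doc store. Any suggestions?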
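To make the mismatch easier to see, here is a minimal sketch that pulls the relevant counts out of the log lines quoted in this report. The `summarize` helper is hypothetical and is not part of kotaemon; it only parses the pasted text and does not touch the actual stores.

```python
import re

# Log lines copied from the indexing and chat output in this report.
LOG = """\
Getting embeddings for 66 nodes
searching in doc_ids ['8e51f681-7544-4979-95c8-e423667a1107']
Got 0 from vectorstore
Got 0 from docstore
Got raw 0 retrieved documents
"""


def summarize(log: str) -> dict:
    """Extract the indexed-node count and the per-store retrieval counts."""
    counts = {}
    match = re.search(r"Getting embeddings for (\d+) nodes", log)
    if match:
        counts["indexed_nodes"] = int(match.group(1))
    # Lines of the form "Got <n> from <store>" report per-store hits.
    for n, store in re.findall(r"Got (\d+) from (\w+)", log):
        counts[store] = int(n)
    return counts


summary = summarize(LOG)
print(summary)
```

Running the same helper on the logs from a PDFThumbnailReader run would show whether the difference appears at indexing time or only at retrieval time.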