nikolamilosevic86 / local-genAI-search

Local-GenAI-Search is a generative search engine based on Llama 3, langchain and qdrant that answers questions based on your local files
GNU General Public License v3.0

Cannot change document folder for indexing #2

Closed by dejankocic 5 months ago

dejankocic commented 5 months ago

Hello,

For validation purposes, I first ran the indexing on the TestFolder provided in the repo. After that, I wanted to switch to the folder where I keep my own documents, so I ran the command python index.py "C:\sampleData\Sample documents". However, judging by the log below, the script still indexes TestFolder and ignores the new folder. Could you please help with this?

I am using Python 3.11.7.

Thx, Dejan

C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\langchain_core\_api\deprecation.py:119: LangChainDeprecationWarning: The class HuggingFaceEmbeddings was deprecated in LangChain 0.2.2 and will be removed in 0.3.0. An updated version of the class exists in the langchain-huggingface package and should be used instead. To use it run pip install -U langchain-huggingface and import as from langchain_huggingface import HuggingFaceEmbeddings.
  warn_deprecated(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
Indexing...
indexing TestFolder\2201.01647v4.pdf
indexing TestFolder\D22 How_do_you_know_thatTeaching_Generative_Language_Models_to_Reference_Answers_to_Biomedical_Questions.pdf
indexing TestFolder\OSS_LLMs.pdf
indexing TestFolder\Review_Serbian_NLP.pdf
indexing TestFolder\subfolder\2305.04928v4.pdf
indexing TestFolder\subfolder\Algorithmic trading - MUTIS mag.docx
indexing TestFolder\subfolder\International Children.docx
indexing TestFolder\subfolder\SeraphimdroidEmail.txt
['TestFolder\2201.01647v4.pdf', 'TestFolder\D22 How_do_you_know_thatTeaching_Generative_Language_Models_to_Reference_Answers_to_Biomedical_Questions.pdf', 'TestFolder\OSS_LLMs.pdf', 'TestFolder\Review_Serbian_NLP.pdf', 'TestFolder\subfolder\2305.04928v4.pdf', 'TestFolder\subfolder\Algorithmic trading - MUTIS mag.docx', 'TestFolder\subfolder\International Children.docx', 'TestFolder\subfolder\SeraphimdroidEmail.txt']
Finished indexing!

nikolamilosevic86 commented 5 months ago

You are right, there was a hardcoded path left in index.py. It has now been changed to use the folder passed as the command-line parameter.
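For readers who hit the same symptom in their own copy, here is a minimal sketch of what reading the folder from the command line could look like. It is not the actual index.py from the repo: the helper names resolve_index_folder, list_files and main, and the fallback to "TestFolder", are assumptions made for illustration only, and the real script also builds embeddings, which is left out here.

```python
import os
import sys


def resolve_index_folder(argv, default="TestFolder"):
    """Return the folder to index: the first CLI argument if present,
    otherwise a default (hypothetical fallback, not taken from the repo)."""
    if len(argv) > 1:
        return argv[1]
    return default


def list_files(folder):
    """Recursively collect the paths of all files under `folder`."""
    paths = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            paths.append(os.path.join(root, name))
    return paths


def main(argv):
    """Print the files that would be indexed and return them as a list."""
    folder = resolve_index_folder(argv)
    if not os.path.isdir(folder):
        # Report a wrong path instead of silently indexing nothing.
        print(f"Not a directory: {folder}")
        return []
    print("Indexing...")
    files = list_files(folder)
    for path in files:
        print("indexing", path)
    print("Finished indexing!")
    return files


if __name__ == "__main__":
    main(sys.argv)
```

With a script shaped like this, python index.py "C:\sampleData\Sample documents" would list files from that folder. The quotes matter on Windows because the path contains a space.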

dejankocic commented 5 months ago

@nikolamilosevic86 thx.