Hello,
For validation purposes I first ran the indexing on the TestFolder provided in the repo. After that I wanted to switch to the folder where I keep my own documents, so I ran the command python index.py "C:\sampleData\Sample documents"
However, judging by the log below, the script still indexes the original TestFolder and ignores the folder I passed on the command line. Could you please help with this?
I am using Python 3.11.7.
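For reference, this is roughly the behavior I expected from index.py. It is only a minimal sketch, assuming the script is meant to take the folder as its first positional argument and fall back to TestFolder otherwise. The names resolve_index_folder, list_documents and DEFAULT_FOLDER are hypothetical and not taken from the repo, and the real script may read the folder from somewhere else (for example a config file).

```python
import sys
from pathlib import Path

# Assumption: the folder used when no argument is given (hypothetical name).
DEFAULT_FOLDER = "TestFolder"


def resolve_index_folder(argv, default=DEFAULT_FOLDER):
    """Return the folder to index: the first CLI argument if present, else the default."""
    if len(argv) > 1:
        return Path(argv[1])
    return Path(default)


def list_documents(folder):
    """Recursively collect all files under the folder, sorted by path."""
    return sorted(str(p) for p in Path(folder).rglob("*") if p.is_file())


if __name__ == "__main__":
    folder = resolve_index_folder(sys.argv)
    print(f"Indexing folder: {folder}")
    for doc in list_documents(folder):
        print("indexing", doc)
```

With something like this in place, I would expect the log to list files under C:\sampleData\Sample documents instead of TestFolder.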
Thx,
Dejan
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\langchain_core_api\deprecation.py:119: LangChainDeprecationWarning: The class HuggingFaceEmbeddings was deprecated in LangChain 0.2.2 and will be removed in 0.3.0. An updated version of the class exists in the langchain-huggingface package and should be used instead. To use it run pip install -U langchain-huggingface and import as from langchain_huggingface import HuggingFaceEmbeddings.
warn_deprecated(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\Dejan\AppData\Roaming\Python\Python311\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Indexing...
indexing TestFolder\2201.01647v4.pdf
indexing TestFolder\D22 How_do_you_know_thatTeaching_Generative_Language_Models_to_Reference_Answers_to_Biomedical_Questions.pdf
indexing TestFolder\OSS_LLMs.pdf
indexing TestFolder\Review_Serbian_NLP.pdf
indexing TestFolder\subfolder\2305.04928v4.pdf
indexing TestFolder\subfolder\Algorithmic trading - MUTIS mag.docx
indexing TestFolder\subfolder\International Children.docx
indexing TestFolder\subfolder\SeraphimdroidEmail.txt
['TestFolder\2201.01647v4.pdf', 'TestFolder\D22 How_do_you_know_thatTeaching_Generative_Language_Models_to_Reference_Answers_to_Biomedical_Questions.pdf', 'TestFolder\OSS_LLMs.pdf', 'TestFolder\Review_Serbian_NLP.pdf', 'TestFolder\subfolder\2305.04928v4.pdf', 'TestFolder\subfolder\Algorithmic trading - MUTIS mag.docx', 'TestFolder\subfolder\International Children.docx', 'TestFolder\subfolder\SeraphimdroidEmail.txt']
Finished indexing!