langchain-ai / chat-langchain
MIT License

ingest, embedding, faiss error #106

Open lewsutt29 opened 11 months ago

lewsutt29 commented 11 months ago

$ ./
--2023-07-29 15:07:25--
Resolving (,, 2606:4700::6811:2152, ...
Connecting to (||:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: [following]
--2023-07-29 15:07:25--
Resolving (,, 2606:4700::6811:2152, ...
Connecting to (||:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘’

[ <=> ] 1.57K --.-KB/s in 0s

2023-07-29 15:07:25 (30.1 MB/s) - ‘’ saved [1612]

FINISHED --2023-07-29 15:07:25--
Total wall clock time: 0.3s
Downloaded: 1 files, 1.6K in 0s (30.1 MB/s)
/home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/document_loaders/ GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 48 of the file /home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/document_loaders/ To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

  _ = BeautifulSoup(
/home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/document_loaders/ GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 75 of the file /home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/document_loaders/ To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

  soup = BeautifulSoup(data, **self.bs_kwargs)
Embeddings: client=<class 'openai.api_resources.embedding.Embedding'> model='text-embedding-ada-002' deployment='text-embedding-ada-002' openai_api_version='' openai_api_base='' openai_api_type='' openai_proxy='' embedding_ctx_length=8191 openai_api_key='...apikeyhere...' openai_organization='' allowed_special=set() disallowed_special='all' chunk_size=1000 max_retries=6 request_timeout=None headers=None tiktoken_model_name=None show_progress_bar=False model_kwargs={}
Traceback (most recent call last):
  File "", line 34, in <module>
    ingest_docs()
  File "", line 26, in ingest_docs
    vectorstore = FAISS.from_documents(documents, embeddings)
  File "/home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/vectorstores/", line 413, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/vectorstores/", line 578, in from_texts
    return cls.__from(
  File "/home/lichen/environments/langchain_documentation_chatbot/venv/lib/python3.8/site-packages/langchain/vectorstores/", line 522, in __from
    index = faiss.IndexFlatL2(len(embeddings[0]))
IndexError: list index out of range

sharrajesh commented 11 months ago


ionescofung commented 11 months ago


ailyfeng commented 11 months ago

me too

tymrtn commented 11 months ago

me three

East196 commented 10 months ago

the website is null
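
For anyone hitting the same traceback: the last frame evaluates `len(embeddings[0])`, which raises `IndexError` whenever the list of embeddings is empty. That would be the case if the ingest step loaded zero documents, which seems consistent with the download log above (a single 1.6K file). Below is a minimal sketch of a guard that fails early with a clearer message. It is not LangChain code: `embed_or_fail` and `fake_embed` are hypothetical names, and `fake_embed` only stands in for a real embeddings client so the example runs without any API key.

```python
from typing import Callable, List, Sequence


def embed_or_fail(
    texts: Sequence[str],
    embed_documents: Callable[[List[str]], List[List[float]]],
) -> List[List[float]]:
    """Embed texts, refusing to continue when there is nothing to index."""
    if not texts:
        # Assumed cause of the error in this thread: the scrape produced no pages.
        raise ValueError(
            "No documents were loaded; check that the download step "
            "actually fetched the site before building the index."
        )
    vectors = embed_documents(list(texts))
    if not vectors:
        raise ValueError("The embedding call returned no vectors.")
    return vectors


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for an embeddings client (one 2-d vector per text)."""
    return [[float(len(t)), 0.0] for t in texts]


if __name__ == "__main__":
    vectors = embed_or_fail(["hello", "hi"], fake_embed)
    print(len(vectors[0]))  # dimension that would be passed to the index

    try:
        embed_or_fail([], fake_embed)
    except ValueError as err:
        print(f"ingest stopped: {err}")
```

In the real ingest script the same check would go just before the `FAISS.from_documents(documents, embeddings)` call, testing whether `documents` is empty.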