Closed Vivek-Kawathalkar closed 3 months ago
Hello @Vivek-Kawathalkar! 👋
I'm Dosu, your friendly bot helper. I'm here to assist you with bug squashing, answering questions, and guiding you on your journey to becoming an awesome contributor! While we wait for a human maintainer, feel free to reach out to me. I'm all ears (or circuits) for any issues you might have. Let's decode this together! 🤖
Based on the traceback you provided, the error originates in the `tiktoken` library, which is used by the `OpenAIEmbeddings` class in LangChain. Specifically, the error is raised when the `load_tiktoken_bpe` function in `tiktoken` tries to unpack a line of text into a token and a rank, but the line does not contain enough values.

This error occurs whenever a sequence is unpacked into more variables than it contains. In this case, `load_tiktoken_bpe` expects every line of the BPE file to contain both a token and a rank, but one or more lines contain only a token.
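To make the failure mode concrete, here is a standalone sketch of that parsing step (this is a simplified illustration, not `tiktoken`'s actual code — it skips the base64 decoding and mirrors only the unpacking logic):

```python
# Each line of a BPE ranks file should hold two fields: "<base64-token> <rank>".
good_contents = "IQ== 0\nIg== 1"
bad_contents = "IQ== 0\nIg=="  # second line is missing its rank


def parse_bpe(contents):
    # Mirrors the dict comprehension in load_tiktoken_bpe: unpack each
    # whitespace-split line into exactly (token, rank).
    return {
        token: int(rank)
        for token, rank in (line.split() for line in contents.splitlines() if line)
    }


print(parse_bpe(good_contents))  # {'IQ==': 0, 'Ig==': 1}

try:
    parse_bpe(bad_contents)
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 1)
```

A line with only one field makes `line.split()` return a single-element list, which cannot be unpacked into `(token, rank)` — producing exactly the error in the traceback. In practice this often indicates the cached BPE file was truncated or corrupted during download, though that is an assumption about this particular setup.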
Here are a few potential solutions based on similar issues that have been solved in the LangChain repository:
The error message "not enough values to unpack (expected 2, got 1)" typically occurs when you try to unpack a tuple or list of length 1 into two variables. One potential source of the error could be the `docsearch_db.similarity_search` function. It returns a list of `Document` objects, where each `Document` contains the text content of the document and its metadata. If any code is trying to unpack these `Document` objects into more variables than are available, that could be causing the error. To help you further, could you please provide the exact line of code where the error occurs? You can find this in the traceback that Python prints. Also, could you describe the `local_chain.predict` function — what it expects as input and what it returns? Source
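As a self-contained illustration of that unpacking pitfall (using a stand-in `Document` class, not LangChain's actual one):

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in for LangChain's Document: page text plus metadata."""
    page_content: str
    metadata: dict


# Suppose similarity_search returned a single match.
results = [Document(page_content="hello", metadata={"source": "a.docx"})]

# Buggy pattern: fixed-arity unpacking assumes exactly two results.
try:
    first, second = results
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 1)

# Safe pattern: iterate and read attributes, whatever the result count.
for doc in results:
    print(doc.page_content, doc.metadata)
```

The safe version works for zero, one, or many results, which is why iterating over search results is generally preferred over destructuring them.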
Reinstall the required libraries with the specific versions using the command:

```shell
pip install langchain==0.0.125 openai==0.27.2 chromadb==0.3.14 pypdf==3.7.0 tiktoken==0.3.3 gradio==3.23
```
The user resolved the issue by supplying embeddings with the 'Davinci' model. Source
If these solutions do not resolve your issue, please provide more information about the texts you are trying to embed and the version of `tiktoken` you are using. This will help us better understand the issue and provide a more accurate solution.
For more information about the `FAISS.from_documents` function and how it interacts with `OpenAIEmbeddings`, you can refer to the LangChain documentation.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
System Info
```
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\flask\app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\flask\app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\flask\app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\flask\app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "c:\Users\vivek\OneDrive\Desktop\SOPPOC\flask_app.py", line 43, in chat
    return RCXStreakanswer(input)
File "c:\Users\vivek\OneDrive\Desktop\SOPPOC\RCX_Streak.py", line 53, in RCXStreakanswer
    openAIEmbedd = FAISS.from_documents(texts, embeddings)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\langchain\schema\vectorstore.py", line 510, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\langchain\vectorstores\faiss.py", line 911, in from_texts
    embeddings = embedding.embed_documents(texts)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\langchain\embeddings\openai.py", line 549, in embed_documents
    return self._get_len_safe_embeddings(texts, engine=engine)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\langchain\embeddings\openai.py", line 392, in _get_len_safe_embeddings
    encoding = tiktoken.encoding_for_model(model_name)
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\tiktoken\model.py", line 97, in encoding_for_model
    return get_encoding(encoding_name_for_model(model_name))
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\tiktoken\registry.py", line 73, in get_encoding
    enc = Encoding(**constructor())
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\tiktoken_ext\openai_public.py", line 64, in cl100k_base
    mergeable_ranks = load_tiktoken_bpe(
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\tiktoken\load.py", line 124, in load_tiktoken_bpe
    return {
File "C:\Users\vivek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\tiktoken\load.py", line 126, in <dictcomp>
    for token, rank in (line.split() for line in contents.splitlines() if line)
ValueError: not enough values to unpack (expected 2, got 1)
```
Who can help?
No response
Information
Related Components
Reproduction
```python
loader = Docx2txtLoader(doc_path)
documents.extend(loader.load())
content = documents
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    separators=["\n\n", "\n", "."],
)
texts = text_splitter.split_documents(content)
print(texts)
print()
embeddings = OpenAIEmbeddings()
openAIEmbedd = FAISS.from_documents(texts, embeddings)
print(openAIEmbedd)
prompt_template = """Given the following context and a question, generate an answer. Based on user input extract only data for the given question from context. \
CONTEXT: {context}
```
Expected behavior
Should return an answer.