
Llama3 context window is 8k but Langchain with Ollama shows "Token indices sequence length is longer than the specified maximum sequence length for this model (1916 > 1024). " #20967

Open joleeson opened 2 weeks ago

joleeson commented 2 weeks ago

Example Code

# `params` is a dict holding the model name, prompt templates, and token
# limits; it is defined elsewhere in the script.
from langchain_community.llms import Ollama
from langchain_core.prompts import PromptTemplate
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain_text_splitters import CharacterTextSplitter, RecursiveCharacterTextSplitter

llm = Ollama(model=params['model'], num_ctx=2048)  # , num_predict=1100

with open("./prompt.txt", encoding="utf-8") as fd:
    doc_text = fd.read()

result = llm.invoke(doc_text)
print(result)

''' Setup prompt chains '''

# Map
map_prompt = PromptTemplate.from_template(params['map_template'])
map_chain = LLMChain(llm=llm, prompt=map_prompt)

# Reduce
reduce_prompt = PromptTemplate.from_template(params['reduce_template'])

# Run chain
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_summaries"
)

# Combines and iteratively reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
    # This is the final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # If documents exceed context for `StuffDocumentsChain`
    collapse_documents_chain=combine_documents_chain,
    # The maximum number of tokens to group documents into.
    token_max=params['reduce_token_max'],
)

# Combining documents by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # The variable name in the llm_chain to put the documents in
    document_variable_name="docs",
    # Return the results of the map steps in the output
    return_intermediate_steps=True,
)

print("Running map reduce summarisation...")
result = map_reduce_chain.invoke(docs)
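
Note that docs is not constructed anywhere in the snippet above; presumably the imported text splitters produce it earlier in the script. A plausible reconstruction for context (the chunk sizes here are arbitrary placeholders, not the author's values):

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Hypothetical: split the raw text into Document chunks for the map step.
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
docs = splitter.create_documents([doc_text])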

Error Message and Stack Trace (if applicable)

Running map reduce summarisation...
Token indices sequence length is longer than the specified maximum sequence length for this model (1993 > 1024). Running this sequence through the model will result in indexing errors
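
This is the standard Hugging Face transformers warning for inputs longer than a tokenizer's model_max_length. LangChain's default token counter falls back to a GPT-2 tokenizer whose limit is 1024 tokens, which would match the number in the warning. A minimal check of that assumption (hypothetical, reusing the llm object from the example code):

# If the 1024 cap comes from LangChain's default token counter (a GPT-2
# tokenizer loaded via transformers) rather than from Ollama itself,
# counting tokens on a long string should reproduce the same warning.
long_text = "word " * 2000            # well over 1024 GPT-2 tokens
print(llm.get_num_tokens(long_text))  # expect the same warning here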

Description

I am running MapReduceDocumentsChain with Ollama and the "llama3" model.

Llama3 has a context window of 8k tokens, and Ollama is given the argument num_ctx=4096.

However, the warning indicates that a maximum sequence length of 1024 is being enforced somewhere, and it appears to happen inside the ReduceDocumentsChain. Why is this the case? Any solutions? Thank you.
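
One possible workaround, sketched under the assumption above (that the warning comes from the default token counter, not from Ollama itself): override get_num_tokens so that ReduceDocumentsChain's token budgeting no longer routes text through the 1024-token GPT-2 tokenizer. The 4-characters-per-token ratio below is a crude heuristic, not the real llama3 tokenizer:

from langchain_community.llms import Ollama

class OllamaWithLlama3Counter(Ollama):
    # Hypothetical subclass: replaces the default GPT-2-based token
    # counter with a rough length heuristic to avoid the warning.
    def get_num_tokens(self, text: str) -> int:
        # Approximation: about 4 characters per llama3 token.
        return max(1, len(text) // 4)

llm = OllamaWithLlama3Counter(model="llama3", num_ctx=4096)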

System Info

System Information

OS: Windows
OS Version: 10.0.19045
Python Version: 3.12.2 | packaged by Anaconda, Inc. | (main, Feb 27 2024, 17:28:07) [MSC v.1916 64 bit (AMD64)]

Package Information

langchain_core: 0.1.46
langchain: 0.1.16
langchain_community: 0.0.34
langsmith: 0.1.43
langchain_openai: 0.1.2
langchain_text_splitters: 0.0.1