run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

This model's maximum context length is 8192 tokens, however you requested 9113 tokens, invalid_request_error #16344

Open arun13ak opened 1 day ago

arun13ak commented 1 day ago

Question Validation

Question

@dosu I am using LlamaParseJsonNodeParser to parse documents, with a model that has an 8192-token context window. How can I split the documents into chunks? For example, this is my code to store the text in a document list; please give a sample example of chunking these documents and passing them to the LLM:

documents = []
for _, page in enumerate(json_list):
    documents.append(
        Document(
            text=page.get("text"),
            metadata=page,
        )
    )

node_parser = LlamaParseJsonNodeParser(
    llm=OpenAI(model="gpt-3.5-turbo"), num_workers=16, include_metadata=True
)
nodes = node_parser.get_nodes_from_documents(documents)

dosubot[bot] commented 1 day ago

To handle the issue of exceeding the model's maximum context length of 8192 tokens, you can split the documents into smaller chunks before processing them with the LlamaParseJsonNodeParser. Here's an example of how you can achieve this using the SemanticChunker class for effective chunking:

from llama_index.core import Document
from llama_index.packs.node_parser_semantic_chunking.base import SemanticChunker
from llama_index.embeddings.openai import OpenAIEmbedding

# Initialize the SemanticChunker with the desired settings
semantic_chunker = SemanticChunker(
    buffer_size=1,  # Number of sentences to include in each chunk
    embed_model=OpenAIEmbedding(),  # Use OpenAI embedding model
    breakpoint_percentile_threshold=95.0  # Percentile threshold for chunking
)

# List of documents to be chunked
documents = []
for _, page in enumerate(json_list):
    documents.append(
        Document(
            text=page.get("text"),
            metadata=page,
        )
    )

# Parse the documents into chunks
chunks = []
for doc in documents:
    chunks.extend(semantic_chunker.split_text(doc.text))

# `chunks` now contains the chunked documents ready to be passed to the LLM

This code uses the SemanticChunker to split the documents into semantically meaningful chunks, considering sentence boundaries and semantic similarity, which can then be passed to the LLM [1].
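Before sending chunks to the LLM, it can also help to sanity-check their sizes against the model's window. A minimal, stdlib-only sketch, assuming the rough heuristic of about four characters per token (a real count would use the model's tokenizer, e.g. tiktoken; `TOKEN_BUDGET`, `estimate_tokens`, and `oversized_chunks` are hypothetical names for illustration):

```python
# Rough pre-flight check that each chunk fits the context window.
# Assumes ~4 characters per token; a real count would use a tokenizer.

TOKEN_BUDGET = 8192  # model's maximum context length

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def oversized_chunks(chunks: list[str], budget: int = TOKEN_BUDGET) -> list[int]:
    """Return indices of chunks whose estimated token count exceeds the budget."""
    return [i for i, chunk in enumerate(chunks) if estimate_tokens(chunk) > budget]

chunks = ["short chunk", "x" * 40000]  # second chunk estimates to ~10000 tokens
print(oversized_chunks(chunks))  # prints [1]
```

Any chunk flagged here would need another splitting pass before it can be sent to the model.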

To continue talking to Dosu, mention @dosu.

logan-markewich commented 16 hours ago

You'll probably need to run a second pass of chunking to ensure some maximum chunk size

from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=2048, chunk_overlap=128)
json_node_parser = LlamaParseJsonNodeParser(
    llm=OpenAI(model="gpt-3.5-turbo"), num_workers=16, include_metadata=True
)
nodes = json_node_parser(documents)
nodes = splitter(nodes)

logan-markewich commented 16 hours ago

8192 tokens is the maximum input size for OpenAI embeddings.

The JSON node parser does not enforce a maximum size on its own.
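The second pass described above can be approximated without the library. A minimal stdlib sketch of size-capped splitting with overlap, assuming whitespace "tokens" (SentenceSplitter itself counts real model tokens and prefers sentence boundaries; `split_with_overlap` is a hypothetical name):

```python
# Second-pass splitter sketch: cap each chunk at max_tokens words,
# carrying `overlap` words between consecutive chunks.
# Whitespace tokenization is an assumption; the real SentenceSplitter
# counts model tokens and respects sentence boundaries.

def split_with_overlap(text: str, max_tokens: int, overlap: int) -> list[str]:
    words = text.split()
    if len(words) <= max_tokens:
        return [" ".join(words)] if words else []
    step = max_tokens - overlap  # advance leaves `overlap` words shared
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # last window already reached the end
    return chunks

text = " ".join(str(i) for i in range(10))
print(split_with_overlap(text, max_tokens=4, overlap=1))
# prints ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Each chunk stays within the cap, and the overlap preserves a little context across chunk boundaries, which is the same idea as `chunk_overlap=128` above.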