drhedri1 closed this issue 1 year ago.
max_input_size should be an integer, and it should be set to at least 100 more than your chunk size limit and num_output.
For a model like davinci, the max input size is 4096 by default (and cannot be higher than that).
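For example, here is a minimal sketch of settings that satisfy that rule, using the PromptHelper arguments shown later in this thread; the exact numbers are only illustrative:

from gpt_index import PromptHelper

max_input_size = 4096     # davinci's context window; must be an int
num_output = 256          # tokens reserved for the response
chunk_size_limit = 600    # 600 + 256 + 100 is comfortably below 4096
max_chunk_overlap = 20

prompt_helper = PromptHelper(
    max_input_size, num_output, max_chunk_overlap, chunk_size_limit=chunk_size_limit
)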
Please help? Here is my code:

from gpt_index import SimpleDirectoryReader, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
import os
from IPython.display import Markdown, display

def construct_index(directory_path, api_key):
    max_input_size = 4096
    num_output = 100
    max_chunk_overlap = 20
    chunk_size_limit = 600

    llm_predictor = LLMPredictor(
        llm=OpenAI(
            temperature=0.5,
            model_name="text-davinci-003",
            max_tokens=num_output,
            openai_api_key=api_key,
        )
    )

    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap, chunk_size_limit=chunk_size_limit)

    documents = SimpleDirectoryReader(directory_path).load_data()

    index = GPTSimpleVectorIndex(
        documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
    )

    index.save_to_disk("index.json")
    return index

def ask_ai():
    index = GPTSimpleVectorIndex.load_from_disk("index.json")
    while True:
        query = input("What's going on, how can I help you? ")
        response = index.query(query, response_mode="compact")
        display(Markdown(f'Response: {response.response}'))

os.environ["Open_api_key"] = input('paste your api key here and hit enter: ')
construct_index('/Users/domenicrhedrick/Desktop/GPT_bot/context_data', os.environ["Open_api_key"])
The output is building an index but not allowing me to ask questions. Output:

INFO:gpt_index.token_counter.token_counter:> [build_index_from_documents] Total LLM token usage: 0 tokens
INFO:gpt_index.token_counter.token_counter:> [build_index_from_documents] Total embedding token usage: 943 tokens
@drhedri1 You'll want to pass in the llm_predictor and prompt_helper again when loading from disk to keep your settings:
index = GPTSimpleVectorIndex.load_from_disk(
"index.json", llm_predictor=llm_predictor, prompt_helper=prompt_helper
)
I hit the same issue. The chunk size calculation is wrong; it comes out as a negative number when the document is long enough.
It seems the index is putting in the whole document instead of the indexed text. @logan-markewich Any ideas?
Got it fixed; check whether your query string is too long.
BaseGPTIndex.__init__() got an unexpected keyword argument 'llm_predictor'
Running into this as well.
@alexzhang2015 @lxe The latest versions of llama-index made some changes. The LLM predictor now goes into a new ServiceContext object:
https://gpt-index.readthedocs.io/en/latest/guides/primer/usage_pattern.html#customizing-llm-s
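As a rough sketch of the newer pattern (using only imports and calls that appear elsewhere in this thread; the model name and data path are placeholders):

from langchain.chat_models import ChatOpenAI
from llama_index import (
    GPTVectorStoreIndex,
    LLMPredictor,
    ServiceContext,
    SimpleDirectoryReader,
)

# The LLM is wrapped in an LLMPredictor, which is handed to a ServiceContext
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
print(query_engine.query("What is in these documents?"))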
same issue
Same issue, are there any updates?
@ctemple @lxkaka there are a few issues in this thread. What's the current issue you are facing?
In site-packages\gpt_index\indices\prompt_helper.py, within def get_chunk_size_given_prompt(...), there is:

result = (self.max_input_size - num_prompt_tokens - self.num_output) // num_chunks
I usually set num_output=3000. After the error [ValueError: Got a larger chunk overlap (20) than chunk size (-nnn), should be smaller.] was raised, I changed to num_output=1500 and the error went away. My chunk_size_limit is 1000; if I want longer responses, maybe I should also reduce chunk_size_limit, to cut down num_prompt_tokens during the query.
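To make the arithmetic concrete (the prompt-token count below is a made-up illustrative value, not measured from this setup):

# Plugging example numbers into the formula quoted above.
max_input_size = 4096
num_prompt_tokens = 1200   # hypothetical size of the already-filled prompt template
num_chunks = 1

for num_output in (3000, 1500):
    chunk_size = (max_input_size - num_prompt_tokens - num_output) // num_chunks
    print(num_output, chunk_size)
# num_output=3000 -> chunk_size=-104, smaller than the overlap of 20 -> ValueError
# num_output=1500 -> chunk_size=1396, which is fine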
Latest versions of llama-index (v0.6.20) have simplified this process quite a bit
Feel free to re-open this issue if it is still happening, but going to close this for now.
Still happening reliably when using the MockLLMPredictor to measure output/develop faster, but I can't quite base-case what's happening.
Our basic workflow (skeleton code at the bottom of this comment) is:
1) Create a SimpleDirectoryReader on two folders, with GPTVectorStoreIndexes on top of them
2) Create a ComposableGraph over the indexes
3) Call graph.as_query_engine().query() with something that returns content for new files in those folders
4) Write the new content to those folders
5) Start over at step 1 and repeat a few times
We've added these two print lines in the code that instantiates TokenTextSplitter:
def get_text_splitter_given_prompt(
    self, prompt: Prompt, num_chunks: int = 1, padding: int = DEFAULT_PADDING
) -> TokenTextSplitter:
    """Get text splitter configured to maximally pack available context window,
    taking into account of given prompt, and desired number of chunks.
    """
    chunk_size = self._get_available_chunk_size(prompt, num_chunks, padding=padding)
    if chunk_size == 0:
        raise ValueError("Got 0 as available chunk size.")
    chunk_overlap = int(self.chunk_overlap_ratio * chunk_size)
    print(len(get_empty_prompt_txt(prompt)), num_chunks, padding, chunk_size)  # this
    print(self.chunk_overlap_ratio, chunk_overlap)  # and this
    text_splitter = TokenTextSplitter(
        separator=self._separator,
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        tokenizer=self._tokenizer,
    )
    return text_splitter
...and are getting the following output:
=== Iteration 1
6966 1 5 1624
0.1 162
6778 1 5 1667
0.1 166
6966 1 5 5719
0.1 571
6778 1 5 5762
0.1 576
=== Iteration 2
6721 1 5 1702
0.1 170
6533 1 5 1745
0.1 174
6721 1 5 1702
0.1 170
6533 1 5 1745
0.1 174
=== Iteration 3
10735 1 5 108
0.1 10
10547 1 5 151
0.1 15
11427 1 5 10
0.1 1
12022 1 5 -75
0.1 -7
We're not using any manual values for chunk size/count/input size/etc, just passing default prompts in. This also has not happened yet with a regular LLMPredictor. Any idea what could be going wrong? We'd like to be able to use the MockLLMPredictor to speed up dev work, but can just use the regular one if we need to.
The watered-down version of our code that I was using but wasn't able to base-case is:
import os
from dotenv import load_dotenv
load_dotenv()
# For creating the indexes
from langchain.chat_models import ChatOpenAI
from llama_index import (
StorageContext,
load_index_from_storage,
GPTVectorStoreIndex,
MockLLMPredictor,
LLMPredictor,
ServiceContext
)
# For creating the graph query engine
from llama_index.indices.composability import ComposableGraph
# For indexing documents
from llama_index.readers import Document
from llama_index import SimpleDirectoryReader
mock_llm_predictor = MockLLMPredictor()
service_context = ServiceContext.from_defaults(llm_predictor=mock_llm_predictor)
skip_extensions = [".pdf", ".docx", ".pptx", ".jpg", ".png", ".jpeg", ".mp3", ".mp4", ".csv", ".epub", ".md", ".mbox", ".ipynb", ".json"]
exclude = ["**/*" + ext for ext in skip_extensions]
exclude.append(".git/**/*")
subdirs = ['a', 'b']
# Logic more complicated than "range", but effectively...
for x in range(1, 3):
    indexes = {}
    top_dir = './repos/'

    for subdir in subdirs:
        docs = SimpleDirectoryReader(
            f"{top_dir}/{subdir}",
            exclude_hidden=False,
            exclude=exclude,
            recursive=True,
            file_metadata=lambda file: {"file_path": file}
        ).load_data()

        index = GPTVectorStoreIndex.from_documents(docs, service_context=service_context)
        indexes[subdir] = index

    # We don't persist this because it's tiny and doesn't take much time to generate
    graph = ComposableGraph.from_indices(
        GPTVectorStoreIndex,
        [indexes[subdir] for subdir in indexes],
        index_summaries=[f"Files in {subdir} dir" for subdir in indexes],
        service_context=service_context,
        root_id="root_id"
    )

    query_engine = graph.as_query_engine()
    result = query_engine.query('test?')

    for subdir in subdirs:
        with open(f"{top_dir}/{subdir}/new-file.txt", 'w') as f:
            f.write(result.response)
@ahwitz I ran for 10 iterations and found no error with your code. Are you able to share the data you were reading in initially?
Definitely can't share the source data, very likely can't share the source code.
I spent a bit more time basecasing and got a very reliable means of triggering the error, in the Python below. Notes on this:
Does MockLLMPredictor end up with different model limits than LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))? If so, that'd explain the bug from before.
Name this example.py and run python example.py. Note the lorem_text import at the top, and paragraphs(20) towards the bottom.
import os
from lorem_text import lorem
from dotenv import load_dotenv
load_dotenv()
# For creating the indexes
from llama_index import (
GPTVectorStoreIndex,
MockLLMPredictor,
LLMPredictor,
ServiceContext
)
# For indexing documents
from llama_index import SimpleDirectoryReader
mock_llm_predictor = MockLLMPredictor()
service_context = ServiceContext.from_defaults(llm_predictor=mock_llm_predictor)
subdirs = ['jerryjliu/llama_index']
# Logic more complicated than "range", but effectively...
for x in range(1, 3):
    indexes = {}
    top_dir = './repos/'

    for subdir in subdirs:
        docs = SimpleDirectoryReader(
            ".",
            input_files=["example.py"],
            file_metadata=lambda file: {"file_path": file}
        ).load_data()

        index = GPTVectorStoreIndex.from_documents(docs, service_context=service_context)
        indexes[subdir] = index

        query_engine = indexes[subdir].as_query_engine()
        result = query_engine.query(lorem.paragraphs(20))
        print(result)
A very easy (and, seemingly, reliable) way to trigger this error is:
model = 'gpt-4'
model_max_tokens = BaseOpenAI.modelname_to_contextsize(model)
llm = ChatOpenAI(
temperature=0,
model_name=model,
max_tokens=model_max_tokens
)
service_context = ServiceContext.from_defaults(
chunk_size=model_max_tokens
)
...because, for that combination, in llama_index.indices.prompt_helper.py:
def _get_available_context_size(self, prompt: Prompt) -> int:
    """Get available context size.

    This is calculated as:
        available context window = total context window
            - input (partially filled prompt)
            - output (room reserved for response)
    """
    empty_prompt_txt = get_empty_prompt_txt(prompt)
    prompt_tokens = self._tokenizer(empty_prompt_txt)
    num_prompt_tokens = len(prompt_tokens)
    print(self.context_window, num_prompt_tokens, self.num_output)
    return self.context_window - num_prompt_tokens - self.num_output
...self.context_window - self.num_output will always be 0 that way, and a sufficiently large prompt should trigger negatives.
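Concretely, assuming gpt-4's 8192-token context window and an illustrative prompt-template size (the 300 is made up):

context_window = 8192     # gpt-4's total context window
num_output = 8192         # max_tokens was set to the full context window above
num_prompt_tokens = 300   # hypothetical token count of the empty prompt template

available = context_window - num_prompt_tokens - num_output
print(available)          # -300: negative, so the chunk-overlap ValueError fires downstream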
Yea, but that's more user error at that point, rather than some error with our text chunking 🤔
Although I guess the error could maybe be more descriptive if possible? Hmm
Yeah, the error message is definitely the problem here. "Check your chunk size settings" is a lot easier to debug than the sorta-generic ValueError, but I'm still not convinced that I've found the only way to trigger it, and I don't know if there's a reliable way to snip off this set of error cases and handle them upstream.
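Something like this hypothetical guard (not the library's actual code, just a sketch of the kind of message I mean) would have made the failure obvious:

def check_available_chunk_size(chunk_size: int) -> None:
    # Hypothetical replacement for the generic overlap error.
    if chunk_size <= 0:
        raise ValueError(
            f"Computed available chunk size is {chunk_size} (<= 0). "
            "Check your chunk size settings: the context window minus the prompt's "
            "tokens minus num_output must leave room for at least one chunk."
        )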
A lot of our problems right now are dealing with trying to maximize the context window, so I'm not surprised we keep butting into this now that I know what the situation is.
Hi, @drhedri1! I'm Dosu, and I'm helping the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you encountered a ValueError when running the code, which indicated a larger chunk overlap than the specified chunk size. In the comments, there were suggestions to increase the max input size, pass in the llm_predictor and prompt_helper again when loading from disk, and check the query string length. It seems that you followed these suggestions and were able to resolve the issue.
Before we close this issue, we wanted to check if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution to the LlamaIndex repository!
I continue to get this issue on my code:

def construct_index(directory_path, api_key):
    max_input_size = 0.4
    num_output = 100
    max_chunk_overlap = 20
    chunk_size_limit = 600
Please help!!