Thanks @mariusnita, sorry about the issue. I'll take a look soon. Is this only with the tree index? Does it work with the simple vector index?
I just tried, and I don't get any errors with the vector or list indexes.
BTW, I just noticed the vector index is 20x cheaper and faster to create, and seems to have much better question-answering performance than the tree index. (Although I was only able to create a partial tree index by discarding the failing chunks, so that may explain the poor performance.)
Yeah, it's a fair point; that's why SimpleVectorIndex is the default mode in the quickstart :)
I've found the tree index to be more effective at (1) summarization (through construction of the tree itself), and decently OK at (2) routing, though of course embeddings can be used for (2) as well.
Seeing the same error when using the code-davinci-002 model with the vector index:
```python
import gpt_index
import langchain

# docs: list of gpt_index.Document objects (constructed elsewhere)

# Use code-davinci-002 as the underlying LLM.
llm_predictor = gpt_index.LLMPredictor(
    llm=langchain.OpenAI(
        temperature=0,
        model_name="code-davinci-002",
    )
)
index = gpt_index.GPTSimpleVectorIndex(
    docs,
    llm_predictor=llm_predictor,
)
```

```
openai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 9549 tokens (9549 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.
```
@mariusnita do you have sample data to help me repro by any chance? Feel free to DM me in the Discord
For your information: I had the same error with GPTSimpleVectorIndex, and I was able to get around it by setting prompt_helper.
```
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4181 tokens (3925 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
```
```python
from gpt_index import GPTSimpleVectorIndex, PromptHelper

# Reserve room in the context window for the completion:
# max_input_size is the model's context size and num_output is
# the token budget set aside for the response.
max_input_size = 4096
num_output = 2000
max_chunk_overlap = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

index = GPTSimpleVectorIndex.load_from_disk(
    'index.json', prompt_helper=prompt_helper
)
```
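(As I understand it, num_output is the token budget PromptHelper reserves for the completion, so with max_input_size = 4096 and num_output = 2000 each prompt is capped at roughly 4096 - 2000 = 2096 tokens, which keeps the total request under the 4097-token limit from the error above.)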
Thanks @stanakaj. Yeah, given the max input size, this should be something gpt index handles under the hood. I'd be curious to see what the data is.
@jerryjliu This is likely a bad example because it's probably not useful to feed SVGs into gpt-index; nonetheless this causes gpt-index to crash:
https://www.roojs.org/roojs1/fonts/nunito/nunito-v16-latin-italic.svg
Example program:
```python
import gpt_index
import langchain

filename = "nunito-v16-latin-italic.svg"
with open(filename) as f:
    contents = f.read()

# Wrap the raw file contents in a single Document.
docs = [gpt_index.Document(contents)]
llm_predictor = gpt_index.LLMPredictor(
    llm=langchain.OpenAI(temperature=0, model_name="code-davinci-002")
)
index = gpt_index.GPTSimpleVectorIndex(docs, llm_predictor=llm_predictor)
```
The same file causes GPTTreeIndex to fail:
```python
import gpt_index

filename = "nunito-v16-latin-italic.svg"
with open(filename) as f:
    contents = f.read()

index = gpt_index.GPTTreeIndex(
    [gpt_index.Document(contents)],
)
```
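(A possible stopgap, in case it helps anyone hitting this: pre-split the contents into smaller Documents before indexing. Rough sketch; the 4000-character chunk size is an arbitrary guess, not a recommended value:)

```python
import gpt_index

filename = "nunito-v16-latin-italic.svg"
with open(filename) as f:
    contents = f.read()

# Naive fixed-size character chunks, just to keep each Document
# comfortably under the model's token limit.
chunk_chars = 4000
docs = [
    gpt_index.Document(contents[i : i + chunk_chars])
    for i in range(0, len(contents), chunk_chars)
]
index = gpt_index.GPTSimpleVectorIndex(docs)
```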
Thanks @mariusnita, taking a look now.
Hi @mariusnita, just a quick note. The PR I linked partially fixes the issue but does not completely fix it for your use case. This is because, at the moment, there's no way to appropriately pre-compute the number of tokens used for text-embedding-ada-002 (the tokenizer I use is not aligned with the tokenizer OpenAI uses): https://help.openai.com/en/articles/6824809-embeddings-frequently-asked-questions.
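(For anyone who wants to sanity-check how many tokens a chunk is from the embedding endpoint's perspective: if I'm reading the OpenAI docs right, text-embedding-ada-002 uses the cl100k_base encoding, which the tiktoken package can load. A minimal sketch, assuming tiktoken is installed:)

```python
import tiktoken

# cl100k_base is the encoding text-embedding-ada-002 uses.
enc = tiktoken.get_encoding("cl100k_base")

def num_tokens(text: str) -> int:
    """Count tokens the way the embedding endpoint will."""
    return len(enc.encode(text))

with open("nunito-v16-latin-italic.svg") as f:
    print(num_tokens(f.read()))
```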
In the meantime, for your specific use case, can you manually set chunk_size_limit=4096 (a smaller number)? e.g.
```python
index = GPTSimpleVectorIndex(docs, llm_predictor=llm_predictor, chunk_size_limit=4096)
```
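(Presumably any value comfortably below the model's context window works here; a smaller chunk_size_limit just produces more, shorter chunks at index-build time.)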
Thank you. #266 also fixes my reported error.
Seeing a bunch of errors coming back from the OpenAI API. I'm just building the index over a list of docs I created manually, which are just files, constructed as in the snippets above.