Hi @aashay96, by default the chunk limit is based on davinci (4096 tokens). For other LLMs, at the moment, you have a few options:
1) You can manually specify chunk_size_limit when building the index, to split the text chunks in a way that fits the prompt limit. Note: as a rule of thumb, you should set chunk_size_limit to the maximum input limit of the LLM minus roughly 200 tokens, which leaves headroom for the prompt template (sketched below).
For instance,
index = GPTListIndex(documents, chunk_size_limit=256)
I have a TODO to automatically infer the chunk size limit depending on the LLM that you are using!
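A minimal sketch of the rule of thumb above, assuming a hypothetical model with a 2048-token input limit:
max_input_size = 2048  # hypothetical: your LLM's maximum input limit
chunk_size_limit = max_input_size - 200  # rule of thumb: leave ~200 tokens of headroom for the prompt
index = GPTListIndex(documents, chunk_size_limit=chunk_size_limit)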
2) You can manually define a PromptHelper (this is not exposed at all in the docs right now, so I'm just giving you a code example - I'll leave a TODO!). You set max_input_size to the maximum input limit of the LLM.
from gpt_index import GPTListIndex
from gpt_index.indices.prompt_helper import PromptHelper
....
max_input_size = 2048  # example value: set this to the maximum input limit of your LLM
num_output = 256  # tokens reserved for the model's output
max_chunk_overlap = 0  # token overlap between neighboring chunks
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
# pass the prompt helper into the index during construction
index = GPTListIndex(documents, prompt_helper=prompt_helper)
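As a quick usage sketch (assuming documents are already loaded), querying then works as usual; the helper uses these values to work out how much room remains for context chunks:
response = index.query("Summarize the documents.")
print(response)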
Let me know if either of these options works for you
Will try it out, thanks!
@aashay96 was this still an issue?
going to close for now
I run into the following error when using gpt2 from huggingface -
ValueError: Error raised by inference API: Input is too long for this model, shorten your input or use 'parameters': {'truncation': 'only_first'} to run the model only on the first part.
Can the index not be built chunk by chunk? Or am I missing something?
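For context, GPT-2's maximum input is 1024 tokens (versus davinci's 4096), so the default chunk sizing overflows it; the index does split documents chunk by chunk, but each chunk plus the prompt template must still fit in the model's window whenever the LLM is called. A minimal sketch of option 2 above with GPT-2's limit plugged in (the num_output value here is an illustrative choice, not a required one):
from gpt_index import GPTListIndex
from gpt_index.indices.prompt_helper import PromptHelper

max_input_size = 1024  # GPT-2's context window
num_output = 128  # illustrative: keep the reserved output small so chunks fit in the 1024-token budget
max_chunk_overlap = 0
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
index = GPTListIndex(documents, prompt_helper=prompt_helper)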