julien-nc opened 1 week ago
I'm looking at the OpenAI API reference and I can't find any information regarding splitting an input prompt into chunks. If the prompt is longer than `context_size - max_output_tokens`, is it possible that the API automatically chunks the input for us?
Here's the documentation for the endpoints used by the summary provider:
- https://platform.openai.com/docs/api-reference/chat/create
- https://platform.openai.com/docs/api-reference/completions/create
Let's implement chunking in the same way it was done in LLM2, so that we can summarize texts that are longer than the model's context size.
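For reference, here's a minimal sketch of what that could look like: split the input into token-budgeted chunks, summarize each chunk, then summarize the concatenated partial summaries. This assumes a tiktoken-based splitter and the `openai` Python client; LLM2's exact strategy may differ, and `CONTEXT_SIZE`, `MAX_OUTPUT_TOKENS`, and `PROMPT_OVERHEAD` are illustrative values, not taken from the app.

```python
# Sketch only: recursive map-reduce summarization for inputs longer than
# the context window. Constants below are illustrative assumptions.
import tiktoken
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-3.5-turbo"   # assumed model
CONTEXT_SIZE = 4096       # model context window, in tokens
MAX_OUTPUT_TOKENS = 512   # tokens reserved for the completion
PROMPT_OVERHEAD = 64      # rough allowance for the instruction text

def split_into_chunks(text: str) -> list[str]:
    """Split text into chunks that fit within context_size - max_output_tokens."""
    enc = tiktoken.get_encoding("cl100k_base")
    budget = CONTEXT_SIZE - MAX_OUTPUT_TOKENS - PROMPT_OVERHEAD
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + budget]) for i in range(0, len(tokens), budget)]

def summarize(text: str) -> str:
    """Summarize each chunk, then recursively summarize the partial summaries."""
    chunks = split_into_chunks(text)
    partial = []
    for chunk in chunks:
        resp = client.chat.completions.create(
            model=MODEL,
            max_tokens=MAX_OUTPUT_TOKENS,
            messages=[{
                "role": "user",
                "content": f"Summarize the following text:\n\n{chunk}",
            }],
        )
        partial.append(resp.choices[0].message.content)
    combined = "\n".join(partial)
    # If the input needed more than one chunk, the joined partial summaries
    # may still be long, so run another summarization pass over them.
    if len(chunks) > 1:
        return summarize(combined)
    return combined
```

The recursion terminates because each pass shrinks the text to at most `MAX_OUTPUT_TOKENS` per chunk; once everything fits in a single chunk, one final call produces the summary.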