Large documents need to be chunked; otherwise, any tokens beyond the model's input limit are dropped and never processed.
MVP for the default word-based chunking strategy:
- Use a sliding window approach.
- Chunk into 200-word windows.
- Prefer splitting on whitespace or newlines where possible.
- Fall back to splitting on character length alone if needed (no searching for whitespace/newlines required, which also sidesteps languages that don't use whitespace); a sketch follows this list.
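A minimal sketch of this default strategy, assuming Python and a plain list-of-strings output. The `overlap` parameter and the 1,000-character fallback size are assumptions; the MVP above only fixes the 200-word window.

```python
def chunk_words(text, chunk_size=200, overlap=0):
    """Split text into ~chunk_size-word chunks using a sliding window.

    `overlap` (words shared between consecutive chunks) is an assumed knob;
    the MVP only specifies the 200-word window size.
    """
    words = text.split()  # splits on any whitespace, including newlines
    if not words:
        return []
    step = max(chunk_size - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def chunk_chars(text, chunk_size=1000):
    """Fallback: fixed character-length chunks for text with no usable whitespace."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def chunk(text, word_chunk_size=200, char_chunk_size=1000):
    """Prefer word-based chunking; fall back to character-based splitting
    when the text contains no whitespace at all (e.g. some CJK-only text)."""
    if any(ch.isspace() for ch in text):
        return chunk_words(text, word_chunk_size)
    return chunk_chars(text, char_chunk_size)
```

Note that rejoining words on single spaces normalizes the original whitespace; whether chunks must preserve exact offsets into the source text is left open here.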
MVP for configurable chunking settings:
- Allow users to provide a chunking configuration when creating an inference endpoint, selecting between word-based and sentence-based chunking strategies.
- Fall back to the default word-based chunking strategy above when no configuration is provided; see the sketch after this list.
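A sketch of how the configurable layer might dispatch between strategies, continuing in Python. The `ChunkingSettings` shape and its field names are illustrative assumptions, not a real API surface, and the sentence splitter is deliberately naive.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChunkingSettings:
    """Hypothetical endpoint-level chunking configuration; field names
    are illustrative only."""
    strategy: str = "word"      # "word" or "sentence"
    max_chunk_size: int = 200   # words per chunk for the word strategy


def default_word_chunks(text: str, size: int = 200) -> List[str]:
    """Default word-based strategy (mirrors the earlier sliding-window sketch)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def apply_chunking(text: str, settings: Optional[ChunkingSettings]) -> List[str]:
    # No settings, or an unrecognized strategy: fall back to the default
    # word-based strategy, per the MVP behavior above.
    if settings is None or settings.strategy not in ("word", "sentence"):
        return default_word_chunks(text)
    if settings.strategy == "sentence":
        # Naive sentence-boundary split for illustration only; a production
        # implementation would use a proper sentence-boundary detector
        # (e.g. ICU's BreakIterator).
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return default_word_chunks(text, settings.max_chunk_size)
```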
Post-MVP features for configurable chunking settings:
- Allow users to enable chunking when performing an inference call through the Inference API.
- Allow users to configure chunking as part of their ingestion pipeline; hypothetical payload shapes follow this list.
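Purely as a rough illustration of where these post-MVP options might attach: the payloads below sketch a per-request inference call and an ingest pipeline processor. Every chunking-related field here (`chunking_settings`, `strategy`, `max_chunk_size`) is an assumption, not the actual API.

```python
# Hypothetical per-request chunking option on an Inference API call.
inference_request = {
    "input": "a very long document ...",
    "chunking_settings": {"strategy": "sentence"},  # assumed field
}

# Hypothetical ingest pipeline configuration with chunking enabled on an
# inference processor; the chunking-related fields are illustrative only.
ingest_pipeline = {
    "processors": [
        {
            "inference": {
                "model_id": "my-embedding-endpoint",
                "chunking_settings": {"strategy": "word", "max_chunk_size": 200},
            }
        }
    ],
}
```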