Currently, the Inference API uses a word-based chunker that cannot be configured by the user. This change would allow users calling the Inference API to:
Set and configure a chunking strategy when creating an inference endpoint. This chunking strategy will be used by default when performing inferences.
Provide a chunking strategy when performing an inference to override the inference endpoint's configured chunking strategy.
If no chunking strategy is provided at either level, the API will continue to function as it does today, using the existing word-based chunking strategy.
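
The precedence described above (per-request override, then the endpoint's configured strategy, then the existing word-based default) could be sketched roughly as follows. This is only an illustration of the fallback order, not the actual implementation: `ChunkingSettings`, `resolve_chunking`, the field names, and the default value of 250 are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChunkingSettings:
    # Hypothetical settings object; field names are illustrative only.
    strategy: str
    max_chunk_size: int


# Assumed stand-in for the existing word-based default; the real values may differ.
DEFAULT_CHUNKING = ChunkingSettings(strategy="word", max_chunk_size=250)


def resolve_chunking(
    request_settings: Optional[ChunkingSettings],
    endpoint_settings: Optional[ChunkingSettings],
) -> ChunkingSettings:
    """Pick the chunking settings for a single inference call."""
    # 1. A strategy supplied with the inference request overrides everything else.
    if request_settings is not None:
        return request_settings
    # 2. Otherwise use the strategy configured when the endpoint was created.
    if endpoint_settings is not None:
        return endpoint_settings
    # 3. With nothing configured, keep today's word-based behavior.
    return DEFAULT_CHUNKING
```

Resolving the settings once per request in a single function keeps the fallback order in one place, so existing endpoints with no chunking configuration would behave exactly as before.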