caikit / caikit-nlp

Apache License 2.0

:sparkles: Bidirectional streaming for regex sentence splitting #346

Open evaline-ju opened 4 months ago

evaline-ju commented 4 months ago

The regex sentence splitter is not very accurate, but we would like to provide an initial implementation of aggregation and splitting for bidirectional streaming. The use case is streamed text chunks/tokens that need to be aggregated into sentences for further sentence-level analysis.

For tracking purposes, the streamed output sentences remain directly concatenable, i.e. joining them in order should give back the streamed input text.
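
To illustrate the aggregation and splitting described above, here is a minimal, standalone sketch. It is not the code in this PR and does not use any caikit or caikit-nlp API: the function name `stream_sentences` and the boundary pattern `SENTENCE_END` are hypothetical, and the pattern (sentence-ending punctuation followed by whitespace) is only an assumed stand-in for whatever regex the module is configured with.

```python
import re
from typing import Iterable, Iterator

# Hypothetical boundary pattern (an assumption, not the module's actual regex):
# one or more sentence-ending punctuation marks followed by whitespace.
SENTENCE_END = re.compile(r"[.!?]+\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Aggregate streamed text chunks and yield sentences as they complete.

    Each yielded sentence keeps its trailing whitespace, so joining the
    outputs in order reproduces the concatenated input exactly.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        start = 0
        for match in SENTENCE_END.finditer(buffer):
            # A boundary that touches the end of the buffer may still grow
            # (more whitespace could arrive in the next chunk), so hold it back.
            if match.end() == len(buffer):
                break
            yield buffer[start:match.end()]
            start = match.end()
        # Keep only the unfinished tail for the next chunk.
        buffer = buffer[start:]
    # Flush whatever remains once the input stream ends.
    if buffer:
        yield buffer
```

Because the function is a generator over an iterable of chunks, it consumes an input stream and produces an output stream lazily. Keeping the trailing whitespace on each sentence, rather than stripping it, is what keeps the outputs directly concatenable in this sketch.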

Closes: https://github.com/caikit/caikit-nlp/issues/345