amaiya / onprem

A tool for running on-premises large language models with non-public data
https://amaiya.github.io/onprem
Apache License 2.0

Segment needs to accept arguments in extractor pipeline #70

Closed mzientek closed 3 months ago

mzientek commented 3 months ago

The apply method of the Extractor class accepts the unit and maxchars parameters but does not pass them to its call to segment, so all segmentation happens at the defaults (paragraph and 2048, respectively), regardless of what the caller supplies.
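To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the onprem source: the `segment` function below is a toy stand-in, and `apply_buggy` and `apply_fixed` are hypothetical names used only to contrast dropping the arguments with forwarding them. Only the parameter names (`unit`, `maxchars`) and the defaults (paragraph, 2048) come from the report.

```python
import re


def segment(text, unit="paragraph", maxchars=2048):
    """Toy stand-in for a segmenter (an assumption, not the library's code).

    Splits text into paragraphs or sentences, then cuts any piece longer
    than maxchars into fixed-size windows.
    """
    if unit == "paragraph":
        parts = re.split(r"\n\s*\n", text)
    elif unit == "sentence":
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unsupported unit: {unit!r}")

    chunks = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        # Break overly long pieces into windows of at most maxchars characters.
        chunks.extend(part[i:i + maxchars] for i in range(0, len(part), maxchars))
    return chunks


def apply_buggy(text, unit="paragraph", maxchars=2048):
    # Hypothetical reproduction of the reported behavior: unit and maxchars
    # are accepted but never reach segment, so the defaults always win.
    return segment(text)


def apply_fixed(text, unit="paragraph", maxchars=2048):
    # Forward the caller's settings explicitly.
    return segment(text, unit=unit, maxchars=maxchars)


text = "First sentence. Second sentence.\n\nThird sentence."
buggy = apply_buggy(text, unit="sentence")   # still split by paragraph
fixed = apply_fixed(text, unit="sentence")   # split by sentence as requested
```

In this sketch, `buggy` has two chunks (one per paragraph) while `fixed` has three (one per sentence), which is the kind of silent mismatch the report describes.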

amaiya commented 3 months ago

Thanks - should be fixed in 0.1.2.