segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
MIT License

2.1.1 treats \n as empty string #141

Open alexander-0000 opened 2 weeks ago

alexander-0000 commented 2 weeks ago

After updating to 2.1.1, '\n' is treated as an empty string ("").

markus583 commented 1 day ago

It should not be treated as an empty string, but as a space (" "). This is intended and can be controlled with the treat_newline_as_space flag. If you wish to restore the previous behavior, please set it to True. See here for the implementation.

If you observe any different behavior, please let us know.
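
To make the distinction discussed above concrete, here is a minimal, library-independent sketch of the difference between replacing '\n' with an empty string and replacing it with a space. The helper replace_newlines is hypothetical and is not part of wtpsplit; it only illustrates the two outcomes and does not reproduce the library's implementation.

```python
def replace_newlines(text: str, replacement: str) -> str:
    """Hypothetical helper (not a wtpsplit function): swap every newline for `replacement`."""
    return text.replace("\n", replacement)


sample = "First line\nSecond line"

# Newline replaced by a space: the two lines stay separated by whitespace.
as_space = replace_newlines(sample, " ")

# Newline replaced by an empty string: the two lines are glued together.
as_empty = replace_newlines(sample, "")

print(repr(as_space))  # 'First line Second line'
print(repr(as_empty))  # 'First lineSecond line'
```

Comparing the segmenter's output against these two strings is one way to check which of the two behaviors is actually being observed.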