segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
MIT License

use concrete error types #28

Closed. drahnr closed this issue 1 year ago

drahnr commented 3 years ago

There is a bit of an issue with the error bounds in Rust when being as lax as Box<dyn Error>: most error frameworks expect the error type to be bounded by Error + Send + 'static.
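
To illustrate the bound problem, here is a minimal std-only sketch. The names report, ParseFailure, lax and concrete are hypothetical and are not taken from this crate or from any error framework; report merely stands in for an API that asks for Error + Send + 'static.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for a framework function that requires a
/// sendable, owned error type.
fn report<E: Error + Send + 'static>(err: E) -> String {
    format!("error: {}", err)
}

/// Hypothetical concrete error type.
#[derive(Debug)]
struct ParseFailure;

impl fmt::Display for ParseFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not parse input")
    }
}

impl Error for ParseFailure {}

/// Lax signature: the concrete type is erased and `Send` is not promised.
fn lax() -> Result<(), Box<dyn Error>> {
    Err(Box::new(ParseFailure))
}

/// Concrete signature: callers know exactly which error type they receive.
fn concrete() -> Result<(), ParseFailure> {
    Err(ParseFailure)
}

fn demo() -> String {
    // The next line would be rejected by the compiler, because
    // `Box<dyn Error>` is not `Send` and does not itself implement `Error`:
    // report(lax().unwrap_err());

    // The concrete type satisfies the bound without any extra work.
    report(concrete().unwrap_err())
}
```

Calling demo() here should yield the formatted message from the concrete error, while the commented-out call shows the kind of use that the boxed variant rules out.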

For a library it's common to implement a custom error type which is then exposed to the user, which wraps all possible internal error types. Currently the tool of choice (imho) is thiserror.
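
As a rough sketch of what such a wrapper could look like, the following writes the error type out by hand using only std. The type SplitError, its variants, and the helpers parse_limit and assert_framework_compatible are hypothetical names for illustration, not this crate's actual API. With thiserror, the Display and From impls below would typically be generated by #[derive(Error)] together with #[error("...")] and #[from] attributes.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::num::ParseIntError;

/// Hypothetical public error type wrapping the internal error types.
#[derive(Debug)]
pub enum SplitError {
    /// An I/O failure, for example while loading a model file.
    Io(io::Error),
    /// A numeric setting that could not be parsed.
    InvalidNumber(ParseIntError),
}

impl fmt::Display for SplitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SplitError::Io(e) => write!(f, "I/O error: {}", e),
            SplitError::InvalidNumber(e) => write!(f, "invalid number: {}", e),
        }
    }
}

impl Error for SplitError {
    // Expose the wrapped error so callers can walk the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            SplitError::Io(e) => Some(e),
            SplitError::InvalidNumber(e) => Some(e),
        }
    }
}

// The `From` impls let `?` convert internal errors automatically.
impl From<io::Error> for SplitError {
    fn from(e: io::Error) -> Self {
        SplitError::Io(e)
    }
}

impl From<ParseIntError> for SplitError {
    fn from(e: ParseIntError) -> Self {
        SplitError::InvalidNumber(e)
    }
}

/// Hypothetical library function returning the concrete error type.
pub fn parse_limit(raw: &str) -> Result<usize, SplitError> {
    let limit = raw.trim().parse::<usize>()?;
    Ok(limit)
}

/// Compiles only for error types that meet the bound named in this issue.
pub fn assert_framework_compatible<E: Error + Send + 'static>() {}
```

Because both wrapped types are Send and 'static, assert_framework_compatible::<SplitError>() compiles, which is the property the boxed dyn Error signature does not provide.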

Moving to concrete error types rather than dyn boxes would be a much appreciated step.

bminixhofer commented 3 years ago

Thanks for this issue, I agree. I also think thiserror is the best choice.

By the way, just to make sure you know: nlprule does sentence boundary detection internally using srx now: https://github.com/bminixhofer/nlprule/pull/22

In any case this is a valid issue. PRs are welcome, I might also implement this myself.

drahnr commented 3 years ago

Thanks, I missed that PR. That simplifies life for me significantly :)