Closed delvinso closed 2 years ago
Hey @delvinso thanks for using pysbd.
Unfortunately, there is no specific documentation about modifying rules. There are many of them, and each rule is associated with some form of transformation whose output is taken as input by another rule.
To illustrate it further:
As you can see above, all of those operations need to be performed in that sequence because they are interrelated. The way they are structured is a design choice of https://github.com/diasks2/pragmatic_segmenter; I just ported them from Ruby to Python.
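To make the ordering point concrete, here is a toy sketch. It is not pysbd's actual rule set: the function names, the `<DOT>` placeholder, and the tiny abbreviation list are all made up for illustration. It only shows why a "protect" rule has to run before the splitting rule that consumes its output.

```python
import re

# Hypothetical placeholder; not necessarily the marker pysbd uses internally.
PLACEHOLDER = "<DOT>"


def protect_abbreviations(text):
    """Step 1: hide the period of a few known abbreviations."""
    return re.sub(r"\b(Dr|Mr|Mrs|vs)\.", lambda m: m.group(1) + PLACEHOLDER, text)


def split_on_boundaries(text):
    """Step 2: split after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def restore(sentences):
    """Step 3: put the protected periods back."""
    return [s.replace(PLACEHOLDER, ".") for s in sentences]


def segment(text):
    # The order matters: step 2 relies on step 1 having already run.
    return restore(split_on_boundaries(protect_abbreviations(text)))


text = "Dr. Smith saw the patient. He left."
print(segment(text))              # protected abbreviation stays in one sentence
print(split_on_boundaries(text))  # skipping step 1 splits after "Dr."
```

Changing or removing one step changes what every later step sees, which is why editing a single rule in isolation is hard to document.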
The way to tackle your edge cases would be to dive into the source code and see where your sentence is getting segmented wrongly.
The best way is to use the Python debugger and watch how your input text goes through the different transformations to produce clean sentences.
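As a starting point for that kind of debugging, here is a minimal sketch that lists which functions inside the installed `pysbd` package are entered while one input is segmented. `trace_calls` and `trace_pysbd` are hypothetical helper names, not part of pysbd; the only pysbd calls assumed are `pysbd.Segmenter(language="en", clean=False)` and its `segment` method.

```python
import os
import sys


def trace_calls(func, *args, path_hint, **kwargs):
    """Run func and record each Python function entered in files whose
    path contains path_hint. Returns (result, list_of_calls)."""
    calls = []

    def tracer(frame, event, arg):
        code = frame.f_code
        if event == "call" and path_hint in code.co_filename:
            calls.append(f"{os.path.basename(code.co_filename)}:{code.co_name}")
        return None  # no line-by-line tracing, only function entry

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return result, calls


def trace_pysbd(text):
    """Print the pysbd functions hit while segmenting text."""
    import pysbd  # third-party: pip install pysbd

    segmenter = pysbd.Segmenter(language="en", clean=False)
    sentences, calls = trace_calls(segmenter.segment, text, path_hint="pysbd")
    for call in calls:
        print(call)
    print(sentences)
    return calls
```

Calling `trace_pysbd("your problematic text here")` gives a list of candidate functions to put breakpoints in. To step through interactively instead, the standard library's `pdb.runcall(segmenter.segment, text)` starts the debugger at the first line of `segment`.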
Let me know if this helps
Closing the issue as there is no specific documentation for this.
Hi, apologies if this is documented. I've looked at current and past issues as well, and the only reference I could find is #90, but there doesn't seem to be an explanation. For reference, this is the original issue:
Are there any examples of how to modify the current rules in place? I'm looking to use this for clinical text and it seems to offer improvements over another, default implementation of sentence segmentation, particularly when it comes to handling lists.
Thanks!