mediacloud / sentence-splitter

Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.

Are these two sources in sync? #3

Closed umeshksingla closed 5 years ago

umeshksingla commented 5 years ago

There is another sentence splitter in the moses-smt/mosesdecoder repository, based on the preprocessor by Philipp Koehn, and this package also claims to be based on a heuristic algorithm by the same author.

Is the sentence-splitting logic in this Python package in sync with the other one? How do I decide which one to use?

pypt commented 5 years ago

Well, for one thing, one is written in Perl, the other in Python :)

As per the README, this module is a Python rewrite of Perl's Lingua::Sentence with some additions, and Lingua::Sentence is based on Koehn's logic itself.

The final choice is up to you!
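
For anyone comparing the two, the general idea shared by this family of splitters can be sketched roughly as below: break after sentence-ending punctuation followed by whitespace and an uppercase letter, unless the word before the period is a known abbreviation ("non-breaking prefix"). This is only a simplified illustration, not code from either project. The `NONBREAKING_PREFIXES` set and the `split_sentences` function are hypothetical names, and the real implementations rely on per-language prefix files and additional rules.

```python
import re

# Hypothetical, tiny prefix list for illustration only;
# the real projects ship much larger per-language prefix files.
NONBREAKING_PREFIXES = {"Mr", "Mrs", "Dr", "Prof", "e.g", "i.e"}

# Candidate boundary: '.', '!' or '?' followed by whitespace and an uppercase letter.
_BOUNDARY = re.compile(r"([.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split text at candidate boundaries, skipping periods after known prefixes."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        if match.group(1) == ".":
            # Last whitespace-delimited token before the period.
            preceding = text[start:match.start()].split()
            last_word = preceding[-1] if preceding else ""
            if last_word in NONBREAKING_PREFIXES:
                continue  # treated as an abbreviation, not a sentence end
        # Keep the punctuation with the sentence, drop the whitespace after it.
        sentences.append(text[start:match.end(1)].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("Dr. Smith arrived. He was late."):
        print(sentence)
```

Running the same sample text through both the Perl script and the Python package and comparing the output is probably the most direct way to see where the two differ for your language.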