With the following text as input, Splitta does not detect a sentence boundary
after sentences that end in a question mark.
input = "This is the tale of Mr. Morton. Who is Mr. Morton? He is the subject
of our tale, and the predicate tells what Mr. Morton must do. Here's a short
sentence. Mister Morton is who?\nHere's another short sentence."
The resulting split lines are the following:
This is the tale of Mr. Morton.
Who is Mr. Morton? He is the subject of our tale, and the predicate tells what
Mr. Morton must do.
Here's a short sentence.
Mister Morton is who? Here's another short sentence.
I would expect the sentences to split after both of the question marks.
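Until this is fixed, one possible stopgap is to post-process the split lines. The sketch below is not part of Splitta and does not call any Splitta API. The helper name resplit_on_question_marks is hypothetical, and it assumes the split lines are already available as a list of strings. It re-splits each line wherever a question mark is followed by whitespace and then an uppercase letter or an opening quote.

```python
import re

# Hypothetical helper, not part of Splitta.
# Matches the whitespace that follows a question mark when the next
# character is an uppercase ASCII letter or an opening quote.
_QUESTION_BOUNDARY = re.compile(r'(?<=\?)\s+(?=["\'A-Z])')


def resplit_on_question_marks(sentences):
    """Split already-segmented sentences again after question marks."""
    result = []
    for sentence in sentences:
        for part in _QUESTION_BOUNDARY.split(sentence):
            part = part.strip()
            if part:
                result.append(part)
    return result
```

This is only a heuristic: it will also split inside a sentence when a question mark is followed by a capitalized word, for example after a quoted question, so it may need tuning for real data.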
This problem occurs with Splitta versions 1.03 and svn r21, under Linux and OS
X 10.8.4, with Python 2.7.2.
Any help with this problem would be enormously appreciated, as we are
attempting to use Splitta as a crucial component in an NLP pipeline for a
summer camp at JHU that is underway:
http://hltcoe.jhu.edu/research/scale-workshops/
Thank you!
Original issue reported on code.google.com by orl...@gmail.com on 11 Jun 2013 at 2:16