Closed by LordPachelbel 6 years ago
Sure it can. See the tokenizers documentation for a crude example. You don't necessarily need to follow a similar rule-based strategy (although for your problem I would recommend it); you could even train a Naive Bayes classifier to split the sentences.
Keep in mind that NlpTools is a library that provides tools to build your own solutions, which means there is no SentenceTokenizer by default.
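As a rough illustration of what a rule-based strategy could look like, here is a minimal sketch in Python. It does not use NlpTools at all; the function name first_sentence, the abbreviation list, and the boundary pattern are all assumptions made for this example, and real event descriptions will likely need a longer list and a few more rules.

```python
import re

# Abbreviations whose trailing period should NOT end a sentence.
# Illustrative assumption: extend this set for your own data.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "prof.", "i.e.", "e.g.", "vs."}

# Candidate boundary: a run of '.', '!' or '?' followed by whitespace or end of text.
BOUNDARY = re.compile(r"[.!?]+(?=\s|$)")


def first_sentence(text: str) -> str:
    """Return the first sentence of text, or the whole text if no boundary is found."""
    text = text.strip()
    for match in BOUNDARY.finditer(text):
        candidate = text[:match.end()]
        # Inspect the last whitespace-delimited token before the boundary,
        # ignoring opening brackets or quotes such as "(e.g.".
        last_token = candidate.split()[-1].lower().lstrip("([\"'")
        if last_token in ABBREVIATIONS:
            continue  # "Dr.", "e.g." and similar: keep scanning
        return candidate
    return text


print(first_sentence("Join Dr. Smith for a talk on birds, e.g. owls and hawks. Doors open at 7."))
print(first_sentence("Is this the best show ever?! Come and see."))
```

The first call should stop after "hawks." rather than after "Dr.", and the second should keep the "?!" together. A sentence that genuinely ends in a listed abbreviation would be merged with the next one, which is the usual trade-off of this kind of rule.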
I'm working on an events calendar, and for each event I need to automatically populate the <meta name="description"> and <meta property="og:description"> tags from the event description text, because the database doesn't have a field for users to enter metadata separately. Rather than just truncate the text at an arbitrary number of characters, I would like to extract the first sentence from each description. Can this library be used to do that?
Because doing something like
won't work for sentences that end with !, ? or ?!, nor will it work if the first sentence contains things like Mr., Dr., i.e., e.g., etc.
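To make these failure cases concrete, here is a small Python sketch of one naive approach: cutting the text at the first period. The function naive_first_sentence is a hypothetical stand-in written for illustration, not code taken from this thread.

```python
def naive_first_sentence(text: str) -> str:
    """Hypothetical naive approach: keep everything up to and including the first period."""
    head, sep, _rest = text.partition(".")
    return head + sep


# An abbreviation ends the "sentence" far too early.
print(naive_first_sentence("Meet Dr. Jones at noon. Bring a friend."))
# -> Meet Dr.

# A sentence ending in "?!" is not detected, so two sentences are returned.
print(naive_first_sentence("Ready to dance?! Doors open at 8."))
# -> Ready to dance?! Doors open at 8.
```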