patrickfrey / strusAnalyzer

Library for document analysis (segmentation, tokenization, normalization, aggregation) that produces a set of items that can be inserted into a strus storage. It also provides some functions for analysing tokens or phrases of a strus query.
http://www.project-strus.net
Mozilla Public License 2.0

punctuation method, meaning of second parameter #52

Open andreasbaumann opened 7 years ago

andreasbaumann commented 7 years ago
punctuation: produces punctuation elements (end-of-sentence recognition). The language is specified as a parameter (currently only German 'de' and English 'en' are supported).
    sentence = orig punctuation("en","") /post/post/body//para();

What is the meaning of the second parameter?

patrickfrey commented 7 years ago

The second parameter specifies a set of characters that should also be recognized as punctuation, in addition to the end-of-sentence markers. If it is not specified, the default is reasonable for European languages.
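
To illustrate the idea only, here is a minimal Python sketch of a tokenizer that takes an extra set of punctuation characters. It is not the strus implementation: the function name `punctuation_positions`, the parameter `extra_chars`, and the fixed end-of-sentence set `.!?` are all assumptions made for this example, and real end-of-sentence recognition is language dependent and more involved than a character lookup.

```python
def punctuation_positions(text, extra_chars=""):
    """Return (index, char) pairs for characters treated as punctuation.

    Hypothetical helper: sentence-end markers are always reported,
    characters in extra_chars are reported in addition to them.
    """
    sentence_end = ".!?"  # assumed end-of-sentence markers for this sketch
    marks = set(sentence_end) | set(extra_chars)
    return [(i, ch) for i, ch in enumerate(text) if ch in marks]


text = "Wait; it works. Really?"

# Only end-of-sentence markers are reported.
print(punctuation_positions(text))

# The second argument adds ';' and ',' to the recognized set.
print(punctuation_positions(text, ";,"))
```

In this sketch an empty second argument means that nothing is added to the sentence-end markers. Whether `""` in the analyzer configuration behaves the same way or falls back to the default set is not stated above.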