Closed by albbas 4 years ago
Date: 2019-10-23 19:19:34 +0200
From: Robert Reynolds
In the following example, the token #
is deleted by the tokeniser.
$ echo "# – это не слово." | hfst-tokenize $GTHOME/langs/rus/tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst
– это не слово .
Date: 2019-12-17 09:50:06 +0100
From: Sjur Nørstebø Moshagen
Fixed in svn revs 186224 and 186225:
$ echo "# – это не слово." | hfst-tokenise -g tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst
"<#>"
	"#" N Symbol
"<–>"
	"–" PUNCT
And without the -g option:
$ echo "# – это не слово." | hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst
# – это не слово .
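For anyone who wants to guard against this kind of regression, here is a minimal sketch of a check that the plain (non `-g`) output keeps every non-whitespace character of the input. The helper names `strip_ws` and `tokens_preserved` are hypothetical and not part of HFST. The sketch assumes the tokeniser only inserts or removes whitespace between tokens, which is what the output above suggests.

```shell
# Hypothetical helpers, not part of HFST: a sketch of a regression check.

# Remove ASCII whitespace so that only the token characters remain.
strip_ws() {
    printf '%s' "$1" | tr -d ' \t\n\r'
}

# Succeed (exit status 0) when the tokeniser output ($2) contains exactly
# the same non-whitespace characters, in the same order, as the input ($1).
tokens_preserved() {
    [ "$(strip_ws "$1")" = "$(strip_ws "$2")" ]
}

# Example call (assumes hfst-tokenise and the tokeniser file are available):
#   input='# – это не слово.'
#   output=$(printf '%s\n' "$input" | hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst)
#   tokens_preserved "$input" "$output" || echo "tokeniser dropped or changed a character"
```

With the strings from this report, the check fails for the original output (`#` is missing) and passes for the output after the fix.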
This issue was created automatically with bugzilla2github
Bugzilla Bug 2627
Date: 2019-10-23T19:19:34+02:00
From: Robert Reynolds
To: Sjur Nørstebø Moshagen
CC: borre.gaup, lene.antonsen, linda.wiechetek, trond.trosterud, unhammer+apertium
Last updated: 2019-12-17T09:50:06+01:00