Open KarinaBunyik opened 10 years ago
The problem with my previous attempt was that Mallet uses '_' as the n-gram delimiter by default. I should set it to ' ' (space) when I want bigrams.
Bigrams don't seem to work with Unicode characters, so I suppose only English text works with bigrams. Ask Dimitri whether he has tried it.
Add bigrams to Mallet. There is an error when bigrams are added: the Swedish characters are not recognized.
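To check what the expected bigrams look like independently of Mallet, here is a minimal Python sketch. It is not Mallet's code, and the names `TOKEN_RE` and `bigrams` are made up for illustration. It assumes that tokens are runs of Unicode letters and that the delimiter is configurable, with '_' as the default as noted above.

```python
import re

# Hypothetical token pattern: one or more Unicode letters.
# In Python 3, \w is Unicode-aware for str, so [^\W\d_] matches
# letters such as å, ä, ö while excluding digits and underscores.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def bigrams(text, delimiter="_"):
    """Return adjacent token pairs joined by `delimiter`.

    The default '_' mirrors the delimiter described in this issue;
    pass ' ' to get space-separated bigrams instead.
    """
    tokens = TOKEN_RE.findall(text.lower())
    # Pair each token with its successor.
    return [delimiter.join(pair) for pair in zip(tokens, tokens[1:])]


if __name__ == "__main__":
    print(bigrams("Jag älskar språk"))        # underscore-joined
    print(bigrams("Jag älskar språk", " "))   # space-joined
```

If Swedish words come out split or dropped in Mallet but intact here, the likely suspects are the tokenization pattern or the input encoding rather than the bigram step itself. That is a guess and still needs to be confirmed against the actual Mallet run.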