Open GoogleCodeExporter opened 9 years ago
Same problem with French, which ends up split on accented characters. I've
tried switching the file to UTF-8 but it makes no difference. I'm running this
on Mac OS X.
Original comment by patrick....@gmail.com
on 17 Aug 2012 at 4:33
I have the same problem in Persian.
Original comment by afshinra...@gmail.com
on 16 Dec 2012 at 3:21
Hello all,
I tried to run the code in Eclipse to understand how it works, so that it can be
extended to hLDA. But I got an error saying package.cs.mallet.gui is not
found. Kindly help with how to import the project into an IDE (Eclipse or
NetBeans) and run it successfully.
Original comment by abiodunm...@gmail.com
on 7 Apr 2013 at 4:40
I have the same problem in Greek. Only the English words appear in the topics.
With the command-line MALLET installation I can pass --token-regex "[\p{L}\p{M}]+"
and it then reads UTF-8 Greek correctly. Is there a tokenization option here?
Original comment by gmik...@gmail.com
on 19 Jan 2014 at 1:15
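For anyone wondering why the --token-regex "[\p{L}\p{M}]+" pattern from the Greek comment helps, here is a minimal sketch of the idea in Python. Assumptions: the function names are made up for illustration, the ASCII-only pattern is a hypothetical stand-in and is not taken from this tool's source, and because Python's built-in re module does not support \p{L} or \p{M}, the Unicode version is emulated with unicodedata categories (L* for letters, M* for combining marks).

```python
import re
import unicodedata


def ascii_tokens(text):
    # Hypothetical ASCII-only tokenizer, shown for contrast only.
    # Any accented or non-Latin letter ends the current token.
    return re.findall(r"[A-Za-z]+", text)


def unicode_tokens(text):
    # Emulates the pattern [\p{L}\p{M}]+ : a token is a maximal run of
    # characters whose Unicode general category starts with L or M.
    tokens, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in ("L", "M"):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    for sample in ("un \u00e9l\u00e8ve grec", "\u03bb\u03cc\u03b3\u03bf\u03c2"):
        print(ascii_tokens(sample), unicode_tokens(sample))
```

With the ASCII-only pattern the French word "élève" breaks into "l" and "ve" and the Greek word yields no tokens at all, which matches the symptoms reported in this thread. The letter-and-mark version keeps both words whole, including text where accents are stored as separate combining characters.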
Original issue reported on code.google.com by
Semenoff...@gmail.com
on 25 Nov 2011 at 4:59