supertanglang / scim-googlepinyin

Automatically exported from code.google.com/p/scim-googlepinyin

enlarge the corpus #5

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The language model (LM) taken from the Android version is much smaller in scale than the one from the Windows version.

This is a long-standing problem. We need to:
 - collect new words
 - fetch a balanced corpus from the internet
 - segment it into words
 - gather statistics from the segmented text
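
The last two steps could be sketched roughly as follows. This is only an illustration under assumptions: the issue does not say which segmentation method or which statistics are intended, so the sketch uses forward maximum matching against a word list and plain unigram/bigram counts. The names `segment`, `count_ngrams`, the tiny `lexicon`, and the sample sentences are all hypothetical and not part of scim-googlepinyin.

```python
from collections import Counter


def segment(text, lexicon, max_len=4):
    """Forward maximum matching (an assumed method, not the project's):
    at each position take the longest lexicon word, falling back to a
    single character when nothing in the lexicon matches."""
    words = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def count_ngrams(sentences, lexicon):
    """Segment each sentence and accumulate unigram and bigram counts."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = segment(sentence, lexicon)
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


# Hypothetical word list and corpus, for illustration only.
lexicon = {"我们", "需要", "更大", "语料", "语料库"}
corpus = ["我们需要更大的语料库", "我们需要语料"]

unigrams, bigrams = count_ngrams(corpus, lexicon)
print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Newly collected words would go into the lexicon before segmentation, since maximum matching can only find words it already knows; the resulting counts would still need to be converted into whatever format the LM builder expects.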

Original issue reported on code.google.com by tchai...@gmail.com on 30 Aug 2009 at 11:40