richtr / guessLanguage.js

A natural language detection library based on trigram statistical analysis for Node.js and the Web.
http://richtr.github.com/guessLanguage.js/

detection of zh_TW and zh_CN #13

Open vandenoever opened 8 years ago

vandenoever commented 8 years ago

This text, copied from http://www.gov.tw, is detected as zh:

駕駛執照替代役水質補(捐)助資訊出國疫苗地價稅牌照稅退休金技能檢定生育補助交通事故天災事變生活扶助健康檢查消費者身心障礙健康保險勞工保險教育補助

It would be nice if the detection were more specific (zh-TW). The documentation lists both "zh" (Chinese) and "zh-TW" (Chinese, Taiwan) as detectable languages.

dbw9580 commented 7 years ago

Same issue here. The library seems unable to differentiate between zh-CN (Simplified Chinese) and zh-TW (Traditional Chinese).
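
One possible workaround, sketched here as an assumption rather than anything the library provides, is to post-process a "zh" result by counting characters that exist only in Traditional or only in Simplified form. The helper name `refineChinese` and the two character lists are hypothetical and deliberately small; a real implementation would need a much larger table (for example, one derived from the Unicode Unihan variant data).

```javascript
// Hypothetical helper, NOT part of guessLanguage.js.
// Each string holds characters whose Traditional and Simplified forms differ;
// position i in one string is the counterpart of position i in the other.
const TRADITIONAL_ONLY = new Set('駕駛執質補資訊國價檢災變費礙險勞學這們來時會說為對發業');
const SIMPLIFIED_ONLY = new Set('驾驶执质补资讯国价检灾变费碍险劳学这们来时会说为对发业');

// Refine a generic "zh" result into "zh-TW" or "zh-CN" by majority vote.
// Returns "zh" when the text gives no evidence either way (or a tie).
function refineChinese(text) {
  let traditional = 0;
  let simplified = 0;
  for (const ch of text) {
    if (TRADITIONAL_ONLY.has(ch)) {
      traditional += 1;
    } else if (SIMPLIFIED_ONLY.has(ch)) {
      simplified += 1;
    }
  }
  if (traditional > simplified) return 'zh-TW';
  if (simplified > traditional) return 'zh-CN';
  return 'zh';
}
```

Such a function could be called on the input text whenever the detector reports "zh". Note that this distinguishes script (Traditional vs. Simplified), not region: Traditional text from Hong Kong would also be labelled "zh-TW" by this sketch.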