komodojp / tinyld

Simple and Performant Language detection library for NodeJS
https://komodojp.github.io/tinyld/
MIT License

Not detecting this Chinese text: 地图箭头方向与实际情况相反 #26

Open lucasrmendonca opened 6 months ago

lucasrmendonca commented 6 months ago

Using tinyld version 1.3.4

Steps to reproduce:

import { detect } from 'tinyld';

const detectedLanguage = detect("地图箭头方向与实际情况相反");
// Logs '' (an empty string) instead of the expected 'zh'
console.log(detectedLanguage);

Output:

''

Expected output:

zh

thewilkybarkid commented 6 months ago

Looking at the Playground (https://komodojp.github.io/tinyld/), only the heavy version recognises it.

lucasrmendonca commented 6 months ago

Correct me if I'm wrong here, but shouldn't the presence of Asian Unicode characters in the string be enough for tinyld to at least guess that it must be one of the Asian languages?

I'm not sure how it works under the hood, but it feels strange that it requires the heavy version for this.
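
To make the suggestion concrete, here is a minimal sketch of a script-based fallback. It does not reflect how tinyld works internally; `guessByScript` is a hypothetical helper, and mapping Han-only text to `zh` is an assumption, since kanji-only Japanese would be misclassified.

```javascript
// Hypothetical helper, not part of tinyld: guess a language purely from the
// Unicode script of the characters in the text.
function guessByScript(text) {
  // Kana is specific to Japanese, so check it before Han.
  if (/[\p{Script=Hiragana}\p{Script=Katakana}]/u.test(text)) return 'ja';
  // Hangul is specific to Korean.
  if (/\p{Script=Hangul}/u.test(text)) return 'ko';
  // Han alone is ambiguous (Chinese or kanji-only Japanese); 'zh' is a guess.
  if (/\p{Script=Han}/u.test(text)) return 'zh';
  // No recognised script: return an empty string.
  return '';
}

console.log(guessByScript('地图箭头方向与实际情况相反')); // prints "zh"
```

Since `detect` returned an empty string in the report above, a caller could presumably combine the two as `detect(text) || guessByScript(text)`, so that the script check only runs when tinyld finds nothing.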