greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
MIT License
Evaluate other language identification methods. #117 (Open)
greyblake opened 2 years ago
greyblake commented 2 years ago
This issue is a reminder for myself.
Possible options:
Chars frequencies
2-grams?
The most frequent words (100 or 1000)?
Smart/complex resolution between LangA and LangB by identifying traits that are present in one language and absent in the other. This could help when two languages have very similar statistical characteristics.
Řehůřek and Kolkus (2009)
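
To make the first two options concrete, here is a minimal sketch of frequency-based detection using character n-grams (`n = 1` gives plain character frequencies, `n = 2` gives 2-grams). It is not how whatlang works today; the function names `ngram_profile`, `cosine_similarity`, and `detect` are hypothetical, and cosine similarity is just one assumed way to compare profiles.

```rust
use std::collections::HashMap;

/// Build a relative-frequency profile of character n-grams.
/// Only alphabetic characters and spaces are kept, lowercased.
fn ngram_profile(text: &str, n: usize) -> HashMap<String, f64> {
    let chars: Vec<char> = text
        .to_lowercase()
        .chars()
        .filter(|c| c.is_alphabetic() || *c == ' ')
        .collect();
    let mut counts: HashMap<String, f64> = HashMap::new();
    // `windows(0)` would panic, and short inputs yield no n-grams.
    if n == 0 || chars.len() < n {
        return counts;
    }
    for window in chars.windows(n) {
        let gram: String = window.iter().collect();
        *counts.entry(gram).or_insert(0.0) += 1.0;
    }
    // Normalize raw counts to relative frequencies.
    let total: f64 = counts.values().sum();
    for value in counts.values_mut() {
        *value /= total;
    }
    counts
}

/// Cosine similarity between two sparse profiles; 0.0 if either is empty.
fn cosine_similarity(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a
        .iter()
        .filter_map(|(gram, x)| b.get(gram).map(|y| x * y))
        .sum();
    let norm_a: f64 = a.values().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b: f64 = b.values().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return the label whose reference profile is most similar to the input,
/// or `None` when nothing overlaps at all.
fn detect<'a>(
    text: &str,
    references: &[(&'a str, HashMap<String, f64>)],
    n: usize,
) -> Option<&'a str> {
    let profile = ngram_profile(text, n);
    references
        .iter()
        .map(|(label, reference)| (*label, cosine_similarity(&profile, reference)))
        .filter(|(_, score)| *score > 0.0)
        .max_by(|x, y| x.1.total_cmp(&y.1))
        .map(|(label, _)| label)
}
```

Reference profiles would be built once per language with `ngram_profile(corpus, 2)` and passed to `detect` as `(label, profile)` pairs. Real profiles would need far larger corpora than a single sentence, so this only shows the shape of the approach.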
See:
Language Identification (wiki article)
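
The LangA/LangB resolution option from the list above could look roughly like the sketch below: a second pass that only runs when two candidates are close, and checks for characters that occur in one language but not the other. Everything here is an assumption for illustration: the `PairTraits` struct, the `resolve` function, and the small Spanish/Portuguese trait table are hypothetical and not part of whatlang.

```rust
use std::cmp::Ordering;

/// Traits (here: characters) assumed to be exclusive to one language
/// of a confusable pair.
struct PairTraits {
    lang_a: &'static str,
    lang_b: &'static str,
    only_in_a: &'static [char],
    only_in_b: &'static [char],
}

/// Hypothetical trait table for Spanish vs Portuguese.
const SPA_POR: PairTraits = PairTraits {
    lang_a: "spa",
    lang_b: "por",
    only_in_a: &['ñ', '¿', '¡'],
    only_in_b: &['ã', 'õ', 'ç'],
};

/// Count exclusive traits for each side and return the language with
/// more hits. Returns `None` on a tie (including zero hits), so the
/// caller can fall back to the statistical score.
fn resolve(text: &str, traits: &PairTraits) -> Option<&'static str> {
    let lowered = text.to_lowercase();
    let hits_a = lowered
        .chars()
        .filter(|c| traits.only_in_a.contains(c))
        .count();
    let hits_b = lowered
        .chars()
        .filter(|c| traits.only_in_b.contains(c))
        .count();
    match hits_a.cmp(&hits_b) {
        Ordering::Greater => Some(traits.lang_a),
        Ordering::Less => Some(traits.lang_b),
        Ordering::Equal => None,
    }
}
```

For example, `resolve("¿Dónde está el niño?", &SPA_POR)` picks `"spa"` because of `¿` and `ñ`, while text without any distinguishing character returns `None`. Traits do not have to be single characters; frequent words or suffixes could be used the same way.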