greyblake / whatlang-rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/
MIT License
965 stars 108 forks

Fix typo and calculation code in Japanese detection. #123

Closed miiton closed 1 year ago

miiton commented 2 years ago

Hi.

I think it's probably wrong, so I've fixed the following:

greyblake commented 1 year ago

Quality benchmarks:

Before:

| LANG     | AVG    | <= 20  | 21-50  | 51-100 | > 100  |
|----------|--------|--------|--------|--------|--------|
| Mandarin | 96.08% | 97.54% | 96.92% | 95.45% | 94.43% |
| Japanese | 94.38% | 93.97% | 96.04% | 94.31% | 93.18% |
| AVG      | 95.23% | 95.76% | 96.48% | 94.88% | 93.80% |

OVERALL: 2 languages
AVG: 95.23%

After:

| LANG     | AVG    | <= 20  | 21-50  | 51-100 | > 100  |
|----------|--------|--------|--------|--------|--------|
| Mandarin | 96.09% | 97.54% | 96.92% | 95.45% | 94.43% |
| Japanese | 94.37% | 93.97% | 96.04% | 94.30% | 93.18% |
| AVG      | 95.23% | 95.76% | 96.48% | 94.88% | 93.81% |

OVERALL: 2 languages
AVG: 95.23%

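The change slightly improved the correct detection rate for Mandarin (Cmn), by 0.01 percentage points (96.08% to 96.09%), and decreased the correct detection rate for Japanese by the same amount (94.38% to 94.37%). The overall average remains the same at 95.23%.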
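
For anyone who wants to double-check the arithmetic, here is a minimal sketch that recomputes the overall average and the per-language change from the AVG column of the two tables. It is not part of the whatlang benchmark tooling; the names `BEFORE`, `AFTER`, `overall_avg` and `deltas` are made up for this illustration, and the numbers are copied by hand from the tables above.

```rust
// Per-language AVG accuracy in percent, copied from the "Before" and "After" tables.
// These constants and helper names are hypothetical, not whatlang APIs.
const BEFORE: [(&str, f64); 2] = [("Mandarin", 96.08), ("Japanese", 94.38)];
const AFTER: [(&str, f64); 2] = [("Mandarin", 96.09), ("Japanese", 94.37)];

/// Unweighted mean of the per-language accuracies.
fn overall_avg(rows: &[(&str, f64)]) -> f64 {
    rows.iter().map(|(_, acc)| acc).sum::<f64>() / rows.len() as f64
}

/// Per-language change in percentage points (after minus before).
/// Assumes both slices list the same languages in the same order.
fn deltas(
    before: &[(&'static str, f64)],
    after: &[(&'static str, f64)],
) -> Vec<(&'static str, f64)> {
    before
        .iter()
        .zip(after)
        .map(|((lang, b), (_, a))| (*lang, a - b))
        .collect()
}

fn main() {
    println!("overall before: {:.2}%", overall_avg(&BEFORE));
    println!("overall after:  {:.2}%", overall_avg(&AFTER));
    for (lang, delta) in deltas(&BEFORE, &AFTER) {
        // {:+.2} prints an explicit sign so gains and losses are easy to tell apart.
        println!("{lang}: {delta:+.2} pp");
    }
}
```

The two per-language changes cancel each other out, which is why the overall average does not move.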

greyblake commented 1 year ago

@miiton Thanks!