monkeytypegame / monkeytype

The most customizable typing website with a minimalistic design and a ton of features. Test yourself in various modes, track your progress and improve your speed.
https://monkeytype.com/
GNU General Public License v3.0

Accuracy calculation not working properly in Vietnamese #4218

Open Miaozxje opened 1 year ago

Miaozxje commented 1 year ago

Did you clear cache before opening an issue?

Is there an existing issue for this?

Does the issue happen when logged in?

Yes

Does the issue happen when logged out?

Yes

Does the issue happen in incognito mode when logged in?

Yes

Does the issue happen in incognito mode when logged out?

Yes

Account name

No response

Account config

No response

Current Behavior

When typing in Vietnamese, we have to type 2 or 3 keys in order to produce 1 correct letter, and it seems that Monkeytype counts those keystrokes as incorrect.

Example: typing "ước" is actually a combination of several keystrokes

(Telex and VNI are the 2 most common Vietnamese typing methods), and Monkeytype counts that as 2 errors.

Expected Behavior

Based on the example above, Monkeytype should count that input as "Correct" and not "Incorrect" as it does now.

Steps To Reproduce

  1. Open Monkeytype
  2. Select Vietnamese as typing language
  3. Type using the Telex or VNI input method

Environment

Anything else?

No response

ngntrgduc commented 1 year ago

Same problem. I think it would be better to not count errors when typing in Vietnamese, or maybe to check whether a word is correct after the user presses the Space key, because with a custom test of "ước" like above, for the Telex method:

All possible ways to type "ước" in Telex (as I remember 🥲): uocws, uocsw, uowcs, uowsc, uwowsc, uwowcs, uoswc, uoscw, usowc, usocw, uwosc, uwocs, ... (and maybe more). And I think they should all be counted as correct.

I don't use VNI method but I think there will be the same issue with VNI method.

Miodec commented 1 year ago

> All possible ways to type "ước" in Telex (as I remember 🥲): uocws, uocsw, uowcs, uowsc, uwowsc, uwowcs, uoswc, uoscw, usowc, usocw, uwosc, uwocs, ... (and maybe more). And I think it all be count as Correct.

[GIF]

Miodec commented 1 year ago

So, I just gave it another test, and it seems to be working fine, giving me 100% accuracy (on Windows 11 22H2 and MacOS 13.4). I tried typing ước with uocws, uowcs and uwowcs - all gave me 100% accuracy.

ngntrgduc commented 1 year ago

But in my case, with a custom test:

[4 images]

All of them give me different results 🥲.

Miodec commented 1 year ago

So, I just learned about these - are you using Unikey, EVKey or OpenKey?

ngntrgduc commented 1 year ago

I'm currently using Unikey 4.3 RC5.

Miodec commented 1 year ago

Can you just quickly test without it? Using the native system Telex layout?

ngntrgduc commented 1 year ago

No, I don't have any system Telex layout 🥲.

Miodec commented 1 year ago

Well, can you add it and test with it?

ngntrgduc commented 1 year ago

As far as I know, most people in my country use third-party apps like Unikey. So I think it would be better to solve the problem for the Unikey application.

And yes, I added the system Telex layout and tested it (on Windows 11 22H2). The result is the same as yours.

Miaozxje commented 1 year ago

> Well, can you add it and test with it?

Here is a video of me typing using EVKey 5.0.1: https://www.youtube.com/watch?v=1D7RqEObf6A

Miodec commented 1 year ago

Which one do you guys think is the most popular one out of the three? If most of the Vietnamese typing population uses this kind of software, it's going to be really hard to get this fixed, because it looks like those programs simply do NOT send the essential events (composition start and end) that would allow correct accuracy calculation.
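To illustrate why the composition start and end events matter, here is a minimal TypeScript sketch. It is not Monkeytype's code: `ImeEvent` and `scoreEvents` are hypothetical names, and the event shape is a simplified stand-in for the browser's real composition and input events. The assumption is that intermediate input inside a composition is ignored and only the committed text is scored.

```typescript
// Hypothetical, simplified event shape (an assumption for illustration);
// real browser events carry more fields.
type ImeEvent =
  | { type: "compositionstart" }
  | { type: "compositionend"; data: string }
  | { type: "input"; data: string };

interface Tally {
  correct: number;
  incorrect: number;
}

// Scores typed characters against `target`. While a composition is open,
// intermediate input is skipped; only the committed text is scored.
function scoreEvents(events: ImeEvent[], target: string): Tally {
  const tally: Tally = { correct: 0, incorrect: 0 };
  const expected = Array.from(target.normalize("NFC"));
  let pos = 0;
  let composing = false;

  const scoreText = (text: string): void => {
    for (const ch of Array.from(text.normalize("NFC"))) {
      if (ch === expected[pos]) tally.correct++;
      else tally.incorrect++;
      pos++;
    }
  };

  for (const ev of events) {
    if (ev.type === "compositionstart") {
      composing = true;
    } else if (ev.type === "compositionend") {
      composing = false;
      scoreText(ev.data); // the committed word
    } else if (!composing) {
      scoreText(ev.data); // plain typing outside any composition
    }
  }
  return tally;
}
```

With the start and end markers present, the raw keys typed in between never reach the scorer. If an input tool does not send the markers, every intermediate character is scored as if it had been typed directly, which is where the extra errors come from.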

Miaozxje commented 1 year ago

> Which one do you guys think is the most popular one out of the three? If most of the Vietnameese typing population uses these kinds of software, its gonna be really hard to get this fixed, because it looks like those softwares simply NOT send an essential event (composition start and end) which would allow the correct accuracy calculation.

As far as I know, Unikey may be the most common, but my opinion is that you should separate it into 3 different typing languages.

Miodec commented 1 year ago

What do you mean by "separate it into 3 different typing languages"?

Miodec commented 1 year ago

So, the UniKey website mentions that its source code has been integrated into macOS since 2007.

But when I test Vietnamese input using the built-in input method, it works without issues (100% accuracy is possible). So, something weird is going on; someone changed something at some point.

Could any of you confirm whether UniKey is working or not (it's only available on Windows)?

Miaozxje commented 1 year ago

> So, Unikey website mentions that its source code has been integrated into MacOS since 2007
>
> But when i try to test Vietnamese input using the built in input, it works without issues (100% accuracy is possible). So, something weird is going on, someone changed something at some point.
>
> Could any of you confirm that UniKey is working / not working (its only available on Windows)

I can confirm that UniKey does not work on Windows; the accuracy is calculated about the same way as in the EVKey test video I uploaded above.

Miaozxje commented 1 year ago

> What do you mean by "separate its into 3 different typing langauge" ?

Sorry for the lack of information. It was just an idea that came to mind when you showed that UniKey, EVKey and OpenKey handle the events needed for calculating accuracy differently. I thought you could separate Vietnamese into 3 languages (for Unikey, EVKey and OpenKey), but in the end I realized that's not so smart.

Miodec commented 1 year ago

Well, I didn't say UniKey, EVKey and OpenKey do something different from each other. I said they all do something different compared to the native input method.

Miodec commented 1 year ago

> I can confirmed that UniKey does not working on Windows, as accuracy calculated kinda same as EVKey test video I uploaded above

Well, that's not good.

Miaozxje commented 1 year ago

I found this website: https://vntype.web.app. It is forked from Monkeytype and has better accuracy and wpm calculation in Vietnamese (I'm using EVKey). Can you look into it to get an idea of how they calculate these things?

Miodec commented 1 year ago

> I found this website: https://vntype.web.app that are forked from Monkeytype which having better accuracy and wpm calculation in Vietnamese (I'm using EVKey). Can you look around to have idea about how they calculate these thing ?

The problem with that fork is that it's very old, so it's very hard to find the changes they made. And they haven't updated any of the GitHub links or contact links, so I don't know who to contact about this.

SteveFour commented 6 months ago

Explain issue:

The general problem with popular Vietnamese typing apps on Windows is the way they edit letters.

Let's say the current word is "được". Using the "Telex" input language (*), here's how the system responds:

Typed letter    Resulting text
d               d
d               đ
u               đu
w               đư
o               đưo
w               đươ
c               đươc
j               được

Bold and italic letters indicate which letter changed. The end result is (usually) the same, but Vietnamese input methods take mainly two approaches to the problem.

I'm not particularly knowledgeable on this subject, but I'll cite the original sources and explain it as I understand it.

"An input method software's job is to transform chains of letters into the desired word. In my case, from "dduwowcj" into "được".

Most popular Vietnamese input methods solve this by using a "fake backspace". So when I input the second "d", the app sends a fake backspace to delete the first "d" and replaces it with "đ".

But according to the source, preedit should be the "correct" solution to this typing problem: it creates a temporary buffer inside the app, which lets the input method change the text easily. Only after the input method commits the preedit does the text become part of the document... For this different way of editing text, most OSes recommend marking preedit text differently, usually by underlining it.

Preedit is commonly used in input methods for Japanese, Chinese, Korean, etc."
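The difference between the two approaches can be sketched in TypeScript. This is an illustrative model only: the `Action` type, `replay`, and both action streams are assumptions made for this example (in particular, the exact backspaces a real tool sends for the final "j" may differ), and the scorer is a naive per-character one rather than Monkeytype's actual logic.

```typescript
// Hypothetical model of what the text field receives (an assumption).
type Action =
  | { kind: "char"; ch: string }      // a character inserted into the document
  | { kind: "backspace" }             // a (fake) backspace
  | { kind: "preedit"; text: string } // temporary buffer update, not in the document
  | { kind: "commit"; text: string }; // preedit committed into the document

interface Result {
  doc: string;
  correct: number;
  incorrect: number;
}

// Replays the actions and scores every character at the moment it enters the document.
function replay(actions: Action[], target: string): Result {
  const expected = Array.from(target.normalize("NFC"));
  const doc: string[] = [];
  let correct = 0;
  let incorrect = 0;

  const insert = (text: string): void => {
    for (const c of Array.from(text.normalize("NFC"))) {
      if (c === expected[doc.length]) correct++;
      else incorrect++;
      doc.push(c);
    }
  };

  for (const a of actions) {
    switch (a.kind) {
      case "char": insert(a.ch); break;
      case "backspace": doc.pop(); break;
      case "preedit": break; // lives in the temporary buffer only
      case "commit": insert(a.text); break;
    }
  }
  return { doc: doc.join(""), correct, incorrect };
}

const ch = (c: string): Action => ({ kind: "char", ch: c });
const bs: Action = { kind: "backspace" };
const pre = (text: string): Action => ({ kind: "preedit", text });

// Assumed stream for typing "dduwowcj" with a fake-backspace tool.
const fakeBackspaceStream: Action[] = [
  ch("d"), bs, ch("đ"),         // d, d
  ch("u"), bs, ch("ư"),         // u, w
  ch("o"), bs, ch("ơ"),         // o, w
  ch("c"),                      // c
  bs, bs, ch("ợ"), ch("c"),     // j
];

// Assumed stream for the same keys with a preedit-based input method.
const preeditStream: Action[] = [
  pre("d"), pre("đ"), pre("đu"), pre("đư"), pre("đưo"),
  pre("đươ"), pre("đươc"), pre("được"),
  { kind: "commit", text: "được" },
];
```

Under these assumed streams, both end with "được" in the document, but the fake-backspace replay scores 4 incorrect characters along the way (the intermediate "d", "u", "o" and "ơ"), while the preedit replay scores none, because only the committed word is ever inserted.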

The default Windows input method for Vietnamese also uses preedit. I find that it does not have the accuracy issue the others have. It's just that not many people use it.

Proposed solution:

My proposed solution to the accuracy problem with input methods that use fake backspace is to add a "compatibility accuracy mode", which only counts mistakes after a word is finished (space is pressed). In other words: check mistakes only on completed words. This lets Monkeytype count mistakes the same way whether the input method uses preedit or fake backspace.

This compatibility solution works well for a lot of languages that use input methods, and should be enabled by default for Vietnamese.
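A rough TypeScript sketch of the "check mistakes only on completed words" idea follows. The function name `scoreCompletedWords` and the character-by-character comparison rule are assumptions for illustration, not an existing Monkeytype API or setting.

```typescript
interface WordScore {
  correctChars: number;
  incorrectChars: number;
}

// Scores only words that have been finished with a space. The word still
// being typed is ignored, so intermediate edits are never counted.
function scoreCompletedWords(typed: string, target: string): WordScore {
  const typedWords = typed.normalize("NFC").split(" ");
  const targetWords = target.normalize("NFC").split(" ");
  // The last element is the word in progress (or "" right after a space).
  const completed = typedWords.slice(0, -1);

  const score: WordScore = { correctChars: 0, incorrectChars: 0 };
  for (let i = 0; i < completed.length; i++) {
    const got = Array.from(completed[i]);
    const want = Array.from(targetWords[i] ?? "");
    const len = Math.max(got.length, want.length);
    for (let j = 0; j < len; j++) {
      // Missing or extra characters count as incorrect.
      if (got[j] !== undefined && got[j] === want[j]) score.correctChars++;
      else score.incorrectChars++;
    }
  }
  return score;
}
```

For example, `scoreCompletedWords("ước mo ", "ước mơ")` counts the finished word "ước" as fully correct and flags only the wrong "o" in the second word, no matter how many intermediate letters the input method produced while composing.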

Sources:

These sources are in Vietnamese; there are few to no sources written in English:

https://lewtds.github.io/2014/07/31/uoc-mo-bo-go-kieu-unikey/

https://notes.huy.rocks/posts/go-tieng-viet-linux.html

I hope that this idea helps the team close this issue. The reason it has been stuck for so long is that the way input methods handle words is a very rare topic, mostly discussed by Linux users.

(*) I mentioned Telex in this context as a typing language (the language for translating "dduwowcj" into "được") to differentiate it from preedit and fake backspace, which handle how the letters change while typing. I have to write this because Wikipedia also describes Telex as an input method.

leanbesha commented 2 months ago

I find that when using the default Microsoft typing software, it does not have accuracy issues in Monkeytype.