Open extoplasm opened 6 months ago
If that can be fixed, I believe the spaces in the "words" section should be typed automatically too, as a sentence in Simplified Chinese does not include spaces. e.g.: In "只有 出现 革命 存在 发生 方法…", users should not need to hit the spacebar before entering the next word.
i reckon you keep the words the same, it’s good to separate the words
but just count every character in a sentence as a word in the quotes section
The characters used are full-width commas, and as @faq0 said, simplified chinese does not include spaces, so im not sure what should be done here.
I believe there are some commonly used full-width punctuation marks in simplified Chinese, which can be set as an exception in the quote mode.
e.g.: Some of these include ",。!?“”:;《》—", which have the Unicode code points \uff0c \u3002 \uff01 \uff1f \u201c \u201d \uff1a \uff1b \u300a \u300b \u2014.
But for the zen or custom modes, they might need other rules as the punctuation marks are not limited to these characters.
However, I have noticed that many Chinese typing practice websites count each symbol as a character that counts toward the WPM. That might be the easiest approach.
the punctuation isn't an issue, there isn't much punctuation in the quotes anyway. i reckon you can count every character as a word and either parse out the full-width punctuation or change it into its english equivalent when counting the words, although this would be rough to implement.
it's really up to you, but as a quasi-mandarin speaker this is just my suggestion.
So, what's the solution? Because if you want to add spaces you would need to edit the quotes themselves.
wdym, i’m saying we count each character as a word, as mandarin doesn’t follow the rule that each word is separated by spaces. eg. “猴子打字” (monkey type lol) counted as 4 separate words
also if we add spaces it wouldn’t be accurate, not sure how the word counting works but a special case can be added to split the characters differently (removing the punctuation before of course)
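A minimal sketch of the per-character split suggested above, in TypeScript. The helper name `splitIntoCharWords` and the punctuation set (taken from the list earlier in this thread) are assumptions for illustration, not Monkeytype's actual word-counting code.

```typescript
// Full-width punctuation listed earlier in the thread (not an exhaustive set).
const FULL_WIDTH_PUNCTUATION = new Set<string>(
  Array.from("\uff0c\u3002\uff01\uff1f\u201c\u201d\uff1a\uff1b\u300a\u300b\u2014")
);

// Hypothetical helper: treat every character as one "word",
// dropping whitespace and the punctuation above before counting.
function splitIntoCharWords(text: string): string[] {
  // Array.from iterates by code point, so characters outside the BMP stay whole.
  return Array.from(text).filter(
    (ch) => ch.trim() !== "" && !FULL_WIDTH_PUNCTUATION.has(ch)
  );
}

console.log(splitIntoCharWords("猴子打字").length); // 4
```

Zen or custom modes would probably need a broader punctuation rule than this fixed set.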
So, this should be the case for all chinese text, not just quotes, right?
Is this because you need multiple keypresses per character? Maybe we can count each keypress as a character, instead of each character as a word.
So, this should be the case for all chinese text, not just quotes, right?
Yes.
Maybe we can count each keypress as a character, instead of each character as a word.
This would be good in most cases, but I believe it should be the way to calculate speed, not accuracy. In fact, there are multiple typing methods for Simplified Chinese, which can result in different numbers of keystrokes.
e.g.: For the example quote "我能吞下玻璃而不伤身体", Full Pinyin would be "wonengtunxiabolierbushangshenti" (31 chars). Double Pinyin would be "wongtpxwboliorbuuhufti" (2 keys/word, 22 chars in total). Wubi would be 4 keys/word, 44 chars in total, but in that case less time is needed to select the desired Chinese character in the candidate window.
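The keystroke counts above can be checked with a short TypeScript sketch. The variable names are illustrative, the Wubi figure uses this comment's 4 keys per character, and candidate-selection keys (space or a number) are not counted.

```typescript
// Pinyin syllables for "我能吞下玻璃而不伤身体", one per character.
const syllables = ["wo", "neng", "tun", "xia", "bo", "li", "er", "bu", "shang", "shen", "ti"];

// Full Pinyin: every letter of every syllable is one keystroke.
const fullPinyinKeys = syllables.reduce((sum, s) => sum + s.length, 0);

// Double Pinyin: a fixed 2 keys per character.
const doublePinyinKeys = syllables.length * 2;

// Wubi: using the figure of 4 keys per character quoted above.
const wubiKeys = syllables.length * 4;

console.log(fullPinyinKeys, doublePinyinKeys, wubiKeys); // 31 22 44
```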
yes i agree with faq0 on the speed calculation part but the main issue is that in the quotes the entire sentence is counted as one word, i’m suggesting that we split the quote by character instead of by space as when someone presses space the test ends and the progress is inaccurate
yes i agree with faq0 on the speed calculation part but the main issue is that in the quotes the entire sentence is counted as one word, i’m suggesting that we split the quote by character instead of by space as when someone presses space the test ends and the progress is inaccurate
If you split by character then the website will require you to press space between every character. When you type quotes normally, when do you press space? (not on monkeytype).
in chinese there is no such thing as a space lol if its like that then there might not be an easy solution perhaps make a special case??? because im like 50% sure its the same for any asian language, this could be good if adding quotes for other languages
If you split by character then the website will require you to press space between every character. When you type quotes normally, when do you press space? (not on monkeytype).
We might not press space for every character. In fact, there is a candidate window (IME window) to choose from a list of characters.
We might not press the space key.
If I want to type the character "我" in Full Pinyin, that would be:
What I type: w o <spacebar>
In this case, the candidate window will be (Microsoft Pinyin IME as an example):
I have to select the desired character from the candidate list, where "1" = "我", "2" = "喔", etc. I can also press the spacebar to select the first option (the spacebar is more commonly used than "1" for this).
We might not press the key for every character.
In a longer sentence, such as "我能吞下玻璃而不伤身体", I can type the sentence at once. In Full Pinyin, this would be:
What I type: w o n e n g t u n x i a b o l i e r b u s h a n g s h e n t i <spacebar>
Luckily, in this case my desired sentence is the first candidate, so I can just press the spacebar.
However, if that isn't the case, I may have to select each character (or word) one by one, divided by the apostrophes shown in the IME. For example,
This means that there are many ways to type a sentence, with some of them not containing a spacebar keystroke. I believe that monkeytype should just detect the number of keystrokes when a character itself is typed.
in chinese there is no such thing as a space lol if its like that then there might not be an easy solution perhaps make a special case??? because im like 50% sure its the same for any asian language, this could be good if adding quotes for other languages
What if i just disable space then? Monkeytype wont try to "move to the next word" because there would be no "next word" and that "moving to the next word" wont even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.
I believe that monkeytype should just detect the number of keystrokes when a character itself is typed.
what does this mean?
What if i just disable space then? Monkeytype wont try to "move to the next word" because there would be no "next word" and that "moving to the next word" wont even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.
this should be good enough haha
what does this mean?
Keystrokes per second is calculated from the number of keystrokes and shown on the final speed chart, while accuracy and WPM are calculated from the Chinese characters typed per second.
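A rough TypeScript sketch of that split between keystroke-based and character-based metrics. The `TestStats` shape and `computeMetrics` are hypothetical names, and treating characters per minute as the WPM figure assumes the "each character is a word" proposal from this thread rather than Monkeytype's existing formula.

```typescript
// Hypothetical input: what a finished test would need to record.
interface TestStats {
  keystrokes: number;   // raw key presses, including IME selection keys
  typedChars: number;   // Chinese characters committed to the input
  correctChars: number; // committed characters that match the quote
  seconds: number;      // test duration
}

function computeMetrics(s: TestStats) {
  return {
    // Speed-chart value: driven by keystrokes only.
    keystrokesPerSecond: s.keystrokes / s.seconds,
    // With one character counted as one word, this doubles as WPM.
    charsPerMinute: (s.typedChars * 60) / s.seconds,
    // Accuracy is judged on committed characters, not on keystrokes.
    accuracy: s.typedChars === 0 ? 0 : (s.correctChars / s.typedChars) * 100,
  };
}

console.log(computeMetrics({ keystrokes: 32, typedChars: 11, correctChars: 11, seconds: 10 }));
```

Keeping accuracy on the character side means different input methods (Full Pinyin, Double Pinyin, Wubi) affect only the keystroke figure.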
What if i just disable space then? Monkeytype wont try to "move to the next word" because there would be no "next word" and that "moving to the next word" wont even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.
This should be a good idea, as long as it can deal with the speed and accuracy correctly.
another problem might be that 1 misspelt character results in the test being unable to finish, as when u disable space, it will stop the test from force finishing as monkeytype does not let you finish on a misspelt word.
im pretty sure you have to both split quote by character and disable spaces
i've done some thinking and this problem is present in nearly all text input based websites:
here
For Chinese and Japanese, WorldServer has a special way to count words. Each character is considered a word. For these languages we are, effectively, counting characters. When a user sees "Words" in the WorldServer UI (for example, in scoping) for Chinese and Japanese source languages it actually means "Characters".
https://docs.rws.com/791662/251856/sdl-worldserver-11-0-1/word-counting-algorithm
the best way, imo, is to count every character as a word, remove "spaces" when presenting input to user, and auto-nextword when they type a character
is there a way to auto-nextword?
where is the code to handle next words in the file system?
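The "auto-nextword" idea can be sketched in isolation as follows, in TypeScript. This is not Monkeytype's word-handling code; the class `CharWordTest` and its methods are hypothetical, and the sketch assumes the committed text arrives from the IME (for example, the `data` of a `compositionend` event).

```typescript
// Hypothetical model of a test where every character is a "word".
class CharWordTest {
  private index = 0;
  private readonly targets: string[];
  readonly results: boolean[] = [];

  constructor(targets: string[]) {
    this.targets = targets;
  }

  // Feed text committed by the IME. Each committed character is compared
  // with the current target and the test advances automatically,
  // with no space key involved.
  commit(text: string): void {
    for (const ch of Array.from(text)) {
      if (this.finished) return;
      this.results.push(ch === this.targets[this.index]);
      this.index += 1;
    }
  }

  // The test ends once every target has been attempted, even with mistakes.
  get finished(): boolean {
    return this.index >= this.targets.length;
  }
}

const session = new CharWordTest(Array.from("猴子打字"));
session.commit("猴子");
session.commit("大字"); // the third character is wrong
console.log(session.finished, session.results); // true [ true, true, false, true ]
```

Because a wrong character still advances the index, a single mistake no longer prevents the test from finishing.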
@extoplasm Which languages should use this per character way of calculating speed?
Also, are the calculated speeds accurate if you just change the typing speed unit to cpm in the settings?
@extoplasm Which languages should use this per character way of calculating speed?
japanese and chinese off the top of my head
Also, are the calculated speeds accurate if you just change the typing speed unit to cpm in the settings?
not sure can’t test rn i’m not at home
Also, are the calculated speeds accurate if you just change the typing speed unit to cpm in the settings?
They work in the "words" section (I can't tell whether the numbers themselves are accurate), but the results cannot be uploaded: the "Result data doesn't make sense" error appears after multiple attempts.
The speed calculation doesn't even work in quotes.
When 1 character is mistyped, it will not auto proceed to the completion page (for a quote that shows as 1 total word). I have to press spacebar manually and it shows a CPM of 0, but with an accuracy of 95%.
When no characters are typed wrongly, it will still show the "Result data doesn't make sense" error.
Plus, I've noticed some wrongly written characters in the quotes section. How do I report these?
Plus, I've noticed some wrongly written characters in the quotes section. How do I report these?
is that my bad... oops you don't need to report this just make a PR
is that my bad... oops you don't need to report this just make a PR
PR added. Added some quotes as well. https://github.com/monkeytypegame/monkeytype/pull/5465
Looking at the data, it looks like you're reporting fewer keypresses than characters typed. It looks like the input system is eating up some of the keypress events (which seems to be the same issue that someone else just opened about Korean typing).
Did you clear cache before opening an issue?
Is there an existing issue for this?
Does the issue happen when logged in?
Yes
Does the issue happen when logged out?
Yes
Does the issue happen in incognito mode when logged in?
Yes
Does the issue happen in incognito mode when logged out?
Yes
Account name
extoplasm
Account config
{"theme":"alduin","themeLight":"serika","themeDark":"serika_dark","autoSwitchTheme":false,"customTheme":false,"customThemeColors":["#323437","#e2b714","#e2b714","#646669","#000000","#d1d0c5","#ca4754","#7e2a33","#ca4754","#7e2a33"],"favThemes":[],"showKeyTips":true,"smoothCaret":"medium","quickRestart":"off","punctuation":false,"numbers":false,"words":10,"time":60,"mode":"quote","quoteLength":[0],"language":"chinese_simplified","fontSize":1.5,"freedomMode":true,"difficulty":"normal","blindMode":false,"quickEnd":false,"caretStyle":"default","paceCaretStyle":"default","flipTestColors":false,"layout":"default","funbox":"none","confidenceMode":"off","indicateTypos":"off","timerStyle":"mini","liveSpeedStyle":"off","liveAccStyle":"off","liveBurstStyle":"off","colorfulMode":false,"randomTheme":"off","timerColor":"main","timerOpacity":"1","stopOnError":"off","showAllLines":false,"keymapMode":"off","keymapStyle":"staggered","keymapLegendStyle":"lowercase","keymapLayout":"qwerty","keymapShowTopRow":"layout","fontFamily":"JetBrains_Mono","smoothLineScroll":false,"alwaysShowDecimalPlaces":false,"alwaysShowWordsHistory":false,"singleListCommandLine":"manual","capsLockWarning":true,"playSoundOnError":"off","playSoundOnClick":"9","soundVolume":"1.0","startGraphsAtZero":true,"showOutOfFocusWarning":true,"paceCaret":"pb","paceCaretCustomSpeed":1,"repeatedPace":true,"accountChart":["on","on","on","on"],"minWpm":"off","minWpmCustomSpeed":100,"highlightMode":"letter","typingSpeedUnit":"wpm","ads":"result","hideExtraLetters":false,"strictSpace":false,"minAcc":"off","minAccCustom":90,"monkey":false,"repeatQuotes":"off","oppositeShiftMode":"off","customBackground":"","customBackgroundSize":"cover","customBackgroundFilter":[0,1,1,1,1],"customLayoutfluid":"qwerty#dvorak#colemak","monkeyPowerLevel":"off","minBurst":"off","minBurstCustomSpeed":100,"burstHeatmap":true,"britishEnglish":false,"lazyMode":false,"showAverage":"off","tapeMode":"off","maxLineWidth":0}
Current Behavior
when typing in chinese, entire quote is treated as one word -> whenever space is pressed the test finishes, also every quote is in the short category.
Expected Behavior
could count every character excluding punctuation as a word
Steps To Reproduce
Environment
Anything else?
No response