Closed developomp closed 2 weeks ago
Hi, and thanks for all the information! I think I'll be able to spin up Korean text generation pretty easily now, but first I'd like to make sure that Hangul input works nicely. When implementing Arabic support for the text widgetry, I added a way to indicate when a character has been partially typed (it's marked with an underline), so I just have to flesh that out a little more so it plays along with the underlying Hangul input system as well. When I have done that, I'll add Korean to the text generation infrastructure and notify you.
Hi, the Korean implementation is ready for testing! If you build the app from the main branch, it should be available in the Text Language dialog. You didn't say anything specific about Korean punctuation conventions, so for now it just uses the same rules as English. Let me know if anything should be changed.
For some reason which I've yet to investigate, the application crashes as soon as I type anything in Korean (using ibus-hangul). This is regardless of the keypunch language settings. It's 1 AM here so I'll go to bed for now.
Logs:
Application started at 12:59:16 AM
thread 'main' panicked at src/widgets/text_view/input.rs:105:25:
already borrowed: BorrowMutError
stack backtrace:
0: rust_begin_unwind
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
1: core::panicking::panic_fmt
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
2: core::cell::panic_already_borrowed
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/cell.rs:770:5
3: core::cell::RefCell<T>::borrow_mut
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/cell.rs:1061:25
4: keypunch::widgets::text_view::input::<impl keypunch::widgets::text_view::imp::KpTextView>::push_typed_text
at ./Dev/projects/keypunch/src/widgets/text_view/input.rs:105:9
5: keypunch::widgets::text_view::input::<impl keypunch::widgets::text_view::imp::KpTextView>::setup_input_handling::{{closure}}
at ./Dev/projects/keypunch/src/widgets/text_view/input.rs:19:17
6: gtk4::auto::im_context::IMContextExt::connect_commit::commit_trampoline
at ./.var/app/org.gnome.Builder/cache/gnome-builder/projects/keypunch/builds/dev.bragefuglseth.Keypunch.Devel.json-flatpak-org.gnome.Platform-46-x86_64-main/cargo-home/registry/src/index.crates.io-6f17d22bba15001f/gtk4-0.8.2/src/auto/im_context.rs:270:13
7: g_cclosure_marshal_VOID__STRINGv
8: <unknown>
9: <unknown>
10: g_signal_emit_by_name
11: g_cclosure_marshal_VOID__STRINGv
12: <unknown>
13: <unknown>
14: g_signal_emit_valist
15: g_signal_emit
16: <unknown>
17: <unknown>
18: gtk4::auto::im_context::IMContextExt::reset
at ./.var/app/org.gnome.Builder/cache/gnome-builder/projects/keypunch/builds/dev.bragefuglseth.Keypunch.Devel.json-flatpak-org.gnome.Platform-46-x86_64-main/cargo-home/registry/src/index.crates.io-6f17d22bba15001f/gtk4-0.8.2/src/auto/im_context.rs:177:13
19: keypunch::widgets::text_view::caret::<impl keypunch::widgets::text_view::imp::KpTextView>::update_caret_position
at ./Dev/projects/keypunch/src/widgets/text_view/caret.rs:137:17
20: keypunch::widgets::text_view::imp::KpTextView::typed_text_changed
at ./Dev/projects/keypunch/src/widgets/text_view.rs:160:13
21: keypunch::widgets::text_view::input::<impl keypunch::widgets::text_view::imp::KpTextView>::setup_input_handling::{{closure}}
at ./Dev/projects/keypunch/src/widgets/text_view/input.rs:36:17
22: gtk4::auto::im_context::IMContextExt::connect_preedit_changed::preedit_changed_trampoline
at ./.var/app/org.gnome.Builder/cache/gnome-builder/projects/keypunch/builds/dev.bragefuglseth.Keypunch.Devel.json-flatpak-org.gnome.Platform-46-x86_64-main/cargo-home/registry/src/index.crates.io-6f17d22bba15001f/gtk4-0.8.2/src/auto/im_context.rs:330:13
23: <unknown>
24: <unknown>
25: g_signal_emit_by_name
26: <unknown>
27: <unknown>
28: g_signal_emit_valist
29: g_signal_emit
30: g_closure_invoke
31: <unknown>
32: <unknown>
33: g_signal_emit_valist
34: g_signal_emit
35: <unknown>
36: g_closure_invoke
37: <unknown>
38: <unknown>
39: g_signal_emit_valist
40: g_signal_emit
41: <unknown>
42: <unknown>
43: <unknown>
44: <unknown>
45: g_main_context_iteration
46: <unknown>
47: <unknown>
48: <unknown>
49: <unknown>
50: <unknown>
51: <unknown>
52: <unknown>
53: <unknown>
54: <unknown>
55: g_closure_invoke
56: <unknown>
57: <unknown>
58: g_signal_emit_valist
59: g_signal_emit
60: <unknown>
61: <unknown>
62: <unknown>
63: <unknown>
64: g_main_context_iteration
65: g_application_run
66: gio::application::ApplicationExtManual::run_with_args
at ./.var/app/org.gnome.Builder/cache/gnome-builder/projects/keypunch/builds/dev.bragefuglseth.Keypunch.Devel.json-flatpak-org.gnome.Platform-46-x86_64-main/cargo-home/registry/src/index.crates.io-6f17d22bba15001f/gio-0.19.5/src/application.rs:29:13
67: gio::application::ApplicationExtManual::run
at ./.var/app/org.gnome.Builder/cache/gnome-builder/projects/keypunch/builds/dev.bragefuglseth.Keypunch.Devel.json-flatpak-org.gnome.Platform-46-x86_64-main/cargo-home/registry/src/index.crates.io-6f17d22bba15001f/gio-0.19.5/src/application.rs:22:9
68: keypunch::main
at ./Dev/projects/keypunch/src/main.rs:58:5
69: core::ops::function::FnOnce::call_once
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Application exited
That's weird. Could you provide a sample text so I can try to reproduce it? Also, are you sure this happens with the latest development version?
From analyzing the stack trace you provided, I think I've managed to identify and fix the issue. It was an unusually tough one! Could you try again now?
Seems like we're not using the exact same input method setup, by the way. Have you configured your hangul-ibus in any way, or are you using the default preferences? Due to the way things in the app are set up now, you may or may not encounter problems with inserting the correct text even if it doesn't crash…
Tested 1c7b49727d5b985e8ce779aa8d2a50573990a719 with the following ibus-hangul settings.
Keypunch does not crash anymore, but pressing a key marks the syllable as incorrect and moves on the the next one.
e.g. when typing out 나비, pressing ㄴ marks 나 as incorrectly typed. Pressing ㅏ then marks 비 incorrect and so on.
That was actually what I expected to happen, and I think I know why now. Keypunch apparently misuses the IMContext.reset()
function a little, so I'll have to reduce its usage to where it's appropriate :)
Alright, try now. If it was enough to limit the use of IMContext.reset()
, it should work. I suspect we may need to to invoke IMContext.set_cursor_location()
less too.
Tested commit defec44b157a74a57962205927694e7839255f71
It appears that some of the keystrokes are not picked up by keypunch at all. In the video I'm typing "새 없다 대부분 힘들다 적 여기 비하다" but that's not at all what the app sees.
It appears that some of the keystrokes are not picked up by keypunch at all.
It looks like that, but Keypunch should actually be well aware of what's going on. While you're "working on" a letter, ibus-hangul sends the current state of the letter to a special buffer called the "preedit string", and Keypunch listens to this as well to indicate that the letter is partially typed. This just isn't included in self.typed_text
, even though typed_text_changed()
is run every time the preedit string is changed as well.
I tried to reconstruct the first part of your example video myself, and weirdly enough it appears to work perfectly for me:
From studying the recording, it seems like the "swallowed" keystrokes are ones right after the completion of the previous letter. Does this only happen in Keypunch?
If you replace your debug println statement with this, you'll get a "log" of the state at each keystroke including the preedit string:
println!("{}/ {}", self.typed_text.borrow(), self.input_context.borrow().as_ref().unwrap().preedit_string().0);
Here's my log for the string 새 없다 대부분
:
/
/ ㅅ
/ 새
새/ 새
새/
새 /
새 / ㅇ
새 / 어
새 / 업
새 / 없
새 없/ 없
새 없/ ㄷ
새 없/ 다
새 없다/ 다
새 없다/
새 없다 /
새 없다 / ㄷ
새 없다 / 대
새 없다 / 댑
새 없다 대/ 댑
새 없다 대/ 부
새 없다 대/ 붑
새 없다 대부/ 붑
새 없다 대부/ 부
새 없다 대부/ 분
새 없다 대부분/ 분
If you do the same, we can compare them.
That's the final potential solution I can come up with right off the top of my head
Commit: bdca1a7a572990d9e8ad083efe34267920b6c3f0
From studying the recording, it seems like the "swallowed" keystrokes are ones right after the completion of the previous letter. Does this only happen in Keypunch?
Yes.
Here's a video with better view of my ibus-hangul config.
Seems like we're using the exact same input method configuration. It's weird that it behaves differently. Maybe there is an open issue for it in ibus-hangul's issue tracker? I assume this is Keypunch's fault, though.
The problem might actually be that the reset
function is called on the IM at all. Reading through ibus-hangul's source code and issues, it looks like its current approach is pretty fragile and involves some data-races/timing issues. Let's see if this works anyways.
I've dropped all kinds of IM context resetting on the main branch now, and cross my fingers hoping that it works!
Yeeeah, pretty sure I've found the culprit. TL;DR: ibus-hangul sometimes commits text while keaving the beginning of the next letter in the preedit buffer (which is reasonable), but then Keypunch called reset()
on ibus-hangul, essentially telling it to remove the preedit text. This is super specific and probably only applies to Korean, but I'm glad we could have the misuse resolved.
Where it should be reset is when the session ends, so there aren't any leftovers in the preedit buffer before next round starts. I've fixed that too.
Assuming that this works, all that’s left now is fixing WPM calculation for Korean (the currently reported values are probably 50-60% too low). Shouldn’t be too hard, as I just need to pull in hangeul-rs
and decompose any Hangul characters before counting the typed letters.
I'm away from home without my laptop. I'll test the code this friday as soon as I get back home.
Thank you for your effort despite the delay and seemingly endless roadbumps. Seriously, I have no idea how you manage to code for a language you've never spoken before.
Do you have paypal or a page at buymeacoffee.com? I'd like to buy you a coffee.
The code's working properly now :+1:
The only thing I can nitpick is that I can't see my mistakes, but that's a whole new topic. This issue can be closed now. :tada:
The code's working properly now 👍
Great to hear! If you’d like to, you can provide a name (and optionally a website), and I’ll credit you in the Orthography section of the about window.
Seriously, I have no idea how you manage to code for a language you've never spoken before.
Most of the heavy lifting is done by the ibus engines of the relevant languages, since they interpret keypresses and decide when letters should be committed to the actual buffer and what should be kept in the preedit string. Keypunch just needs to communicate properly with the engine to get/set the correct information, but since I’m new to how ibus works internally, there have been a few hurdles on the way there 😂
Do you have paypal or a page at buymeacoffee.com? I'd like to buy you a coffee.
Really appreciate the offer! I’m in a privileged economical situation right now, though, and it would feel wrong to accept money through donations/tips. Save it for somebody else whose work you appreciate 😇
This was the last roadblock I wanted to get past before doing a new release, I’ll prepare one in the coming days.
If you’d like to, you can provide a name (and optionally a website), and I’ll credit you in the Orthography section of the about window.
Name: Kim, Jimin (aka developomp) Website: https://developomp.com
English Name
Korean
Native Name
한국어
Syntax
there are 24 letters in the Korean language. 14 consonants: ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ 10 vowels: ㅏㅑㅓㅕㅗㅛㅜㅠㅡㅣ
some vowels can combine to form composite vowels. e.g. ㅓ + ㅣ = ㅔ ㅕ + ㅣ = ㅖ ㅏ + ㅣ = ㅐ ㅑ + ㅣ = ㅒ
At least one consonant and one vowel combine to form a syllable.
source: https://abcdefridays.blogspot.com/2016/04/korean-alphabet-and-typography.html
e.g. consonant ㄴ (n) and vowel ㅏ (a) combine to become syllable 나 (na) consonant ㅂ (b) and vowel ㅣ (i) combine to become syllable 비 (bi) syllable 나 (na) and 비 (bi) combine to form word 나비 (nabi, butterfly)
the position of the letters are automatically determined by the computer depending on the input order. e.g. ㄱ + ㅏ = 가 ㄱ + ㅏ + ㅁ = 감 ㅁ + ㅏ + ㄱ = 막 ㄱ + ㅜ + ㅔ = 궤 ㄷ + ㅏ + ㄹ + ㄱ = 닭
The process of typing 나비 is as follows:
https://github.com/bragefuglseth/keypunch/assets/40858122/a16ddc81-67ce-4e38-8f56-425a1fa3871f
Since the syllable 나 is not saved as ㄴ + ㅏ in utf-8, and it is its own separate character, we need special logic to prevent keypunch from marking a syllable as invalid as we're typing them out.
I suggest disassembling each character (a korean syllable) to its letter components then comparing the resulting array of letters. Here is a typescript implementation of the said logic.
Implementation Assistance
Additional Information
Technically, the name of the Korean language is Hangul (한글). However, it is more common for it to be called 한국어 (which literally translates to “Korean Language”) in a computer/internet context, so that's what I'm going with.