dumblob / ULKL

Uniform Latin Keyboard Layouts - intuitive, nationalized, multiplatform, powerful, and basically 100% mutually compatible (also with Dvorak)
The Unlicense

Add more detailed explanation, and contribution instructions #4

Open waldyrious opened 1 year ago

waldyrious commented 1 year ago

I'm a little confused as to how this layout is supposed to work. The README says

typing č will use the same finger as typing c. The same holds for Č and C - even here, only one modifier is needed for both cases - the Shift.

...but I don't understand how one does then disambiguate between typing c and č. Can you explain in a bit more detail how one would use this layout in practice? A diagram or two would probably be helpful, too.

Also, it would be nice to have contribution instructions so that people could contribute to the project.

dumblob commented 1 year ago

First of all, thanks for chiming in!

// ----------- czech dvorak ltgt, 1. and 2. level
//  Ú   É   Á   Ó   Ě   Ů   Ý   Ď   Í   Č   Ř   Š   Ž
//  ú   é   á   ó   ě   ů   ý   ď   í   č   ř   š   ž
//
//          "   <   >   P   Y   F   G   C   R   L   ?   Ť   Ň
//          '   ,   .   p   y   f   g   c   r   l   /   ť   ň
//
//          A   O   E   U   I   D   H   T   N   S   _   LF
//          a   o   e   u   i   d   h   t   n   s   -   LF
//
//          :   Q   J   K   X   B   M   W   V   Z
//          ;   q   j   k   x   b   m   w   v   z

(from https://github.com/dumblob/ULKL/blob/master/platform/x11/symbols/czed )

By "same finger" I meant the same finger but a different physical key :wink:.
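To make the "same finger" idea concrete, here is a minimal Python sketch over the level-1 rows of the diagram above. The column-to-finger mapping in `finger()` is an assumption made for illustration (columns read off the diagram, both pinkies covering the outer columns). It is not taken from the ULKL sources, and `ROWS`, `finger`, and `base_letter` are hypothetical names.

```python
import unicodedata

# Level-1 rows of the czed diagram as (start_column, keys).
# Columns are counted in key widths from the leftmost top-row key.
ROWS = [
    (0, list("úéáóěůýďíčřšž")),
    (2, list("',.pyfgcrl/ťň")),
    (2, list("aoeuidhtns-")),
    (2, list(";qjkxbmwvz")),
]

# ASSUMPTION: which finger serves which diagram column (not from ULKL).
INNER_FINGERS = {
    3: "left ring", 4: "left middle", 5: "left index", 6: "left index",
    7: "right index", 8: "right index", 9: "right middle", 10: "right ring",
}

def finger(column: int) -> str:
    """Return the assumed finger for a diagram column."""
    if column <= 2:
        return "left pinky"
    if column >= 11:
        return "right pinky"
    return INNER_FINGERS[column]

def base_letter(ch: str) -> str:
    """Strip the diacritic mark, e.g. 'č' -> 'c'."""
    return unicodedata.normalize("NFD", ch)[0]

# Column of every key that appears in the diagram.
COLUMN = {key: start + i for start, keys in ROWS for i, key in enumerate(keys)}

def finger_pairs():
    """Yield (accented, base, same_finger) for each accented letter."""
    for key, col in COLUMN.items():
        base = base_letter(key)
        if base != key:
            yield key, base, finger(col) == finger(COLUMN[base])

for accented, base, same in finger_pairs():
    print(accented, base, "same finger" if same else "different finger")
```

With this assumed mapping, č sits on a different key than c but in a column served by the same finger, which is the point being made here.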

A diagram or two would probably be helpful, too.

That has been in the TODO (in the readme) for years and I never got to it. Will try to prioritize it now that you came here :wink:. Actually feel free to restructure the readme to be comprehensible for newcomers - it has never been done and it became a pile of mess due to my lack of time.

Also, it would be nice to have contribution instructions so that people could contribute to the project.

Good idea, thanks! Will do.


Btw. I use these layouts (us, de, cz, fi) on all computers I use (at work, at home, at friends' places, etc.). They have already saved me so much time and trouble.

dumblob commented 1 year ago

Also, it would be nice to have contribution instructions so that people could contribute to the project.

Done :wink: (see https://github.com/dumblob/ULKL#contributions ).

waldyrious commented 1 year ago

Done :wink: (see https://github.com/dumblob/ULKL#contributions ).

Nice, that was fast!

So, now that I understand this better, I realize it's actually a different (i.e. non-QWERTY) layout! I'll be honest, I'm way too lazy to learn a new layout so it's unlikely that I'll create one for Portuguese :sweat_smile:

Currently what I do is edit /usr/share/X11/xkb/symbols/pt to make some characters more accessible in my existing layout, by means of AltGr (3rd level) or Shift+AltGr (4th level) combinations, or even replacing the base ones. For example, I have remapped the «/» key that my keyboard has a native key for, with ←/→ as the base/Shift output, and «/» as the AltGr/Shift+AltGr output, and so on.
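As a rough illustration of the four levels described above, here is a small Python sketch that formats one xkb symbols entry. The key name `AE12` and the keysym names are assumptions for illustration and should be checked against the actual `pt` symbols file; `xkb_key` is a hypothetical helper, not part of any tool.

```python
def xkb_key(keycode: str, levels: list[str]) -> str:
    """Format one xkb symbols entry with up to four levels.

    Level order: base, Shift, AltGr, Shift+AltGr.
    """
    if not 1 <= len(levels) <= 4:
        raise ValueError("expected 1 to 4 levels")
    return f"key <{keycode}> {{ [ {', '.join(levels)} ] }};"

# ASSUMPTION: AE12 is the key in question and these keysym names apply.
# Arrows on the base/Shift levels, guillemets on AltGr/Shift+AltGr.
print(xkb_key("AE12", ["leftarrow", "rightarrow",
                       "guillemotleft", "guillemotright"]))
```

The printed line has the general shape of an entry in an xkb symbols file, with one keysym per level.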

In any case, I believe these contribution instructions will allow others to contribute, so hopefully this interaction has been a net positive regardless :)

dumblob commented 1 year ago

So, now that I understand this better, I realize it's actually a different (i.e. non-QWERTY) layout! I'll be honest, I'm way too lazy to learn a new layout so it's unlikely that I'll create one for Portuguese :sweat_smile:

Thanks - this is a very important point for me. Up until today I thought that the inaccessibility of certain characters was so problematic that it easily outweighed the cost of learning a new layout.

Currently what I do is edit /usr/share/X11/xkb/symbols/pt to make some characters more accessible in my existing layout, by means of AltGr (3rd level) or Shift+AltGr (4th level) combinations, or even replacing the base ones. For example, I have remapped the «/» key that my keyboard has a native key for, with ←/→ as the base/Shift output, and «/» as the AltGr/Shift+AltGr output, and so on.

I see. I am considering using this info to build a Portuguese layout pord.

In any case, I believe these contribution instructions will allow others to contribute, so hopefully this interaction has been a net positive regardless :)

It definitely is!

A new idea struck me now thanks to your "laziness" :wink: (see, laziness yields innovations!).

How about extending this project to use QWERTY as an alternative "base"?

I chose dvorak based on some wanna-be corpus analysis of the German, Finnish, and Czech languages (dvorak was hand-crafted for English based on WW2 corpora). Dvorak proved to be the most universal multilanguage layout, having the best ratio between comfort & typing speed (I can write on both QWERTY and dvorak, so I did not care) by a significant margin when compared to others (recent layouts such as Workman and Halmak are comparable, though).

But it is much more important to offer the multilingual comfort over forcing one to learn another layout (even if it is much better).

Do you think you could create porq (Portuguese QWERTY ULKL)?

QWERTY will require more "overloading" as it is not crafted to spread the finger load across fingers (unlike dvorak, halmak, workman, ...) but that is literally nothing compared to needing to use dead characters (i.e. type one or more whole additional strokes which are super difficult to reach to make matters worse) all the time to write a letter with diacritic marks.

waldyrious commented 1 year ago

I would definitely consider using a porq (QWERTY-based) layout, and I'm happy to share the tweaks I have made to my layout. However, as I'm currently fairly happy with my custom layout, I'm not sure I can prioritize the work of creating the full layout. I wonder if there could be an easier way to specify layouts for this project, by just providing the changes from the base, rather than specifying the entire layout. I guess this would be somewhat similar in effect to the way the xkb files are edited, since they inherit from other layouts so one doesn't need to define all keys in a given layout file. Does something like this sound feasible/desirable to you?

(Btw, I should point out that my current tweaks are not that many, and they have been made on an on-demand basis, so the result isn't really a wholly considered layout.)

dumblob commented 1 year ago

Does something like this sound feasible/desirable to you?

That is a very good question. I would very much like to make it easier for others to create ULKL layouts.

Actually the original plan 14 years ago was to prepare all existing layouts in one step and thus no contributions from others would be needed :wink:. That would avoid any such needs. It did not happen apparently - there are only 4 layouts so far :wink: (but I will do a Dutch one soon I think as I am about to start learning Dutch).

Back then I was fed up with this "sharing of pieces" among layouts (inheritance, etc.) because that is full of surprises for end users ("Wtf? I just pressed this and this happened."). So I deliberately crafted existing ULKL layouts in a way it is impossible or highly improbable to produce any other character than the ones defined by ULKL explicitly (see layout "visualizations" at the top of each definition file - e.g. https://github.com/dumblob/ULKL/blob/e867c10778b7fb0b37582274d397dc8f7d2adf2b/platform/x11/symbols/czed#L45-L56 ). Pressing such undefined keys or sequences shall always produce nothing/void.

But as I understand your request, it is basically just about lowering the burden for newcomers to make a ULKL layout.

This, though, does not necessitate any code sharing on the lowest level. It would be enough if there were some automation of the contribution steps I described - e.g. take an existing layout and just remove the layout-specific diacritic marked letters and maybe even pregenerate the whole alphabet as a comment to make it easier for the creator to just take them and assign to logical places.
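A minimal Python sketch of what such automation could look like, assuming the layout "visualization" is a plain text comment row and that blanked slots are marked with `_`. Both function names are hypothetical and nothing like this exists in the repository as far as this thread shows.

```python
def blank_accented(row: str, filler: str = "_") -> str:
    """Replace every non-ASCII letter in a diagram row with a filler."""
    return "".join(
        filler if ch.isalpha() and not ch.isascii() else ch
        for ch in row
    )

def alphabet_comment(letters: str) -> str:
    """Build a comment line listing upper and lower case forms to assign."""
    pairs = [f"{ch.upper()} {ch}" for ch in letters]
    return "// to assign: " + "   ".join(pairs)

# Start from a row of an existing layout and clear its national letters.
print(blank_accented("//  ú   é   á   ó   ě"))

# Pregenerate the accented letters of the target language (example input).
print(alphabet_comment("áâãàçéêíóôõú"))
```

The creator of a new layout would then only have to move letters from the generated comment into the blanked slots.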

OTOH reading what I wrote, it is not much work to do all these automatable steps manually. I do not have any QWERTZ baseline (there is only a dvorak baseline) but I can create one in no time for you (including a Portuguese alphabet). May I, or did you have something else in mind?

waldyrious commented 1 year ago

As long as adapting the base layout to Portuguese would entail simply copying the QWERTY (or QWERTZ) file and making changes in the keys to be replaced, sure.

But thinking about this some more, I'm not really sure that my tweaks would justify a separate layout. The only keys I have replaced in the base or shift levels are «/», as I mentioned above, and Ins, which I changed to produce a non-dead backtick (`).

The other changes I made are all in the AltGr or Shift+AltGr levels, since I do need the characters in those keys' base and shift levels. For example, I changed the -/_ key to have an em dash (—) and an en dash (–) as its 3rd and 4th level output, respectively, because, well, I do need the - and _ characters. The other changes I have are all like that. So I think those wouldn't fit this project's philosophy of making the characters available only via Shift, right?

dumblob commented 1 year ago

As long as adapting the base layout to Portuguese would entail simply copying the QWERTY (or QWERTZ) file and making changes in the keys to be replaced, sure.

:+1:

So I think those wouldn't fit this project's philosophy of making the characters available only via Shift, right?

Correct. Though with QWERTZ I meant the US QWERTZ and not the Portuguese QWERTZ. I also think we can skip certain QWERTZ layout characters in 1st and 2nd levels if there is an utter need to do so (especially with punctuation characters - the guiding principle could be looking at the top row of dvorak: all those special characters/symbols could be omitted from the QWERTZ ULKL base 1st & 2nd levels, as they are omitted from dvorak ULKL, and appear only in 3rd & 4th levels, as is the case in dvorak ULKL).

Thoughts?

dumblob commented 1 year ago

Btw. regarding em dash and en dash there is a general consensus that either minus (-) is acceptable (see related notes in the readme) in such cases or that SW shall do automated translation of these characters or that other notation is preferred (e.g. double minus will be converted to a given type of dash).
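The "double minus" notation can be sketched in a few lines of Python. The mapping used here (`--` to an en dash, `---` to an em dash) is an assumed convention borrowed from LaTeX-style input; the comment above does not say which dash a given editor produces.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

def convert_dashes(text: str) -> str:
    """Turn '---' into an em dash and '--' into an en dash."""
    # Handle the longer run first so '---' is not read as '--' plus '-'.
    text = text.replace("---", EM_DASH)
    return text.replace("--", EN_DASH)

print(convert_dashes("pages 3--5, see below --- or not"))
```

With a rule like this in the editor, the layout itself only needs the plain minus key.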

waldyrious commented 1 year ago

Though with QWERTZ I meant the US QWERTZ and not the Portuguese QWERTZ.

Why US? And why QWERTZ and not QWERTY? Not that I have strong preferences otherwise, I'm just curious.

I also think we can skip certain QWERTZ layout characters in 1st and 2nd levels if there is utter need to do so (especially with punctuation characters - the guiding principle could be looking at the top row of dvorak and all those special characters/symbols could be omitted from the QWERTZ ULKL base 1st & 2nd levels as they are omitted from dvorak ULKL and make them appear only in 3rd & 4th levels as is the case in dvorak ULKL).

Hm, could you give me an example or two of such symbols that would be moved to 3rd/4th levels?

dumblob commented 1 year ago

Why US?

Well, except for "QWERTZ" layout, all latin-based alphabets have layouts based on US with no changes to alphabet characters (there are sometimes very significant changes to symbols, dead characters, etc. but this is exactly what we want to unify across all latin-based alphabets, so I have to choose something and US is undoubtedly the best supported layout in the world considering QWERTY comes from USA; there is even a small historical relation between US QWERTY and ASCII table :wink:).

And why QWERTZ and not QWERTY? Not that I have strong preferences otherwise, I'm just curious.

That was a typo, of course I meant QWERTY all the time :wink:. As I write dvorak, I needed to look at the (non-dvorak) labels on my physical keyboard to type the sequence. But as it turns out I do not have the US keyboard but a German keyboard here which is QWERTZ and not QWERTY :laughing:.

That is the side effect of writing with a unified principle - one forgets what the physical keyboards look like, regardless of which one it is :wink:.

Hm, could you give me an example or two of such symbols that would be moved to 3rd/4th levels?

The czed ULKL layout seems most difficult of all latin-based languages and I consider that to be the "worst case" effectively defining all the physical keys all ULKL layouts will need to sacrifice in 1st & 2nd levels. Those keys are the whole top row and the two rightmost keys on the second from top row on US ANSI physical keyboard.

So the list of sacrificed characters from 1st level (i.e. those moved to 3rd level of other physical keys in ULKL) are: ` 1 2 ... 9 0 [ ] = \

And the list of sacrificed characters from 2nd level (i.e. those moved to 3rd level of the same physical keys in ULKL) are: ~ ! @ # $ % ^ & * ( ) { } + |

So in QWERTY we should sacrifice the very same choice of physical keys (disregarding which characters are there on the US QWERTY).
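The rule "whole top row plus the two rightmost keys of the second row" can be applied mechanically. The Python sketch below does this for the standard US Dvorak and US QWERTY character rows; the row strings and the `sacrificed` helper are written out here for illustration and are not taken from the ULKL files.

```python
# Level-1 and level-2 characters of the two affected rows, left to right,
# on a US ANSI keyboard (13 keys in each row).
DVORAK = {
    "top":    ("`1234567890[]", "~!@#$%^&*(){}"),
    "second": ("',.pyfgcrl/=\\", "\"<>PYFGCRL?+|"),
}
QWERTY = {
    "top":    ("`1234567890-=", "~!@#$%^&*()_+"),
    "second": ("qwertyuiop[]\\", "QWERTYUIOP{}|"),
}

def sacrificed(layout: dict, level: int) -> list[str]:
    """Characters on the whole top row plus the two rightmost second-row keys."""
    top = layout["top"][level - 1]
    second = layout["second"][level - 1]
    return list(top) + list(second[-2:])

for name, layout in (("dvorak", DVORAK), ("qwerty", QWERTY)):
    for level in (1, 2):
        print(name, level, " ".join(sacrificed(layout, level)))
```

For Dvorak this reproduces the two lists above. For QWERTY the same physical keys carry different characters, so under this sketch `-`, `=` and `]` would be displaced while `[` would stay in place.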

waldyrious commented 1 year ago

Thanks for the explanations! Ok, so I think I've got a reasonable grasp of what these layouts are supposed to look like. Honestly, I wouldn't be willing to give up convenient access to some keys (like numbers and parentheses) just to access language-specific diacritics that I can already produce with dead keys plus the base letters. Especially because in Portuguese, the diacritics are almost exclusively placed on the vowels, and there are multiple possible diacritics per vowel (E has two, and A/O have three, or four if you count ª/º). In contrast, for Czech, looking at the czed layout, it seems that the only base letters that take more than one diacritic are E and U, so the "natural" places for these diacritics don't deviate much from what one might consider intuitive/obvious.

So in Portuguese IMO it makes more sense to use the dead key system to make combinations, rather than hardcode all the diacritical combinations to make them accessible with dedicated keys — doing so would put the accented letters in unintuitive places, and would displace most of the symbols out of the first and second levels. The trade-off doesn't seem to be worth it, IMO. If anything, I would go in the opposite direction (probably because I am a programmer): make symbols that are commonly used in source code, like []{}`^~@, accessible via the first and second levels, although I'm not really sure what I'd sacrifice to do so :thinking: For reference, here's what the physical Portuguese QWERTY layout looks like:

[image: physical Portuguese QWERTY keyboard layout]

And here's what I customized it to be, courtesy of gkbd-keyboard-display, with my changes highlighted:

[image: customized layout shown in gkbd-keyboard-display, with changes highlighted]

Below are the actual edits I made relative to the default layout that came with my system to produce the custom layout shown above:

[image: edits relative to the default layout]

dumblob commented 1 year ago

So in Portuguese IMO it makes more sense to use the dead key system to make combinations, rather than hardcode all the diacritical combinations to make them accessible with dedicated keys — doing so would put the accented letters in unintuitive places, and would displace most of the symbols out of the first and second levels.

I checked it now and I think this would work pretty well:

à â ã á ç é ê õ ú í ó ô _
    q w e r t y u i o p _ _ _
    a s d f g h j k l _ _
    z x c v b n m _ _ _

(of course, you should now decide which of ó ô õ à â ã á are used most commonly and assign them to the positions closest to the finger accordingly; but there could be another exception for ç if it is used vastly less frequently than either of é ê - a Portuguese corpus or at least some books & newsletters would tell you)

The result is only 4 characters not being written by the same finger (same finger is the single major decisive factor for unification). In czed the result is 5 such characters. So porq (Portuguese QWERTY) is definitely easier to remember :wink:.

The trade-off doesn't seem to be worth it, IMO. If anything, I would go in the opposite direction (probably because I am a programmer): make symbols that are commonly used in source code, like []{}`^~@, accessible via the first and second levels,

Oh, this is the huge misunderstanding :wink:. The ultimate goal of ULKL is to switch between alphabet layouts for different workflows or parts of text (and not think about the layout differences - that is the power of unified layouts!). Yay, get modal with layouts!

This avoids all the trade-offs you talk about :wink:.

For programming I use engd, for writing in German I use gerd, for unicode writing in German I use gerd but the writer variant, for writing in Czech I use czed etc. When I write a paper in LaTeX, I mostly use the national layout but for math I switch to engd. Just one or two strokes to switch the layout. Moreover, your code editor can also switch it for you automatically when gaining/losing window focus or based on some other triggers (e.g. math mode).

Yes, there are always 2 variants of a ULKL layout - the default and the writer version, which offers a few frequent typographical characters in better places. So I think I have you covered with ULKL :smile: .

In practice though I rarely used the writer variant as all writing editors support auto-translation of incorrect sequences to the correct typographical symbols (en dash, em dash, double quotes, french quotes, etc.). Actually I used the writer variant for all the languages a lot in the beginning but it turned out to produce unreadable garbage - especially at the receiving side (e.g. many mail clients still do not support unicode correctly, etc.).

waldyrious commented 1 year ago

Oh, I see, the idea is not to use the same layout all the time, but switch between layouts depending on the usage. I'm still not sure I see myself switching to this system, given that it would require retraining my habits :sweat_smile:

In any case, if it helps, for Portuguese the order would be roughly a > á > ã > à > â and o > ó > õ > ô.