Typographic punctuation?

yurikhan commented 7 months ago

What’s your stance on curly quotes, guillemets, en/em dashes and other non-ASCII punctuation? I see the ellipsis … on the top left key ↓ and degree sign ° on bottom right ↗.

MessagEase puts frequently used non-ASCII characters on the swipe-and-return gestures corresponding to punctuation:

| char | * | description | key | gesture | base char |
|---|---|---|---|---|---|
| `÷` | | U+00F7 Division sign | top left | →← | `-` |
| `¥` | | U+00A5 Yen sign | top left | ↙↗ | `$` |
| `‘` | * | U+2018 Left single quotation mark | top | ↖↘ | `` ` `` |
| `ˇ` | | U+02C7 Caron | top | ↑↓ | `^` |
| `’` | * | U+2019 Right single quotation mark | top | ↗↙ | `´` |
| `×` | * | U+00D7 Multiplication sign | top | ←→ | `+` |
| `¡` | es | U+00A1 Inverted exclamation mark | top | →← | `!` |
| `–` | * | U+2013 En dash | top | ↙↗ | `/` |
| `—` | * | U+2014 Em dash | top | ↘↖ | `\` |
| `¿` | es | U+00BF Inverted question mark | top right | ←→ | `?` |
| `±` | * | U+00B1 Plus-minus sign | top right | ↓↑ | `=` |
| `£` | | U+00A3 Pound sign | top right | ↘↖ | `€` |
| `‰` | | U+2030 Per mille sign | left | ↗↙ | `%` |
| `¬` | | U+00AC Not sign | left | ↘↖ | `_` |
| `¶` | | U+00B6 Pilcrow sign | right | ↖↘ | `\|` |
| `ª` | es | U+00AA Feminine ordinal indicator | right | ↙↗ | `@` |
| `˜` | | U+02DC Spacing tilde | bottom left | ↖↘ | `~` |
| `˝` | | U+02DD Double acute accent | bottom left | ↑↓ | `¨` |
| `‹` | | U+2039 Single left-pointing angle quotation mark | bottom left | ←→ | `<` |
| `†` | | U+2020 Dagger | bottom left | →← | `*` |
| `“` | * | U+201C Left double quotation mark | bottom | ↖↘ | `"` |
| `”` | * | U+201D Right double quotation mark | bottom | ↗↙ | `'` |
| `‚` | | U+201A Single low-9 quotation mark | bottom | ↙↗ | `,` |
| `…` | * | U+2026 Horizontal ellipsis | bottom | ↓↑ | `.` |
| `„` | | U+201E Double low-9 quotation mark | bottom | ↘↖ | `:` |
| `§` | * | U+00A7 Section sign | bottom right | ↑↓ | `&` |
| `º` | | U+00BA Masculine ordinal indicator | bottom right | ↗↙ | `°` |
| `£` | | U+00A3 Pound sign | bottom right | ←→ | `#` |
| `›` | | U+203A Single right-pointing angle quotation mark | bottom right | →← | `>` |

I have bolded those I consider essential for day-to-day English usage (the * column carries the same information). Some of the above are specific to Spanish (marked es) and other languages.

Other languages such as German and French also need « (U+00AB Left-pointing double angle quotation mark) and » (U+00BB Right-pointing double angle quotation mark) which MessagEase puts into the compose table (< < compose and > > compose).

I’m not sure spacing diacritics have any use in the absence of a compose mechanism. (Well, except for ` (U+0060 Grave accent) that found its way into programming and markup languages.)

nightkr commented 7 months ago

A PR would be welcome for this; you should be able to add them as shift modifiers to https://github.com/nightkr/flickboard/blob/456973db01b8cae2340a4ba751ab798793512668/app/src/main/java/se/nullable/flickboard/model/layouts/NumMessagEase.kt#L13.

That said, as an aside, I'd question the idea that the distinction between ", “, and ” is critical for typing in English; it's not like your typical QWERTY keyboard has the latter two. Nor do I think anyone is going to notice the difference between a - and a – when reading a text.

yurikhan commented 7 months ago

> it's not like your typical QWERTY keyboard has the latter two.

My keyboard is not typical, not QWERTY, and does have them :)

> Nor do I think anyone is going to notice the difference between a - and a – when reading a text.

I do. Even if those reading don’t, I’m uncomfortable writing “simplified” punctuation.

Generally, I consider the decline of typographic punctuation a regression caused by the technical limitations of the typewriter and further propagated by the QWERTY inertia. If we’re chucking the QWERTY legacy, we should be getting rid of these limitations as well.

A PR will follow.

asdkant commented 7 months ago

I agree with the ethos of "we're throwing away the QWERTY legacy so why not get away from the limitations", but... we DO have limitations here: space is limited. We could add more layers/modes, or add alt layouts for the less commonly used typographic symbols, but I'm not sure what a fundamental change would look like. It's a UX discussion worth having and I'm interested in your thoughts on how this could be achieved.

> I have bolded those I consider essential for day-to-day English usage. Some of the above are specific to Spanish and other languages.

With the dark theme it's not noticeable which ones are in bold, you may want to note them differently.

yurikhan commented 7 months ago

Re: limited space

Early typewriter layouts conserved space to the degree that they didn’t have dedicated 1 or 0 digits, on the basis that letters I and O are sufficient to represent those. This mistake was subsequently corrected.

The US ANSI QWERTY computer keyboard has 47 keys dedicated to character input, plus the Space bar. This is exactly sufficient to type the 95 printable ASCII characters using the two gestures that have been available since the typewriter times — standalone and shifted press/release. The UK layout needed one additional non-ASCII character (£); the ISO physical keyboard has one more key (between left Shift and Z). Coincidence? I think not.

The German alphabet has vowels with umlauts and the Eszett letter. The DE variant of QWERTY is typically used over the ISO keyset, puts those additional letters on the [{, ;: and '" keys, moves some punctuation around, and introduces a third gesture, the AltGr modifier key.

So, conjecture: space grows as new needs are discovered.

The space of an ISO keyset with two modifier keys is theoretically 49×4 = 196 characters; AFAIK the Shift+AltGr subspace goes largely unused, so the actual use is closer to, let’s say, about 150.

A MessagEase-like keyboard in the “letters only” configuration gives us 10 keys × 19 gestures (1 tap, 2 circles, 8 linear swipes, 8 swipe-and-returns) = 190 theoretical cap, very close to the full limit of a two-modifier ISO layout. The numeric layer makes typing long numeric sequences easier and gives an escape hatch to access punctuation shadowed by the letter layout.
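The capacity figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the constant names are made up for illustration, and the figure of 49 ISO keys is taken as stated in this comment (presumably 48 character keys plus Space).

```python
# Capacity arithmetic for the three input spaces discussed above.
# All numbers come from the comment itself; the names are illustrative.

ANSI_CHAR_KEYS = 47                      # US ANSI character keys, excluding Space
ascii_capacity = ANSI_CHAR_KEYS * 2 + 1  # plain + shifted press, plus Space

# Printable ASCII runs from U+0020 (space) to U+007E (tilde) inclusive.
printable_ascii = len(range(0x20, 0x7F))

ISO_KEYS = 49                            # as counted in the comment
iso_capacity = ISO_KEYS * 4              # plain, Shift, AltGr, Shift+AltGr

GESTURES = {
    "tap": 1,
    "circle": 2,
    "linear swipe": 8,
    "swipe-and-return": 8,
}
gestures_per_key = sum(GESTURES.values())
messagease_capacity = 10 * gestures_per_key  # 10 keys in "letters only" mode

print(ascii_capacity, printable_ascii)   # 95 95
print(iso_capacity)                      # 196
print(gestures_per_key, messagease_capacity)  # 19 190
```

The 190 versus 196 comparison is a theoretical ceiling in both cases; it says nothing about how comfortable each gesture is to perform.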

The original MessagEase implementation also has a compose mechanism that expands the available space even further.

My thought here is that the MessagEase solution to the UX question is close to optimum. As I understand it:

- For each national letter layout, we place the most frequent letters on primary tap, then swipes from the outer keys to center and center to outer keys, then, if necessary, swipes between outer keys.
- Circular swipes are reserved for capitals and digits.
- Swipe-and-return gestures type the capital version of the letter typed by the corresponding linear swipe.
- Punctuation is mostly common to all letter layouts, being inherited from the common numeric layout.
  - Frequently used punctuation characters go on swipes between outer keys, and from outer keys outwards.
  - Less frequently used punctuation gets the swipe-and-return gesture. The character typed by the extended gesture is somewhat mnemonically tied to the basic character.
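The mnemonic tie between a linear swipe and its swipe-and-return variant can be sketched as two lookup tables. The entries below are copied from the table earlier in the thread, reading each gesture's first arrow as the direction of the underlying linear swipe (an inference from that table). The data structure, the names `BASE`, `EXTENDED` and `resolve`, and the fallback to the base character are assumptions for illustration, not how FlickBoard or MessagEase actually models a layout.

```python
# Hypothetical miniature layout: (key, swipe direction) -> base character.
BASE = {
    ("top", "↙"): "/",
    ("top", "↘"): "\\",
    ("bottom", "↓"): ".",
    ("bottom", "↖"): '"',
    ("top right", "←"): "?",
}

# Swipe-and-return extension: base character -> typographic relative.
EXTENDED = {
    "/": "\u2013",   # En dash
    "\\": "\u2014",  # Em dash
    ".": "\u2026",   # Horizontal ellipsis
    '"': "\u201c",   # Left double quotation mark
    "?": "\u00bf",   # Inverted question mark
}


def resolve(key, direction, returned=False):
    """Return the character for a swipe, or for a swipe-and-return.

    Falling back to the base character when no extension exists is an
    assumption made for this sketch.
    """
    base = BASE[(key, direction)]
    if not returned:
        return base
    return EXTENDED.get(base, base)


print(resolve("top", "↙"))                 # /
print(resolve("top", "↙", returned=True))  # –
```

Keeping the extension keyed by base character rather than by position means a letter layout that moves `/` elsewhere would carry the en dash along with it.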

One place where I could see it being worth deviating from the existing solution is making quotation marks language-specific. The common thing is that each language has a pair of first-level quotation marks, and if a quote occurs within quoted text then second-level quotation marks are used. The difference is in the exact code points used for the first and second levels. English uses “” ‘’, German „“ ‚‘, French «» “”, Russian «» „“; see Wikipedia for the full story. So I’d maybe reserve four spots in the layout for the paired quotation marks, populate them in the default layout with “” ‘’, and leave a guidance remark for national layout contributors to shadow those with their respective quotation marks.

Oops, that was a bad idea. In English, the closing second level quote doubles as the apostrophe, but in French they are distinct.
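To make the conflict concrete, here is a minimal Python sketch of the four quote pairs listed above. The table `QUOTES` and both helper functions are hypothetical names for illustration; spacing conventions inside guillemets are ignored.

```python
# Per-language quotation marks as listed above:
# language -> ((first-level open, close), (second-level open, close))
QUOTES = {
    "en": (("\u201c", "\u201d"), ("\u2018", "\u2019")),
    "de": (("\u201e", "\u201c"), ("\u201a", "\u2018")),
    "fr": (("\u00ab", "\u00bb"), ("\u201c", "\u201d")),
    "ru": (("\u00ab", "\u00bb"), ("\u201e", "\u201c")),
}

APOSTROPHE = "\u2019"  # Right single quotation mark, also used as apostrophe


def quote(text, lang, level=1):
    """Wrap text in the first- or second-level quotation marks of lang."""
    opening, closing = QUOTES[lang][level - 1]
    return f"{opening}{text}{closing}"


def apostrophe_shares_slot(lang):
    """True if the closing second-level quote is the same code point as the apostrophe."""
    return QUOTES[lang][1][1] == APOSTROPHE


print(quote("Hallo", "de"))           # „Hallo“
print(quote("hello", "en", level=2))  # ‘hello’
print(apostrophe_shares_slot("en"))   # True
print(apostrophe_shares_slot("fr"))   # False
```

Of the four languages in this table, only English has the apostrophe in one of the four quote slots; a layout that shadows those slots for any of the others would lose the apostrophe unless it gets a spot of its own.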

nightkr commented 7 months ago

To preempt the discussion here: MessagEase supports them. If someone makes a PR with the ME layout, I'll merge it. If someone comes up with a different symbol layout they prefer, we can make it an option (symbol layouts have been in a separate overlay from numbers ever since #28).

That said, any attempt to codify a fixed set of letters/symbols is going to lose some flexibility compared to handwriting, and is going to have to make a choice about when different symbols are meaningfully different and when it's "just" an aesthetic rendering choice (or even a pure accident!).

As far as I'm concerned, this is in the same bucket as "should we have separate keys for serif-I, sans serif-I, and a fraktur I?". You might want to render them differently, and that's fine, but it "should be" a property of the font you use, not a fundamental property of the text itself. Just like how some fonts collapse != into a =/= ligature.

yurikhan commented 7 months ago

> it "should be" a property of the font you use, not a fundamental property of the text itself

Disagree. It is practically impossible to craft a programmatic rule that will correctly turn '79 into ‘79 as the beginning of a second-level quotation, or into ’79 as the contraction of 1979.
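As an illustration, here is a deliberately naive "smart quotes" rule in Python. It is a made-up heuristic for this example, not the algorithm of any particular font, editor, or library: a straight apostrophe at the start of the text or after whitespace or an opening bracket becomes an opening quote, and every other one becomes a closing quote.

```python
import re


def naive_smarten(text):
    """Replace straight single quotes using a position-only heuristic."""
    # Opening quote: at start of text, or right after whitespace or ( or [.
    text = re.sub(r"(?:^|(?<=[\s(\[]))'", "\u2018", text)
    # Everything that is left is treated as closing quote / apostrophe.
    return text.replace("'", "\u2019")


# '79 opening a second-level quotation: the heuristic happens to be right.
print(naive_smarten("\u201cHe wrote '79 was the best year' on it\u201d"))

# '79 as a contraction of 1979: same local context, so the same output,
# but here the correct character would be U+2019, not U+2018.
print(naive_smarten("the summer of '79"))
```

Both inputs give the rule identical local context (a space, then `'79`), so it has to pick the same character for both; telling them apart needs the meaning of the sentence, which is why the choice belongs to the writer.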

nightkr commented 7 months ago

Holy mother of god I had to zoom in a lot to even see the difference between those two.

nightkr / flickboard

Typographic punctuation? #34