microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License

Quick pick: Support alternate representation of French/European characters #196733

Open Tyriar opened 10 months ago

Tyriar commented 10 months ago

This should also match tache:

Image

See https://github.com/microsoft/vscode/issues/196203 which this would be built upon.

cc @aiday-mar

aiday-mar commented 10 months ago

For French, I'd say we can treat the following as equivalent:

â, à -> a
ô -> o
ï, î -> i
é, è, ê, ë -> e
ç -> c
œ -> oe
ù, û -> u

I think people often leave the accents out to type faster.
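
The French mapping could be sketched as a folding step applied to both the query and the label before they are compared. This is only a minimal sketch, not how quick pick works today: `foldFrench` and `LIGATURES` are hypothetical names, and it assumes that Unicode NFD decomposition followed by stripping the combining marks in U+0300 to U+036F is acceptable for the accented letters. The ligature œ has no canonical decomposition, so it is expanded by hand to `oe` (which turns `œu` into `oeu`).

```typescript
// Hypothetical table: ligatures have no canonical decomposition,
// so they need an explicit expansion.
const LIGATURES: Record<string, string> = { 'œ': 'oe', 'Œ': 'OE', 'æ': 'ae', 'Æ': 'AE' };

// Hypothetical helper: fold French accented characters to unaccented ones.
function foldFrench(input: string): string {
    return input
        // Expand ligatures first; fall back to the character itself if unmapped.
        .replace(/[œŒæÆ]/g, ch => LIGATURES[ch] ?? ch)
        // NFD splits 'â' into 'a' + U+0302, 'ç' into 'c' + U+0327, and so on.
        .normalize('NFD')
        // Drop the combining diacritical marks left behind by the decomposition.
        .replace(/[\u0300-\u036f]/g, '');
}

console.log(foldFrench('tâche')); // tache
console.log(foldFrench('cœur'));  // coeur
```

Folding both sides keeps the comparison symmetric, so typing `tâche` still finds a label spelled `tache`.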

aiday-mar commented 10 months ago

For Russian, I would say the following pairs are similar and can be interchanged, either for quicker typing or by mistake:

ё, е
щ, ш
ъ, ь
й, и

cc @ulugbekna do you think anything should be added to this list, or are some of these pairs not close enough?
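
If these Russian pairs were adopted, one way to apply them would be an explicit lookup table, since most of them are distinct letters rather than accented variants. The sketch below assumes the four pairs from this comment and folds the first letter of each pair onto the second; `RUSSIAN_FOLD`, `foldRussian` and `looselyEqual` are hypothetical names, and the choice of which letter is the canonical one is arbitrary.

```typescript
// Hypothetical table built from the pairs listed above: ё/е, щ/ш, ъ/ь, й/и.
const RUSSIAN_FOLD: Record<string, string> = { 'ё': 'е', 'щ': 'ш', 'ъ': 'ь', 'й': 'и' };

// Hypothetical helper: fold each letter onto its look-alike partner.
function foldRussian(input: string): string {
    return input
        // NFC makes sure 'ё' and 'й' arrive as single precomposed code points.
        .normalize('NFC')
        // Lowercase so that the table only needs lowercase entries.
        .toLowerCase()
        .replace(/[ёщъй]/g, ch => RUSSIAN_FOLD[ch] ?? ch);
}

// Two strings match loosely when their folded forms are identical.
function looselyEqual(a: string, b: string): boolean {
    return foldRussian(a) === foldRussian(b);
}

console.log(foldRussian('ёлка'));        // елка
console.log(looselyEqual('ещё', 'еще')); // true
```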

rzhao271 commented 10 months ago

For Chinese, it seems like the IME already suggests English words for you. I'm unable to get a good example because the IME on my Windows device is busted and Google Pinyin doesn't seem to exist as a standalone app anymore, so here's an example picture of a mobile IME suggesting the English string "yk" along with certain characters:

Cangjie mobile example

Tyriar commented 10 months ago

@rzhao271 I think for Chinese and Japanese Kanji (Hiragana and Katakana could work) this isn't really possible due to the number of characters. It would make sense to translate input characters in a desktop Chinese IME into their QWERTY alternatives, in case you accidentally started typing in the wrong keyboard mode, but again we can only really map a handful of characters and/or a Unicode code range.
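
As an illustration of what mapping "a Unicode code range" could look like, the fullwidth forms U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from ASCII U+0021 to U+007E, so they can be folded arithmetically without a table. Whether this is the range meant here is an assumption, and `foldFullwidth` is a hypothetical name.

```typescript
// Hypothetical helper: map fullwidth ASCII variants back to plain ASCII.
function foldFullwidth(input: string): string {
    return input
        // U+FF01..U+FF5E mirror U+0021..U+007E at a constant offset of 0xFEE0.
        .replace(/[\uff01-\uff5e]/g, ch => String.fromCharCode(ch.charCodeAt(0) - 0xfee0))
        // The ideographic space U+3000 is the fullwidth counterpart of ' '.
        .replace(/\u3000/g, ' ');
}

console.log(foldFullwidth('\uff56\uff53\uff43\uff4f\uff44\uff45')); // vscode
```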

rzhao271 commented 10 months ago

What I'm saying is that Chinese and Japanese should be out of scope anyway with a good IME, because all the Chinese IMEs I've used suggest English words when you type one out, even while in Chinese input mode.

TylerLeonhardt commented 10 months ago

Calling out what @jrieken mentioned in Slack:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Collator/Collator#options

We may be able to use this for the simpler cases.
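
A rough sketch of how `Intl.Collator` might cover those simpler cases: with `sensitivity: 'base'`, strings that differ only in accents or case compare as equal. A collator compares whole strings, so the hypothetical `includesIgnoringAccents` helper below slides a window over the label to get a substring check. It assumes one UTF-16 code unit per letter after NFC normalization, so expansions such as œ versus oe are not handled.

```typescript
// 'base' sensitivity ignores accent and case differences: a = á = A.
const collator = new Intl.Collator('fr', { sensitivity: 'base' });

// Hypothetical helper: accent-insensitive substring check using a sliding window.
function includesIgnoringAccents(label: string, query: string): boolean {
    const haystack = label.normalize('NFC');
    const needle = query.normalize('NFC');
    for (let i = 0; i + needle.length <= haystack.length; i++) {
        // Compare a window of the label against the query at base strength.
        if (collator.compare(haystack.slice(i, i + needle.length), needle) === 0) {
            return true;
        }
    }
    return false;
}

console.log(includesIgnoringAccents('Ma tâche', 'tache')); // true
console.log(includesIgnoringAccents('Ma tâche', 'tasse')); // false
```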

Tyriar commented 10 months ago

Yep, I think Collator will cover the Latin cases; not sure about Cyrillic.


rzhao271 commented 10 months ago

For Russian, I've heard of ё being written as just е, but I'm unsure about the rest. I'm also unsure whether that's because of the JCUKEN layout (https://en.wikipedia.org/wiki/JCUKEN), where ё is way off in the corner. 'ё'.localeCompare('е', undefined, { sensitivity: 'accent' }) returns 1, and 'ё'.localeCompare('е', 'ru', { sensitivity: 'accent' }) also returns 1, though.
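
To see which of the proposed Cyrillic pairs a collator would treat as equal, a small probe along these lines could print the comparison result for each pair under both `'base'` and `'accent'` sensitivity. `PAIRS` and `probePairs` are hypothetical names; the pairs are the ones suggested earlier in the thread, and the output is deliberately not asserted here because it depends on the collation data shipped with the runtime.

```typescript
// Pairs proposed earlier in the thread; whether they should match is still open.
const PAIRS: Array<[string, string]> = [['ё', 'е'], ['щ', 'ш'], ['ъ', 'ь'], ['й', 'и']];

// Hypothetical probe: compare each pair under both sensitivities for one locale.
// A result of 0 means the collator treats the two letters as equal.
function probePairs(locale: string): Array<{ pair: string; base: number; accent: number }> {
    return PAIRS.map(([a, b]) => ({
        pair: `${a}/${b}`,
        base: a.localeCompare(b, locale, { sensitivity: 'base' }),
        accent: a.localeCompare(b, locale, { sensitivity: 'accent' }),
    }));
}

for (const row of probePairs('ru')) {
    console.log(row.pair, 'base:', row.base, 'accent:', row.accent);
}
```

Any pair that still reports a non-zero `base` value would need an explicit mapping rather than a collator option.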