aadcg / emacs-yeis

Yeis's Emacs' Input Switcher
GNU General Public License v3.0

Translation VS Transliteration #2

Open protesilaos opened 4 years ago

protesilaos commented 4 years ago

Hello @aadcg! As far as I can tell, this package will not translate a word from one language to another, but will transliterate it.

Merriam-Webster defines transliterate:

to represent or spell in the characters of another alphabet

So if I type in "greeklish" geia and call yeis-translate-current-word, it will transliterate it into γεια (so it changes the alphabet).

Is this the case?

aadcg commented 4 years ago

Hello @protesilaos!

While you raise a justified concern, your suggestion isn't quite accurate. I spent some time wondering whether "translation" is the correct word and settled on it. Even so, it still doesn't do justice to what I want to convey. Let's look at this together, shall we?

First, let's see why we can't call it a transliteration. Take any Russian word and assume the usual US QWERTY keyboard layout. For instance, привет transliterates to privet (following the usual transliteration rules between the Cyrillic and Latin alphabets). However, this package switches back and forth between привет and ghbdtn, which is what the same keystrokes produce under the US layout. Here's why your example suggested otherwise.

What misled you is a peculiarity: the Greek keyboard layout coincides with the transliteration table between the Greek and Latin alphabets! I'm not sure whether this holds for every letter of the alphabet. You know better than I do.

Look at it another way. If you used US Dvorak instead of US QWERTY, geia wouldn't transform to γεια.
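To make the привет/ghbdtn example concrete, here is a minimal sketch in Python (used only for illustration; yeis itself is Emacs Lisp built on robin). It assumes the standard US QWERTY and Russian ЙЦУКЕН layouts and covers only the unshifted keys of the three letter rows. The names QWERTY, RUSSIAN, to_russian, and to_latin are hypothetical and not part of yeis.

```python
# Characters produced by the same physical keys, listed row by row.
# Assumption: standard US QWERTY and Russian ЙЦУКЕН layouts, unshifted keys only.
QWERTY = "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,."
RUSSIAN = "йцукенгшщзхъ" "фывапролджэ" "ячсмитьбю"

# One table per direction: the second is the inverse of the first.
to_russian = str.maketrans(QWERTY, RUSSIAN)
to_latin = str.maketrans(RUSSIAN, QWERTY)

print("привет".translate(to_latin))    # ghbdtn
print("ghbdtn".translate(to_russian))  # привет
```

Swapping a different Latin layout in for QWERTY changes the table, which is the point of the Dvorak remark above.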

Let me introduce a basic math concept which will make our life easier (hopefully).

A bijection (also called a one-to-one correspondence) is a map between two sets that pairs each element of one set with exactly one element of the other, leaving none unpaired. Two sets related by a bijection therefore have the same cardinality (the same number of elements).

Say we have the set A = {1, 2}. We could define a bijection f: A -> A such that 1 gets mapped to 2, and 2 gets mapped to 1. Another option would be to send each element to itself (the trivial case).

Notice that a bijection is always invertible. Informally, you can apply the transformation from left to right or from right to left and you'll still get a map.
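The two maps on A = {1, 2} described above can be checked mechanically. The following is a small Python sketch; is_bijection and invert are hypothetical helper names introduced only for this illustration.

```python
def is_bijection(mapping, domain, codomain):
    """True if `mapping` is defined on all of `domain` and hits every
    element of `codomain` exactly once."""
    return (set(mapping) == set(domain)
            and sorted(mapping.values()) == sorted(codomain))

def invert(mapping):
    """Reverse every pair; this is only a map again when `mapping` is a bijection."""
    return {value: key for key, value in mapping.items()}

A = {1, 2}
swap = {1: 2, 2: 1}       # 1 -> 2, 2 -> 1
identity = {1: 1, 2: 2}   # the trivial case
constant = {1: 1, 2: 1}   # not a bijection: 2 is never reached

print(is_bijection(swap, A, A))      # True
print(is_bijection(identity, A, A))  # True
print(is_bijection(constant, A, A))  # False
print(invert(swap) == swap)          # True
```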

With this language, let's see what yeis does.

What are the sets? The sets are composed of the symbols belonging to a given input method (usually the lower case and upper case letters of a certain alphabet, punctuation marks, numbers, delimiters, etc).

How do we define the mapping? Well, that depends on what we're trying to achieve, and here lies the generality of yeis: it can do whatever you want, since it's very easy to define this bijection. For instance, we could define a mapping that simply maps the Latin letter Y to Z. As you know, some people use QWERTZ instead of QWERTY. So yes would be transformed to zes and vice versa. Whether yeis performs the bijection or its inverse (y -> z or z -> y) is contingent on the selected input method.
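As a rough sketch of the Y/Z example, written in Python for illustration only: the table SWAP_Y_Z and the function transform are hypothetical, and this is not how yeis or robin define their rules.

```python
# Hypothetical QWERTY <-> QWERTZ rule: only Y and Z trade places,
# every other character is left untouched.
SWAP_Y_Z = str.maketrans("yzYZ", "zyZY")

def transform(word):
    return word.translate(SWAP_Y_Z)

print(transform("yes"))  # zes
print(transform("zes"))  # yes
```

This particular map happens to be its own inverse, so both directions use the same table. For a pair of distinct alphabets, such as Latin and Greek, the two directions need separate tables.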

This is the thing we're trying to define - "this going back and forth", irrespective of the direction.

Oddly enough, robin (a built-in Emacs library that yeis depends on) makes a distinction between applying the bijection and applying its inverse.

If you go from geia to γεια it's called a conversion, whereas the opposite is called an inversion.

Yeis doesn't make this distinction because the direction is contingent on the selected input method, so both are manifestations of the same thing. That explains why calling yeis-translate-current-word with the cursor to the right of γεια and current-input-method being nil changes nothing. Call the function again and the transformation will happen. Do you understand why? I believe this is the most acceptable behavior.

Going back to the issue at hand. The README says:

Translate text to and from non-CJK input methods;

Which word would you choose? Transformation, perhaps...?

protesilaos commented 4 years ago

Thank you for the detailed explanation!

Since you suggested "transformation", maybe you could write in the README that it is for "translation and transformation", or something along those lines (because it obviously is a fine point, which can be elaborated in a subsequent section).

aadcg commented 4 years ago

What do you make of the following?

1) Transform text as if it had been inserted by any non-CJK input method;
2) Auto set the input method and auto transform text as it's typed (yeis-mode);

A side note. I've been thinking about rewriting this package following a literate programming approach. This is the sort of extension that requires more documentation than elisp code.

As Knuth aptly put it:

Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

protesilaos commented 4 years ago

Yes, the proposed text sounds better.

As for the literate program, how would it work with MELPA, should you ever choose to go down that path?

aadcg commented 4 years ago

Oh, I forgot about that. I have no experience with MELPA honestly. I have to do a bit of research there.

Anyway, yeis first needs to prove itself useful to its users.

aadcg commented 4 years ago

@protesilaos, I forgot to mention something.

Following your observation, I renamed one of the core functions: yeis-translate-current-word is now yeis-transform-previous-word. Notice the two changes here: translate -> transform and current -> previous. Such a drastic change certainly breaks any configuration users might have had before. I'm not sure whether I should be handling this in another way.

Please make the necessary changes to your init.el as below after you pull:

(global-set-key (kbd "C-|") 'yeis-transform-previous-word)

Regarding the change from current-word to previous-word: it makes more sense, since the function transforms the word (or words, if called with a prefix) that come before the cursor. Let * denote the cursor, and imagine a situation where we have

foo *bar

When yeis-transform-previous-word is called, it transforms foo, not bar. The word "current" would certainly mislead users. Hopefully this has not tripped you up.
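The "previous word" rule can be sketched as follows. This is an illustrative Python approximation, not the package's code: previous_words_span is a hypothetical name, words are assumed to be whitespace-delimited, count stands in for the prefix argument and is assumed to be at least 1, and what yeis does when the cursor sits in the middle of a word is not covered by the description above.

```python
import re

def previous_words_span(text, cursor, count=1):
    """Return (start, end) covering the last `count` whitespace-delimited
    chunks of the text before `cursor`, or None if there are none."""
    words = list(re.finditer(r"\S+", text[:cursor]))
    if not words:
        return None
    selected = words[-count:]
    return selected[0].start(), selected[-1].end()

text = "foo bar"
cursor = 4  # "foo *bar": the cursor sits just before "bar"
start, end = previous_words_span(text, cursor)
print(text[start:end])  # foo
```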