dsw / proquint

Proquints: Identifiers that are Readable, Spellable, and Pronounceable.
http://github.com/dsw/proquint

Feature: Proquint for other European languages #12

Open DonaldTsang opened 5 years ago

DonaldTsang commented 5 years ago

Some design choices:

  1. The sounds from these languages should map closely to those of the English version
    • If that is impossible, replace a sound that exists only in English with a unique sound in the other language
  2. Each sound should be exactly one Unicode character (so no double consonants and no double vowels)
  3. Sound changes that depend on whether a letter is at the front or the back of a word need to be accounted for

dsw commented 5 years ago

This is not going to work. What you propose is a severe instance of "creeping feature-ism": https://en.wikipedia.org/wiki/Feature_creep

The whole point of Proquints is to be first and foremost an encoding for computer data, and, as such, we do not want it changing between localities, just as we do not change (substantially) the notation for numerical digits between localities. That is, while we want to encode data in a way that is both (1) computer-friendly and (2) human-friendly, purpose (1) has priority over purpose (2). (It must be this way, as both purposes cannot be simultaneously satisfied as much as people might want.) As such, Proquints must be encoded/decoded from binary in a universal, simple, and efficient manner. Proquints are expressed in manageable chunks of information that can be easily said (somehow) and are therefore easily remembered, but only to the degree that the first purpose, that of a canonical data encoding, not be compromised.
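To make the "universal, simple, and efficient" encoding concrete, here is a minimal Python sketch of the 16-bit scheme as the Proquints paper describes it: 4 bits per consonant, 2 bits per vowel, in consonant-vowel-consonant-vowel-consonant order. The two letter tables come from the paper; the function names `encode_word`, `decode_word`, and `encode_u32` are hypothetical and are not taken from the repository.

```python
# Letter tables from the Proquints paper.
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def encode_word(word: int) -> str:
    """Encode one 16-bit value as a five-letter quint (con-vowel-con-vowel-con)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return (
        CONSONANTS[(word >> 12) & 0xF]
        + VOWELS[(word >> 10) & 0x3]
        + CONSONANTS[(word >> 6) & 0xF]
        + VOWELS[(word >> 4) & 0x3]
        + CONSONANTS[word & 0xF]
    )


def decode_word(quint: str) -> int:
    """Decode a five-letter quint back to its 16-bit value."""
    if len(quint) != 5:
        raise ValueError("a quint has exactly five letters")
    c1, v1, c2, v2, c3 = quint
    return (
        (CONSONANTS.index(c1) << 12)
        | (VOWELS.index(v1) << 10)
        | (CONSONANTS.index(c2) << 6)
        | (VOWELS.index(v2) << 4)
        | CONSONANTS.index(c3)
    )


def encode_u32(value: int) -> str:
    """Encode a 32-bit value as two quints joined by a hyphen."""
    return encode_word((value >> 16) & 0xFFFF) + "-" + encode_word(value & 0xFFFF)


# 127.0.0.1 as a 32-bit integer
print(encode_u32(0x7F000001))  # lusab-babad
```

Because each letter stands for a fixed bit pattern regardless of position or neighbours, decoding needs no language-specific rules; that is the property a localized variant would have to preserve.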

Proquints is not the International Phonetic Alphabet. Like it or not, the universal format of computer data is Latin/ASCII alphabet. Further, we do not have universal pronunciation even within English, much less across all languages: I once met an American, an Australian, and a Brit who had such heavy accents that the Brit had to translate between the other two in English. Attempting to cater to more languages in any way whatsoever is unworkable, even ones having Latin alphabets. For example, in Hungarian "s" is pronounced like English "sh" and Hungarian "sz" is pronounced like English "s"; even crazier, Hungarian "dzs" is pronounced like English "j" (as in "jungle", which Hungarians actually spell starting with "dzs"). Therefore user(s) of Proquints must choose a pronunciation if Proquints are to be spoken between them, or even just if remembered verbally (rather than visually) by a single person over time.

Again, I never said how you are to pronounce the given letters; you really just need a distinct and context-free pronunciation for each. I recommend the Italian pronunciation for the vowels, as singers use when singing, since they are pure and distinct and sound nice. For the consonants, I recommend picking something simple for you, such as perhaps an exaggerated pronunciation of American English, and checking that you have chosen a distinct and context-free pronunciation for each letter. Again, remember to pronounce the letters in a context-free manner, independent of any special rules for words that may arise: for example, "laser" is "la s er", not "laZ er".

DonaldTsang commented 5 years ago

@dsw in that case, is it possible to create one just for the Greek or Cyrillic alphabets, where the characters are clearly distinct from the Latin alphabet? The reason I find this feature "useful" is that I am Asian, with Cantonese and English as first languages, and I believe that the more languages a binary encoding supports, the more useful it is to a general audience. If I am allowed to be idiotic, I would also include Hangul and Kana, but since Hangul is CVC and Kana is CV only, I don't think they would map well.

For Cyrillic, these conlangs can give some pointers

dsw commented 5 years ago

I am not about to do this. If you do, please do not call it Proquints, but if it is the same idea, then in your paper please do provide a citation to the original Proquints paper.


DonaldTsang commented 5 years ago

@dsw it is the same idea, but I would make it compatible with Proquints as a hobby/side project. I am in no way academically accredited to do papers, I am a mere student who cares about accessibility. Is "Proquints" the name under trademark or whatever? I don't see why I can't call it "Proquints-based".

dsw commented 5 years ago

There is no such thing as being "academically accredited to do papers". If you make a version of Proquints for other languages/alphabets/writing-systems,

(1) do not call it Proquints and

(2) provide a citation/link back to the original Proquints paper and github repo: https://arxiv.org/html/0901.4016 and https://github.com/dsw/proquint


DonaldTsang commented 5 years ago

I would name it "propente" for Greek and "propyat"/"propet" for Cyrillic for the distinction.

> There is no such thing as being "academically accredited to do papers".

But there is: not everyone can publish papers through universities or journals, especially people who are self-taught.

DonaldTsang commented 5 years ago

Here is my first draft of the Propyat:

Consonants = "БДФГХЖКЛМНПРСТВЗ"
Vowels = "АИОУ"
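As a rough sketch of how this draft could stay compatible with Proquints, the Python below encodes a 16-bit value with the Cyrillic tables and transliterates the result to the Latin letters. It assumes the draft tables line up position for position with the Latin tables "bdfghjklmnprstvz" and "aiou" from the Proquints paper (the ordering suggests this, but it is an assumption), and the names `encode_word_cyr` and `to_latin` are hypothetical.

```python
# Draft Propyat tables from this thread.
CYR_CONSONANTS = "БДФГХЖКЛМНПРСТВЗ"
CYR_VOWELS = "АИОУ"

# Latin tables from the Proquints paper.
# Assumption: index i in a Cyrillic table corresponds to index i in the Latin one.
LAT_CONSONANTS = "bdfghjklmnprstvz"
LAT_VOWELS = "aiou"

# Every symbol must be a single, distinct code point (16 consonants, 4 vowels).
assert len(CYR_CONSONANTS) == len(set(CYR_CONSONANTS)) == 16
assert len(CYR_VOWELS) == len(set(CYR_VOWELS)) == 4

# Character-for-character transliteration table, Cyrillic to Latin.
CYR_TO_LAT = str.maketrans(
    CYR_CONSONANTS + CYR_VOWELS, LAT_CONSONANTS + LAT_VOWELS
)


def encode_word_cyr(word: int) -> str:
    """Encode one 16-bit value as con-vowel-con-vowel-con using the draft tables."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return (
        CYR_CONSONANTS[(word >> 12) & 0xF]
        + CYR_VOWELS[(word >> 10) & 0x3]
        + CYR_CONSONANTS[(word >> 6) & 0xF]
        + CYR_VOWELS[(word >> 4) & 0x3]
        + CYR_CONSONANTS[word & 0xF]
    )


def to_latin(quint: str) -> str:
    """Transliterate a draft Cyrillic quint to the corresponding Latin letters."""
    return quint.translate(CYR_TO_LAT)


quint = encode_word_cyr(0x7F00)
print(quint)            # ЛУСАБ
print(to_latin(quint))  # lusab
```

Under that assumption the two forms carry the same bits, so a tool could accept either script and convert between them without touching the underlying binary encoding.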