Respond to Mark's feedback in L2/18-203

srl295 commented 6 years ago

Document: https://www.unicode.org/L2/L2018/18203-coded-hashes.pdf

srl295 commented 6 years ago

See https://gist.github.com/srl295/0512b4d994d764dcb4c8c90e0543a8cc#file-chai-reply-2018-07-23-txt for my reply.

keithw commented 6 years ago

Huh, I did not ever see this reply from Mark Davis until today.

I think he is confused, unfortunately. :-(

However, there was never a complete story on how this could work in practice. In particular, suppose that a program on a mobile phone gets one of these sequences in an email or other document or text. What does it do with it? Where does it get the image from? Is there a registry somewhere? How does that function? Who maintains it? And so on.

The short answer -- and I thought we said this pretty clearly in our presentation back in May 2016, to which we received no reply whatsoever until now -- is that it would work exactly the same as it does today. Today, vendors announce emoji ZWJ sequences or "colon-codes" on their own initiative, and put them in fonts, and hope that other vendors also implement them in their own fonts. They are not part of the ISO 10646 character set, although a small number of vendors have the ability to post their sequences in non-normative fashion on the Unicode website. (Some sequences are not listed, however, e.g. :neckbeard: on GitHub! :neckbeard: )

In a CHAI world, vendors would have a standardized way of naming arbitrary emoji. They would announce them on their own initiative, and put them in fonts, and hope that other vendors also implement them in their own fonts. They would not be part of the ISO 10646 character set.

The only difference is that in a CHAI world, there would be a standardized, unambiguous way to name an arbitrary emoji, with a canonical glyph representation. Nothing else would have to change -- not the means of distributing the fonts, not the means for vendors to decide which emoji to include in their fonts, etc.

So I think the talk of security concerns or whatnot is totally confused (and was not part of the proposal). No vendor has to try to retrieve an arbitrary CHAI sequence from the Internet, any more than a vendor today has to try to load in an arbitrary glyph from a ZWJ sequence from the Internet.

The talk of a registry is also confused. The point of CHAI is there would be no registry. Any vendor could announce a new emoji and, because they are unambiguous, there would not need to be a deconfliction method. Vendors would still have to decide which emojis to include in their fonts; this would be handled (just as it is today) out of band.

The utility of CHAI is somewhat predicated on the UTC deciding they don't want to be in the business of standardizing individual emoji in the ISO 10646 character set (and, to a lesser degree, on their website). If the Unicode Consortium is happy with the status quo and doesn't think they have a problem, I don't think they need CHAI -- to bastardize Upton Sinclair, you can't persuade a man that he shouldn't be a gatekeeper if his paycheck is coming from being a gatekeeper! But I don't want to see confusion linger about what is really a pretty simple proposal (that we made two years ago to essentially zero reply until now).

Do you think we should write up a reply document here...?

srl295 commented 6 years ago

@keithw I did write up a short reply document and submitted it; it is linked in my comment above. ^

> zero feedback until now

There was https://www.unicode.org/L2/L2016/16379-hash-fdbk.html which I opened #10 for - and put some responses there.

> Today, vendors announce emoji ZWJ sequences or "colon-codes" on their own initiative, and put them in fonts, and hope that other vendors also implement them in their own fonts

Yeah, and the point of the proposal was to move this OUT of UTC and to better reflect the current process, as well as to allow decentralized experimentation.

> The talk of a registry is also confused. The point of CHAI is there would be no registry.

It's already somewhat decentralized (in that vendors can support or not support an item.)

> If the Unicode Consortium is happy with the status quo and doesn't think they have a problem, I don't think they need CHAI

I think that's kind of what we have learned over the past couple of years. At least there's something out there that can be built upon.

srl295 commented 6 years ago

@keithw this was posted as https://www.unicode.org/L2/L2018/18251-chai-reply.txt

keithw commented 6 years ago

Well, okay. I think it's one thing if the UTC decides they are fine with the status quo. I may disagree with them that this is an appropriate role for a technical committee that, until this point, has not tried to be (and is not qualified to be) the arbiters of a new lexicon. But if the UTC doesn't think they have a problem, and is getting a lot of outside interest and money from this activity, I'm not surprised we would have trouble selling them a solution. (The reason I participated in this proposal back in 2016 was that you are on the UTC and we seemed to agree on the existence of a problem -- I think I assumed or at least hoped others on the UTC shared your view!)

It's another thing, though, to wait two years and then raise spurious security concerns or suggest that the proposal is a lot more complicated than it really is. In my view, CHAI is basically just a standardized and decentralized way for vendors to name what today we're calling Emoji ZWJ sequences. It doesn't solve any of the issues of emoji distribution to fonts and keyboards (just like today) but it gets the Unicode Consortium out of the business of hosting a list of ZWJ sequences from privileged vendors. From that perspective, we're talking about a small change that solves a small part of the problem.

srl295 commented 6 years ago

@keithw I disagree also with the emphasis on central encoding and will continue to say so. It seems that raising the technical means of encoding to the level of a mention is as far as the proposal gets without any kind of vendor support.

srl295 commented 6 years ago

@keithw perhaps an avenue would be components such as emoji-mart which support custom emoji. Rather than encoding something directly in Unicode, a "repertoire" could be a map from:

| colon name | sha-256 |
| --- | --- |
| `:chai:` | 37d8c5403d29ec7d6f59b02690414de7 |

… then :chai: in some implementation could refer to the hash given, and locate the image via decentralized methods…
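A rough sketch of what such a repertoire might look like, purely as an illustration. The `Repertoire` class, its method names, and the in-memory store are all hypothetical and are not part of emoji-mart or of the CHAI proposal. The sketch assumes the hash is a plain SHA-256 over the image bytes; a full SHA-256 digest is 64 hex digits, so the 32-digit value in the table above is presumably shortened or illustrative.

```python
import hashlib
from typing import Dict, Optional


def content_hash(image_bytes: bytes) -> str:
    """Return the hex SHA-256 digest of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


class Repertoire:
    """Hypothetical map from colon names to image hashes.

    The `store` dict stands in for whatever decentralized lookup an
    implementation might use; here it is just local memory.
    """

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}    # ":chai:" -> hex digest
        self.store: Dict[str, bytes] = {}  # hex digest -> image bytes

    def add(self, colon_name: str, image_bytes: bytes) -> str:
        """Register a colon name for an image and return its digest."""
        digest = content_hash(image_bytes)
        self.names[colon_name] = digest
        self.store[digest] = image_bytes
        return digest

    def resolve(self, colon_name: str) -> Optional[bytes]:
        """Return the image for a colon name, or None if it is unknown,
        missing from the store, or fails the hash check."""
        digest = self.names.get(colon_name)
        if digest is None:
            return None
        image = self.store.get(digest)
        if image is None or content_hash(image) != digest:
            return None
        return image
```

Because the name maps to a hash of the content, any party that obtains the image bytes can check them against the digest without consulting a central registry; where the bytes come from is left open here, as it is in the comment above.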

srl295 / srl-unicode-proposals

Respond to Mark's feedback in L2/18-203 #14