39aldo39 / klfc

Keyboard Layout Files Creator
GNU General Public License v3.0

Support U+1234 notation for Unicode symbols? #24

Closed: kindaro closed this issue 2 years ago

kindaro commented 4 years ago

It is customary to denote Unicode characters by U+[character code], but it seems klfc does not support this notation:

% klfc --from-json x --xkb y
klfc: parse fail in x: Error in $.keys[0].letters[0]: ‘U+2002’ is not a valid letter

Note that the character in this example is a kind of space, so I would rather not insert it verbatim: it would hardly be clear to the reader which kind of space it is.

I also propose a flag that makes klfc write JSON files in the U+... format. Most fonts only support a narrow range of characters, so in many cases the more unusual characters would not be displayed in any meaningful way.

39aldo39 commented 4 years ago

KLFC uses a normal JSON file, so you can use the syntax "\u2002" already.
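
To illustrate the point about JSON escapes, here is a minimal Python sketch (not part of KLFC) showing that a JSON parser decodes `\u2002` into the actual character before the application sees it. The `keys`/`letters` shape is taken from the path in the error message above; the rest of the layout schema is an assumption.

```python
import json
import unicodedata

# A layout fragment shaped after the path in the error message
# ($.keys[0].letters[0]); the exact KLFC schema is assumed here.
source = r'{"keys": [{"letters": ["\u2002"]}]}'

layout = json.loads(source)
letter = layout["keys"][0]["letters"][0]

# The parser has already turned the escape into a single character,
# so the original spelling of the escape is no longer available.
code_point = f"U+{ord(letter):04X}"
name = unicodedata.name(letter)
print(code_point, name)
```

The escape keeps the file readable in an editor, but the code point is written as `\u2002` rather than in the customary `U+2002` form.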

kindaro commented 4 years ago

I find it to be somewhat «wrong» to let the format of serialization define the ways in which I may or may not define a symbol, for the following reasons:

I am sure this change is technically feasible. If someone were to make it, would you merge?

39aldo39 commented 4 years ago

I don't think it is necessarily wrong to let the format decide this, but I understand that the intention may be lost, as most parsers throw away that information. However, it is also not very elegant to basically invent your own escape sequences. For example, ligatures can currently be written as lig:U+2002, which outputs the literal string "U+2002". If other notations were also allowed, this would become ambiguous. I don't know of a nice solution for that.
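
The ambiguity can be made concrete with a small Python sketch. The functions `read_letter` and `read_ligature` are hypothetical and do not reflect how KLFC parses its input; they only show that the same text `lig:U+2002` yields two different results depending on whether the U+ notation is decoded inside ligatures.

```python
import re

# Hypothetical pattern for the U+XXXX notation (4 to 6 hex digits).
CODE_POINT = re.compile(r"U\+([0-9A-Fa-f]{4,6})")

def read_letter(text: str) -> str:
    """Hypothetical reader: decode 'U+XXXX', pass other text through."""
    match = CODE_POINT.fullmatch(text)
    if match:
        value = int(match.group(1), 16)
        if value <= 0x10FFFF:  # highest valid Unicode code point
            return chr(value)
    return text

def read_ligature(text: str, decode: bool) -> str:
    """Read 'lig:...' either literally or with U+XXXX decoding."""
    body = text[len("lig:"):]
    return read_letter(body) if decode else body

literal = read_ligature("lig:U+2002", decode=False)  # the six characters U+2002
decoded = read_ligature("lig:U+2002", decode=True)   # a single EN SPACE
print(repr(literal), repr(decoded))
```

Whichever reading a tool picks, a user who wants the other one needs some additional way to say so.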

39aldo39 commented 2 years ago

I have now added explicit support for Unicode characters!

kindaro commented 2 years ago

I do observe that this feature works.

If in the future we want to make sure that ligatures can include strings that resemble the notation for Unicode code points, we can allow the specification of a key to include not only strings but also objects such as {"type": "ligature", "contents": "U+2002"}, which would output the literal string "U+2002", and even {"type": "ligature", "contents": ["U+2245", " is for isomorphose"]}, which would output the literal string "≅ is for isomorphose".
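
One possible reading of this proposal, sketched in Python: a string under "contents" is always literal, while a list is decoded item by item. The function `render_ligature` and the rule that only list items are decoded are assumptions made for illustration, not an existing KLFC feature.

```python
import re

# Hypothetical pattern for the U+XXXX notation (4 to 6 hex digits).
CODE_POINT = re.compile(r"U\+([0-9A-Fa-f]{4,6})")

def render_ligature(spec: dict) -> str:
    """Turn a proposed ligature object into its output string."""
    if spec.get("type") != "ligature":
        raise ValueError("not a ligature object")
    contents = spec["contents"]
    if isinstance(contents, str):
        # A bare string is taken literally, even if it looks like U+XXXX.
        return contents
    parts = []
    for item in contents:
        # Inside a list, items in U+XXXX form are decoded (assumed rule).
        match = CODE_POINT.fullmatch(item)
        parts.append(chr(int(match.group(1), 16)) if match else item)
    return "".join(parts)

plain = render_ligature({"type": "ligature", "contents": "U+2002"})
mixed = render_ligature(
    {"type": "ligature", "contents": ["U+2245", " is for isomorphose"]}
)
print(plain)
print(mixed)
```

Under this reading the object form removes the ambiguity, because the structure of the value, not its spelling, decides whether decoding applies.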