kreativekorp / open-relay

Free and open source fonts from Kreative Software

Request: Sinclair Basic tokens #67

Closed oshaboy closed 1 year ago

oshaboy commented 1 year ago

The ZX80/81 and ZX Spectrum reserve character codes in their character sets for entire BASIC tokens in order to save some of their limited memory, so there are character codes for full strings like IF, FOR, GO TO, etc. Of course, those can be encoded as just a sequence of ASCII characters, but then there is no way to distinguish between 0xEC and the sequence 0x47 0x4F 0x20 0x54 0x4F 0x20 in the translated output, which makes two-way translation impossible.
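To illustrate the ambiguity, here is a minimal Python sketch. The one-entry table and the `naive_expand` helper are hypothetical; only the 0xEC mapping (with the trailing space from the byte sequence above) comes from this comment, and bytes outside the table are assumed to be plain ASCII.

```python
# Hypothetical one-entry token table; 0xEC -> "GO TO " is the only mapping
# taken from the comment (0x47 0x4F 0x20 0x54 0x4F 0x20).
TOKENS = {0xEC: "GO TO "}


def naive_expand(data: bytes) -> str:
    """Expand token bytes to their ASCII spelling; pass other bytes through."""
    return "".join(TOKENS.get(b, chr(b)) for b in data)


tokenized = bytes([0xEC])   # a single token byte
spelled_out = b"GO TO "     # the same text typed character by character

# Both inputs produce identical text, so the original bytes cannot be recovered.
print(naive_expand(tokenized) == naive_expand(spelled_out))
```

Because the two inputs collapse to the same string, no decoder can tell which one it started from.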

Many BASIC dialects used the extended characters for tokens, but not in a way as fundamental to how the computer displays characters as on the Sinclair machines. On the ZX Spectrum, even in machine code programs, if you tried displaying the character 0xEC on the screen, it would instead display the string GO TO. (Edit: Turns out the ZX81 doesn't.)

Considering the font supports so many "Legacy Computing" symbols, including ones that aren't supported by Unicode, I think this is feasible.

oshaboy commented 1 year ago

To clarify, I understand if "round-trip-safe, unambiguous translation of BASIC tokens" is out of scope for the font. I just wanted to ask.

RebeccaRGB commented 1 year ago

The recommended way to represent these is with the ASCII characters and zero-width joiners. So 0xEC would be represented with U+0047+200D+004F+200D+0020+200D+0054+200D+004F. The ZWJs make this round-trippable, and since the ZWJ is zero-width, it already appears correctly.
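A rough Python sketch of that round trip follows. The `encode` and `decode` helpers and the one-entry table are hypothetical names for illustration, not part of the font or any converter; only the 0xEC mapping and the ZWJ sequence come from this thread. It also assumes every byte outside the table is plain ASCII, which is a simplification of the real Sinclair character sets.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Hypothetical one-entry table; 0xEC -> "GO TO" is the only mapping from the thread.
TOKENS = {0xEC: "GO TO"}

# Token byte -> spelling with a ZWJ between every pair of characters.
ENCODED = {code: ZWJ.join(text) for code, text in TOKENS.items()}
# Longest sequences first, so a longer token is never shadowed by a shorter one.
DECODE = sorted(((s, c) for c, s in ENCODED.items()), key=lambda p: -len(p[0]))


def encode(data: bytes) -> str:
    """Translate bytes to text, writing tokens as ZWJ-joined ASCII."""
    # Assumption: bytes that are not in the table are plain ASCII.
    return "".join(ENCODED.get(b, chr(b)) for b in data)


def decode(text: str) -> bytes:
    """Translate text back to bytes, turning ZWJ-joined runs into tokens."""
    out = bytearray()
    i = 0
    while i < len(text):
        for joined, code in DECODE:
            if text.startswith(joined, i):
                out.append(code)
                i += len(joined)
                break
        else:
            out.append(ord(text[i]))  # ordinary character, copied as-is
            i += 1
    return bytes(out)


program = bytes([0x31, 0x30, 0x20, 0xEC, 0x35])  # "10 ", GO TO token, "5"
print(decode(encode(program)) == program)
print(decode(encode(b"GO TO")) == b"GO TO")
```

The token and the typed-out text now map to different strings, so both survive the trip back to bytes, while the rendered text looks the same because the joiners have no width.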

oshaboy commented 1 year ago

I see, thanks for the clarification