picocomputer / rp6502

Picocomputer 6502 firmware
BSD 3-Clause "New" or "Revised" License

Support non-US keyboards #8

Open rumbledethumps opened 1 year ago

rumbledethumps commented 1 year ago

The very simple HID_KEYCODE_TO_ASCII table from TinyUSB is currently used. We can do better.

One approach is to create HID_KEYCODE_TOUNICODE(LOCALE) tables, then use FatFs ff_uni2oem() to convert Unicode to the current code page.
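
A rough sketch of that two-stage idea, for discussion only. The table name `KEYCODE_TO_UNICODE_DE`, the helpers `uni_to_cp850()` and `key_to_oem()`, and the handful of entries are made up for illustration. In the firmware the second stage would be FatFs `ff_uni2oem()`; the hand-written switch here only stands in for it so the example compiles on its own.

```c
#include <stdint.h>

/* Hypothetical excerpt of a per-locale table: HID keycode -> Unicode
 * code point, columns are {unshifted, shifted}. Only a few German
 * entries are filled in; every other entry is 0 (no character). */
static const uint16_t KEYCODE_TO_UNICODE_DE[128][2] = {
    [0x04] = {'a', 'A'},
    [0x1c] = {'z', 'Z'},       /* Y and Z are swapped on German layouts */
    [0x1d] = {'y', 'Y'},
    [0x2f] = {0x00FC, 0x00DC}, /* ü, Ü */
    [0x33] = {0x00F6, 0x00D6}, /* ö, Ö */
    [0x34] = {0x00E4, 0x00C4}, /* ä, Ä */
};

/* Stand-in for ff_uni2oem(), limited to code page 850 and to the
 * characters used above. Returns 0 when there is no mapping. */
static uint8_t uni_to_cp850(uint16_t uni)
{
    if (uni < 0x80)
        return (uint8_t)uni; /* ASCII is identical in cp850 */
    switch (uni) {
    case 0x00FC: return 0x81; /* ü */
    case 0x00E4: return 0x84; /* ä */
    case 0x00C4: return 0x8E; /* Ä */
    case 0x00F6: return 0x94; /* ö */
    case 0x00D6: return 0x99; /* Ö */
    case 0x00DC: return 0x9A; /* Ü */
    default:     return 0;
    }
}

/* Stage 1: keycode -> Unicode. Stage 2: Unicode -> code page byte. */
static uint8_t key_to_oem(uint8_t keycode, int shift)
{
    if (keycode >= 128)
        return 0;
    return uni_to_cp850(KEYCODE_TO_UNICODE_DE[keycode][shift ? 1 : 0]);
}
```

The point of the split is that the keyboard tables stay code-page independent; only the second stage changes when a different code page is selected.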

ulften commented 1 year ago

I can try to help with testing this, but will in turn need help setting up a test environment. I have not yet built a RIA or VGA image myself. /Ulf

Technikfreak2002 commented 1 year ago

To do something useful, I made a German keymap. It covers ASCII chars only.

#define HID_KEYCODE_TO_GR    \
    {0     , 0      }, /* 0x00 */ \
    {0     , 0      }, /* 0x01 */ \
    {0     , 0      }, /* 0x02 */ \
    {0     , 0      }, /* 0x03 */ \
    {'a'   , 'A'    }, /* 0x04 */ \
    {'b'   , 'B'    }, /* 0x05 */ \
    {'c'   , 'C'    }, /* 0x06 */ \
    {'d'   , 'D'    }, /* 0x07 */ \
    {'e'   , 'E'    }, /* 0x08 */ \
    {'f'   , 'F'    }, /* 0x09 */ \
    {'g'   , 'G'    }, /* 0x0a */ \
    {'h'   , 'H'    }, /* 0x0b */ \
    {'i'   , 'I'    }, /* 0x0c */ \
    {'j'   , 'J'    }, /* 0x0d */ \
    {'k'   , 'K'    }, /* 0x0e */ \
    {'l'   , 'L'    }, /* 0x0f */ \
    {'m'   , 'M'    }, /* 0x10 */ \
    {'n'   , 'N'    }, /* 0x11 */ \
    {'o'   , 'O'    }, /* 0x12 */ \
    {'p'   , 'P'    }, /* 0x13 */ \
    {'q'   , 'Q'    }, /* 0x14 */ \
    {'r'   , 'R'    }, /* 0x15 */ \
    {'s'   , 'S'    }, /* 0x16 */ \
    {'t'   , 'T'    }, /* 0x17 */ \
    {'u'   , 'U'    }, /* 0x18 */ \
    {'v'   , 'V'    }, /* 0x19 */ \
    {'w'   , 'W'    }, /* 0x1a */ \
    {'x'   , 'X'    }, /* 0x1b */ \
    {'z'   , 'Z'    }, /* 0x1c  GER */ \
    {'y'   , 'Y'    }, /* 0x1d  GER */ \
    {'1'   , '!'    }, /* 0x1e */ \
    {'2'   , '"'    }, /* 0x1f  GER */ \
    {'3'   , 0      }, /* 0x20  GER  '§' */ \
    {'4'   , '$'    }, /* 0x21 */ \
    {'5'   , '%'    }, /* 0x22 */ \
    {'6'   , '&'    }, /* 0x23  GER */ \
    {'7'   , '/'    }, /* 0x24  GER */ \
    {'8'   , '('    }, /* 0x25  GER */ \
    {'9'   , ')'    }, /* 0x26  GER */ \
    {'0'   , '='    }, /* 0x27  GER */ \
    {'\r'  , '\r'   }, /* 0x28 */ \
    {'\x1b', '\x1b' }, /* 0x29 */ \
    {'\b'  , '\b'   }, /* 0x2a */ \
    {'\t'  , '\t'   }, /* 0x2b */ \
    {' '   , ' '    }, /* 0x2c */ \
    {0     , '?'    }, /* 0x2d  GER  'ß' */ \
    {0     , '`'    }, /* 0x2e  GER  '´' */ \
    {'u'   , 'U'    }, /* 0x2f  GER  'ü' + 'Ü'*/ \
    {'+'   , '*'    }, /* 0x30  GER */ \
    {'#'   , '#'    }, /* 0x31  GER */ \
    {'#'   , '~'    }, /* 0x32 */ \
    {'o'   , 'O'    }, /* 0x33  GER  'ö' + 'Ö'*/ \
    {'a'   , 'A'    }, /* 0x34  GER  'ä' + 'Ä'*/ \
    {'^'   , 0      }, /* 0x35  GER  '°' */ \
    {','   , ';'    }, /* 0x36  GER */ \
    {'.'   , ':'    }, /* 0x37  GER */ \
    {'-'   , '_'    }, /* 0x38  GER */ \
                                  \
    {0     , 0      }, /* 0x39 */ \
    {0     , 0      }, /* 0x3a */ \
    {0     , 0      }, /* 0x3b */ \
    {0     , 0      }, /* 0x3c */ \
    {0     , 0      }, /* 0x3d */ \
    {0     , 0      }, /* 0x3e */ \
    {0     , 0      }, /* 0x3f */ \
    {0     , 0      }, /* 0x40 */ \
    {0     , 0      }, /* 0x41 */ \
    {0     , 0      }, /* 0x42 */ \
    {0     , 0      }, /* 0x43 */ \
    {0     , 0      }, /* 0x44 */ \
    {0     , 0      }, /* 0x45 */ \
    {0     , 0      }, /* 0x46 */ \
    {0     , 0      }, /* 0x47 */ \
    {0     , 0      }, /* 0x48 */ \
    {0     , 0      }, /* 0x49 */ \
    {0     , 0      }, /* 0x4a */ \
    {0     , 0      }, /* 0x4b */ \
    {0     , 0      }, /* 0x4c */ \
    {0     , 0      }, /* 0x4d */ \
    {0     , 0      }, /* 0x4e */ \
    {0     , 0      }, /* 0x4f */ \
    {0     , 0      }, /* 0x50 */ \
    {0     , 0      }, /* 0x51 */ \
    {0     , 0      }, /* 0x52 */ \
    {0     , 0      }, /* 0x53 */ \
                                  \
    {'/'   , '/'    }, /* 0x54 */ \
    {'*'   , '*'    }, /* 0x55 */ \
    {'-'   , '-'    }, /* 0x56 */ \
    {'+'   , '+'    }, /* 0x57 */ \
    {'\r'  , '\r'   }, /* 0x58 */ \
    {'1'   , 0      }, /* 0x59 */ \
    {'2'   , 0      }, /* 0x5a */ \
    {'3'   , 0      }, /* 0x5b */ \
    {'4'   , 0      }, /* 0x5c */ \
    {'5'   , '5'    }, /* 0x5d */ \
    {'6'   , 0      }, /* 0x5e */ \
    {'7'   , 0      }, /* 0x5f */ \
    {'8'   , 0      }, /* 0x60 */ \
    {'9'   , 0      }, /* 0x61 */ \
    {'0'   , 0      }, /* 0x62 */ \
    {'.'   , 0      }, /* 0x63 */ \
    {'<'   , '>'    }, /* 0x64  GER */ \
    {0     , 0      }, /* 0x65 */ \
    {0     , 0      }, /* 0x66 */ \
    {'='   , '='    } /* 0x67 */ \

To use this map, copy this define into RIA's hid.h and change line 41 in RIA's hid.c from: KEYCODE_TO_ASCII[128][2] = {HID_KEYCODE_TO_ASCII}; to: KEYCODE_TO_ASCII[128][2] = {HID_KEYCODE_TO_GR};

I will make a HID-to-Unicode map for German keyboards soon.

I hope this is helpful for someone. :P

Technikfreak2002 commented 1 year ago

German keymap with Unicode/UTF-8 chars. But it does not work. Maybe it can be used as a starting point for Unicode support.

#define HID_KEYCODE_TO_GR    \
    {0     , 0      }, /* 0x00 */ \
    {0     , 0      }, /* 0x01 */ \
    {0     , 0      }, /* 0x02 */ \
    {0     , 0      }, /* 0x03 */ \
    {'a'   , 'A'    }, /* 0x04 */ \
    {'b'   , 'B'    }, /* 0x05 */ \
    {'c'   , 'C'    }, /* 0x06 */ \
    {'d'   , 'D'    }, /* 0x07 */ \
    {'e'   , 'E'    }, /* 0x08 */ \
    {'f'   , 'F'    }, /* 0x09 */ \
    {'g'   , 'G'    }, /* 0x0a */ \
    {'h'   , 'H'    }, /* 0x0b */ \
    {'i'   , 'I'    }, /* 0x0c */ \
    {'j'   , 'J'    }, /* 0x0d */ \
    {'k'   , 'K'    }, /* 0x0e */ \
    {'l'   , 'L'    }, /* 0x0f */ \
    {'m'   , 'M'    }, /* 0x10 */ \
    {'n'   , 'N'    }, /* 0x11 */ \
    {'o'   , 'O'    }, /* 0x12 */ \
    {'p'   , 'P'    }, /* 0x13 */ \
    {'q'   , 'Q'    }, /* 0x14 */ \
    {'r'   , 'R'    }, /* 0x15 */ \
    {'s'   , 'S'    }, /* 0x16 */ \
    {'t'   , 'T'    }, /* 0x17 */ \
    {'u'   , 'U'    }, /* 0x18 */ \
    {'v'   , 'V'    }, /* 0x19 */ \
    {'w'   , 'W'    }, /* 0x1a */ \
    {'x'   , 'X'    }, /* 0x1b */ \
    {'z'   , 'Z'    }, /* 0x1c  GER */ \
    {'y'   , 'Y'    }, /* 0x1d  GER */ \
    {'1'   , '!'    }, /* 0x1e */ \
    {'2'   , '"'    }, /* 0x1f  GER */ \
    {'3'   , '\xa7' }, /* 0x20  GER  '§' */ \
    {'4'   , '$'    }, /* 0x21 */ \
    {'5'   , '%'    }, /* 0x22 */ \
    {'6'   , '&'    }, /* 0x23  GER */ \
    {'7'   , '/'    }, /* 0x24  GER */ \
    {'8'   , '('    }, /* 0x25  GER */ \
    {'9'   , ')'    }, /* 0x26  GER */ \
    {'0'   , '='    }, /* 0x27  GER */ \
    {'\r'  , '\r'   }, /* 0x28 */ \
    {'\x1b', '\x1b' }, /* 0x29 */ \
    {'\b'  , '\b'   }, /* 0x2a */ \
    {'\t'  , '\t'   }, /* 0x2b */ \
    {' '   , ' '    }, /* 0x2c */ \
    {'\xdf', '?'    }, /* 0x2d  GER  'ß' */ \
    {'\xb4', '`'    }, /* 0x2e  GER  '´' */ \
    {'\xfc', '\xdc' }, /* 0x2f  GER  'ü' + 'Ü'*/ \
    {'+'   , '*'    }, /* 0x30  GER */ \
    {'#'   , '#'    }, /* 0x31  GER */ \
    {'#'   , '~'    }, /* 0x32 */ \
    {'\xf6', '\xd6' }, /* 0x33  GER  'ö' + 'Ö'*/ \
    {'\xe4', '\xc4' }, /* 0x34  GER  'ä' + 'Ä'*/ \
    {'^'   , '\xb0' }, /* 0x35  GER  '°' */ \
    {','   , ';'    }, /* 0x36  GER */ \
    {'.'   , ':'    }, /* 0x37  GER */ \
    {'-'   , '_'    }, /* 0x38  GER */ \
                                  \
    {0     , 0      }, /* 0x39 */ \
    {0     , 0      }, /* 0x3a */ \
    {0     , 0      }, /* 0x3b */ \
    {0     , 0      }, /* 0x3c */ \
    {0     , 0      }, /* 0x3d */ \
    {0     , 0      }, /* 0x3e */ \
    {0     , 0      }, /* 0x3f */ \
    {0     , 0      }, /* 0x40 */ \
    {0     , 0      }, /* 0x41 */ \
    {0     , 0      }, /* 0x42 */ \
    {0     , 0      }, /* 0x43 */ \
    {0     , 0      }, /* 0x44 */ \
    {0     , 0      }, /* 0x45 */ \
    {0     , 0      }, /* 0x46 */ \
    {0     , 0      }, /* 0x47 */ \
    {0     , 0      }, /* 0x48 */ \
    {0     , 0      }, /* 0x49 */ \
    {0     , 0      }, /* 0x4a */ \
    {0     , 0      }, /* 0x4b */ \
    {0     , 0      }, /* 0x4c */ \
    {0     , 0      }, /* 0x4d */ \
    {0     , 0      }, /* 0x4e */ \
    {0     , 0      }, /* 0x4f */ \
    {0     , 0      }, /* 0x50 */ \
    {0     , 0      }, /* 0x51 */ \
    {0     , 0      }, /* 0x52 */ \
    {0     , 0      }, /* 0x53 */ \
                                  \
    {'/'   , '/'    }, /* 0x54 */ \
    {'*'   , '*'    }, /* 0x55 */ \
    {'-'   , '-'    }, /* 0x56 */ \
    {'+'   , '+'    }, /* 0x57 */ \
    {'\r'  , '\r'   }, /* 0x58 */ \
    {'1'   , 0      }, /* 0x59 */ \
    {'2'   , 0      }, /* 0x5a */ \
    {'3'   , 0      }, /* 0x5b */ \
    {'4'   , 0      }, /* 0x5c */ \
    {'5'   , '5'    }, /* 0x5d */ \
    {'6'   , 0      }, /* 0x5e */ \
    {'7'   , 0      }, /* 0x5f */ \
    {'8'   , 0      }, /* 0x60 */ \
    {'9'   , 0      }, /* 0x61 */ \
    {'0'   , 0      }, /* 0x62 */ \
    {'.'   , 0      }, /* 0x63 */ \
    {'<'   , '>'    }, /* 0x64  GER */ \
    {0     , 0      }, /* 0x65 */ \
    {0     , 0      }, /* 0x66 */ \
    {'='   , '='    } /* 0x67 */ \

rumbledethumps commented 1 year ago

@Technikfreak2002, I can use this to build out the next layer. Isn't GR typically used for Greece? I thought Germany was DE (Deutschland).

Technikfreak2002 commented 1 year ago

It depends. When you talk about the locale, it is DE. But when you set the layout, it is GR. Like "keyb gr" from old MS-DOS. Don't know why.

rumbledethumps commented 1 year ago

Also, what about the AltGr key? Don't you need that for full ASCII (like @)?

Technikfreak2002 commented 1 year ago

Yes, this key is also needed, at least for these: {} [] \ ~ | @ The € ² ³ and µ also need AltGr. But I think the Alt modifier needs a different table. I modified the layout because I do not own a US keyboard, and it is quite difficult if your layout doesn't match :D It's usable, so I shared it. :P

rumbledethumps commented 1 year ago

I added a column for AltGr and send everything through the unicode translation. It looks like dead keys are needed too. I don't know what else is needed and don't have time to do the research. If someone wants to work on this to completion, build it around a #define and I'll add a setting for it.
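
For anyone picking this up, a table with an AltGr column could be laid out roughly as below. This is a sketch, not the checked-in code: `KEYMAP_DE`, `keymap_lookup()` and the `MOD_*` names are invented here, and only a few German entries are shown. The modifier bit values are the ones from the HID keyboard report (left Shift 0x02, right Shift 0x20, right Alt 0x40).

```c
#include <stdint.h>

/* Modifier bits as they appear in the HID keyboard report. */
#define MOD_LEFT_SHIFT  0x02u
#define MOD_RIGHT_SHIFT 0x20u
#define MOD_RIGHT_ALT   0x40u /* AltGr */

/* Hypothetical three-column table: {plain, Shift, AltGr}, all Unicode
 * code points. Unlisted entries are 0, meaning "no character". */
static const uint16_t KEYMAP_DE[128][3] = {
    [0x08] = {'e', 'E', 0x20AC},    /* AltGr+E -> € */
    [0x10] = {'m', 'M', 0x00B5},    /* AltGr+M -> µ */
    [0x14] = {'q', 'Q', '@'},
    [0x1f] = {'2', '"', 0x00B2},    /* AltGr+2 -> ² */
    [0x20] = {'3', 0x00A7, 0x00B3}, /* Shift+3 -> §, AltGr+3 -> ³ */
    [0x24] = {'7', '/', '{'},
    [0x25] = {'8', '(', '['},
    [0x26] = {'9', ')', ']'},
    [0x27] = {'0', '=', '}'},
    [0x2d] = {0x00DF, '?', '\\'},   /* ß */
    [0x30] = {'+', '*', '~'},
    [0x64] = {'<', '>', '|'},
};

/* Pick the column from the modifier byte. AltGr wins over Shift in
 * this sketch; a fuller layout might want a fourth Shift+AltGr column. */
static uint16_t keymap_lookup(uint8_t keycode, uint8_t modifier)
{
    if (keycode >= 128)
        return 0;
    if (modifier & MOD_RIGHT_ALT)
        return KEYMAP_DE[keycode][2];
    if (modifier & (MOD_LEFT_SHIFT | MOD_RIGHT_SHIFT))
        return KEYMAP_DE[keycode][1];
    return KEYMAP_DE[keycode][0];
}
```

The result would then go through the Unicode-to-code-page step like any other key.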

rumbledethumps commented 1 year ago

I added a CMakeLists.txt option to help collect the basic unicode tables. Once we have the full requirements, a config format should emerge. 65dd4999deb97385b53145a356b38e754688768c

ulften commented 10 months ago

I experimented with the Swedish keyboard and made a kbd_sv.h file. The issue that I see is that while keyboard mappings work for character values below 128, everything between 128 and 255 (or -128 to -1) does not print. I noticed several places in kbd.c where a char type is used and I suspected that unsigned char would be needed. But even after changing those that I found, it still did not work. The problem might be higher up the stack in the typedef struct stdio_driver stdio_driver_t, but I might also be barking up the entirely wrong tree. kbd.c.txt kbd_sv.h.txt
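
To illustrate the suspicion (not a confirmed diagnosis): in C, whether plain char is signed is implementation-defined, and some targets default to unsigned. Where it is signed, a byte such as 0x99 turns negative as soon as it is widened to int. The two helper names below are made up for this example.

```c
/* Widening a signed char sign-extends: the byte 0x99 stored in a
 * signed char has the value -103 and stays -103 as an int. */
static int widen_signed(signed char c)
{
    return c;
}

/* Converting to unsigned char first keeps the value in 0..255, which
 * is what a table index or a character code usually needs. */
static int widen_as_byte(signed char c)
{
    return (unsigned char)c;
}
```

So a check like `c >= 0x80` or an array index built from a signed char can silently go wrong, which would match the "below 128 works" symptom if char were signed on the target.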

rumbledethumps commented 10 months ago

Ö is Unicode 0x00D6 but you have 0x0099, which is Ö in cp850. Also, SV is El Salvador. Please use ISO 3166-1 alpha-2. Edit: you changed my mind, use ISO 639-2 codes.

rumbledethumps commented 10 months ago

I was thinking some more about ISO 3166 (country codes) and came across ISO 639 (language codes), which makes more sense for keyboards. In that case SV means Swedish/svenska (not Sweden/Sverige). The thing about standards is that there are so many to choose from. https://xkcd.com/927/

ulften commented 10 months ago

I was feeling 30 years out-of-date after your first message for not keeping up with ISO standards and for not fully grasping unicode. After your 2nd message I can focus on catching up on unicode :-) and remain lost in the ISO standards for now. The cartoon summed it up nicely!

ulften commented 10 months ago

Thanks to your help, I am getting close to a working Swedish keyboard now. åÅäÄöÖ and most other keys are working. The supposedly silent keys are of course not so silent and are printing without waiting for the next char, i.e.: é'eë are printed directly without waiting for the 'e' (which I guess is known), but otherwise it seems OK.

On the other hand, strings in the code, when printed, for instance: "These are the special Swedish characters: åÅ, äÄ, öÖ", do not come out right on VGA, only on minicom. I guess we still need to fix Unicode for the VGA display.

I changed the file name to kbd_swe.h in accordance with 639-2, but have made the defines to accept both 'swe' and 'se' as keyboard settings since 639-2 mentions that 639-1 and 639-2 should be considered synonyms. kbd_swe.h.txt
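
One way the two spellings could share a table, sketched with invented names (`KBD_TABLE_SWE`, `KBD_SELECT`, `kbd_unicode`). It assumes RP6502_KEYBOARD arrives from the build as a bare token such as SWE or SV; the attached header may well do this differently.

```c
#include <stdint.h>

/* Hypothetical, heavily shortened Swedish table: {plain, Shift}. */
#define KBD_TABLE_SWE                        \
    [0x2f] = {0x00E5, 0x00C5}, /* å Å */     \
    [0x33] = {0x00F6, 0x00D6}, /* ö Ö */     \
    [0x34] = {0x00E4, 0x00C4}  /* ä Ä */

/* The ISO 639-1 code is simply an alias for the ISO 639-2 table. */
#define KBD_TABLE_SV KBD_TABLE_SWE

/* Normally set by the build; defaulted here so the sketch compiles. */
#ifndef RP6502_KEYBOARD
#define RP6502_KEYBOARD SV
#endif

/* Two-step paste so RP6502_KEYBOARD is expanded before ## is applied. */
#define KBD_SELECT_(name) KBD_TABLE_##name
#define KBD_SELECT(name)  KBD_SELECT_(name)

static const uint16_t KEYCODE_TO_UNICODE[128][2] = {
    KBD_SELECT(RP6502_KEYBOARD)
};

/* Look up the Unicode code point for a keycode, 0 if unmapped. */
static uint16_t kbd_unicode(uint8_t keycode, int shift)
{
    if (keycode >= 128)
        return 0;
    return KEYCODE_TO_UNICODE[keycode][shift ? 1 : 0];
}
```

With this shape, an unknown layout name fails at compile time because `KBD_TABLE_<name>` is undefined, which is arguably nicer than silently falling back to US.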

ulften commented 10 months ago

not 'se' - I meant 'sv'

rumbledethumps commented 10 months ago

6502 applications and the terminal use the code page. Unicode is too much for 8-bit 64k systems. The idea is an app checks codepage() and faults if an unsupported one is selected. Or an app can do it all itself with direct keyboard and character graphics.
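
As a sketch of that check on the application side: the list of supported pages and both function names are hypothetical, and the active code page is passed in as a parameter (it would come from codepage()) so the example does not depend on the real call's signature.

```c
#include <stdbool.h>
#include <stddef.h>

/* Code pages this hypothetical application ships character data for. */
static const int SUPPORTED_CODE_PAGES[] = {437, 850};

static bool code_page_supported(int cp)
{
    size_t n = sizeof SUPPORTED_CODE_PAGES / sizeof SUPPORTED_CODE_PAGES[0];
    for (size_t i = 0; i < n; i++) {
        if (SUPPORTED_CODE_PAGES[i] == cp)
            return true;
    }
    return false;
}

/* Returns 0 if the app can continue, -1 if it should refuse to run.
 * active_cp is whatever the system reported as the current code page. */
static int check_code_page(int active_cp)
{
    return code_page_supported(active_cp) ? 0 : -1;
}
```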

Dead keys (silent?) aren't yet implemented. We don't have those here. I have to research their nuances unless someone else does it.
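
As a starting point for that research, one common way to handle dead keys is to hold the accent until the next key arrives and then either compose the pair or emit both. The sketch below is not what the firmware does; `compose_t`, `COMPOSE` and `dead_key_feed()` are invented names, and the compose list is a tiny sample.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t dead, base, composed;
} compose_t;

/* Tiny sample of compose pairs, all Unicode code points. */
static const compose_t COMPOSE[] = {
    {0x00B4, 'e', 0x00E9}, /* ´ + e -> é */
    {0x0060, 'e', 0x00E8}, /* ` + e -> è */
    {0x00A8, 'e', 0x00EB}, /* ¨ + e -> ë */
    {0x00A8, 'u', 0x00FC}, /* ¨ + u -> ü */
};
#define COMPOSE_COUNT (sizeof COMPOSE / sizeof COMPOSE[0])

/* Simplification: a code point counts as dead if it starts any pair.
 * A real layout would flag dead keys per key, not per code point. */
static int is_dead(uint16_t cp)
{
    for (size_t i = 0; i < COMPOSE_COUNT; i++)
        if (COMPOSE[i].dead == cp)
            return 1;
    return 0;
}

/* Feed one code point. *pending holds a dead key waiting for its base
 * (0 = none). Writes up to two code points to out, returns the count. */
static int dead_key_feed(uint16_t *pending, uint16_t cp, uint16_t out[2])
{
    if (*pending == 0) {
        if (is_dead(cp)) {
            *pending = cp; /* hold it, print nothing yet */
            return 0;
        }
        out[0] = cp;
        return 1;
    }
    uint16_t dead = *pending;
    *pending = 0;
    for (size_t i = 0; i < COMPOSE_COUNT; i++) {
        if (COMPOSE[i].dead == dead && COMPOSE[i].base == cp) {
            out[0] = COMPOSE[i].composed;
            return 1;
        }
    }
    out[0] = dead; /* no pair: emit the accent on its own */
    if (cp == ' ')
        return 1;  /* dead key + space gives just the accent */
    out[1] = cp;
    return 2;
}
```

The composed code point would still need to exist in the active code page, so the compose step would sit before the Unicode-to-code-page conversion.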

It's checked in. Simply change RP6502_KEYBOARD in CMakeLists.txt until we have a complete solution.

rumbledethumps commented 10 months ago

Here's a list of what Windows uses: https://ss64.com/locale.html I think we're on the right track now. Turns out you sometimes have to use both 639 and 3166 like EN_US and EN_UK.