olikraus / u8g2

U8glib library for monochrome displays, version 2

How to print accented characters with U8g2lib in character mode ? #2443

Closed Herwig9820 closed 1 month ago

Herwig9820 commented 1 month ago

Hi, I'm using OLED displays with your U8g2lib. Great library, but I cannot print accented characters (é, à, ...).

#define U8LOG_WIDTH 16                          // 16 characters wide
#define U8LOG_HEIGHT 8                          // 8 lines
U8X8_SSD1306_128X64_NONAME_HW_I2C u8x8_i2c;
U8X8LOG u8x8log_i2c;                                           // create object implementing text window with automatic vertical scrolling
uint8_t u8log_buffer_i2c[U8LOG_WIDTH * U8LOG_HEIGHT];          // allocate memory for display (width x height characters)

void setup() {
    ... 
    u8x8_i2c.begin();                                                               // initialize OLED object                                                                   
    u8x8_i2c.setFont(u8x8_font_chroma48medium8_r);                                  // set font
    u8x8log_i2c.begin(u8x8_i2c, U8LOG_WIDTH, U8LOG_HEIGHT, u8log_buffer_i2c);       // initialize OLED text window object, connect to U8x8, set character size and assign memory
    u8x8log_i2c.setRedrawMode(0);                                                   // set the U8x8log redraw mode. 0: Update screen with newline, 1: Update screen for every char
    ...
}

I found a u8glib issue, '452', for which you provided a solution for graphic mode. I tried various ways to do something similar (U8g2lib, character mode), but I couldn't find an 'enableUTF8Print' method for the objects I created. I'm probably doing something wrong. Could you help me with that?

Many thanks Herwig

olikraus commented 1 month ago

You need to check whether the required character is available in the selected font. In your example, the selected font doesn't contain the desired character: https://github.com/olikraus/u8g2/wiki/fntgrpopengameart#chroma48medium8

I suggest using this font instead: https://github.com/olikraus/u8g2/wiki/fntgrpcodeman38#pressstart2p

Herwig9820 commented 1 month ago

OK, in the code snippet above, I changed the font as follows:

    //u8x8_i2c.setFont(u8x8_font_chroma48medium8_r);                              
    u8x8_i2c.setFont(u8x8_font_pressstart2p_f);                                     

Looking at the character map, I can verify that the printed characters match their character codes. For example, printing 'char(224)' prints 'à', which is correct.

But when printing "à", it prints "Ä " (wrong character, followed by a space), which is not correct. Check: printing asc('à') returns 195. The character map of the font indeed indicates Ä for character code 195.

So, Arduino does not translate a character to the character code expected by the OLED. Example: it sends character code 195, and not 224, when printing 'à'.

Does that mean I'm again using a wrong font ? I would again appreciate your help.

Best regards Herwig

olikraus commented 1 month ago

The short suggestion is: read the hex code from the font table. For Ä this would be 0xc4. Hex codes can be embedded into strings using the \x escape sequence: "\xc4pfel" would print the German word "Äpfel" (apples).
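One detail of the \x escape is worth a small illustration: in C and C++ the escape keeps consuming hex digits for as long as they follow. The sketch below is plain C++ and does not touch the display. The names kAepfel, kAecker and byteAt are made up for this example, and the second word ("Äcker", fields) was chosen only because its second letter happens to be a hex digit.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// "Äpfel": Ä is the single ISO 8859-1 byte 0xC4. The next character 'p' is
// not a hex digit, so the escape ends after "c4".
static const char kAepfel[] = "\xc4pfel";

// "Äcker": here the next character 'c' IS a hex digit, so "\xc4cker" would be
// read as the escape \xc4c, which does not fit into one byte. Splitting the
// literal ends the escape early; the compiler joins adjacent literals again.
static const char kAecker[] = "\xc4" "cker";

// Returns the byte at position i as an unsigned value (0..255), i.e. in the
// same form as the character codes listed in the font table.
static uint8_t byteAt(const char* s, size_t i) {
    assert(s != nullptr);
    return static_cast<uint8_t>(s[i]);
}
```

Either string could then be handed to the print call already used in the sketch above, for example `u8x8log_i2c.print(kAepfel);`.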

Long answer: If you enter "Ä" (or any other char) on your keyboard, then:

  1. Your text editor translates the key into the encoding that is configured in your text editor. This might be UTF-8 in the case of the Arduino IDE. The UTF-8 encoding for Ä is "0xC3 0x84", which has nothing to do with 0xC4.
  2. Once your C code contains the byte sequence "0xC3 0x84" (remember, you will always see Ä because your editor is configured for UTF-8), the C compiler has to decide how to interpret "0xC3 0x84". Usually it doesn't do much, but some remarks are here: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/cpp/Character-sets.html
  3. Finally, the software (in this case U8g2 / U8x8) has to decode the encoded character and fetch the correct glyph from the font. In this case, u8x8 and u8g2 have to assume which input encoding you use, because there is no way to tell u8g2 / u8x8 how your editor is configured.
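To make steps 1 and 2 concrete for the "à" from the question, the following desktop C++ sketch dumps the bytes of a string. hexBytes and the two constants are hypothetical names for this illustration only. The UTF-8 bytes are written as explicit escapes so the result does not depend on how this file itself is saved.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats every byte of a NUL-terminated string as two-digit upper-case hex,
// separated by single spaces.
static std::string hexBytes(const char* s) {
    assert(s != nullptr);
    std::string out;
    for (; *s != '\0'; ++s) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02X",
                      static_cast<unsigned>(static_cast<unsigned char>(*s)));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// What a UTF-8 editor stores for "à" (U+00E0), written out byte by byte.
static const char kUtf8AGrave[] = "\xC3\xA0";

// What an ISO 8859-1 font table expects for "à": the single byte 224.
static const char kLatin1AGrave[] = "\xE0";
```

Here hexBytes(kUtf8AGrave) yields "C3 A0" and hexBytes(kLatin1AGrave) yields "E0". The first byte, 0xC3, is the 195 reported above, and 0xA0 is the non-breaking space in ISO 8859-1, which would explain the extra blank after the wrong character.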

Now, this is what u8g2 and u8x8 will do: Both will assume ISO/IEC 8859 encoding. U8g2 (as a special feature) also supports UTF8 if enabled (https://github.com/olikraus/u8g2/wiki/u8g2reference#enableutf8print) or if you use the drawUTF8 function.

U8x8 has a similar function, but my suggestion is not to trust that u8x8 function.

Coming back to your problem: Your IDE will probably use UTF8 encoding, however U8x8 assumes ISO/IEC 8859 encoding... so no wonder the wrong chars will appear. So better use the \xYY number code from the u8x8 font table.
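If typing \xYY codes by hand becomes tedious, one possible alternative (not something proposed in this thread, and not part of U8g2 / U8x8) is to convert UTF-8 text to ISO 8859-1 before printing. The helper below is a hypothetical sketch under that assumption. It only handles code points up to U+00FF and replaces everything else with '?'.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (NOT a U8g2/U8x8 function): converts UTF-8 text to
// ISO 8859-1 bytes. Code points above U+00FF and malformed input become '?'.
// Writes at most outSize-1 bytes plus a terminating NUL and returns the
// number of bytes written (without the NUL).
static size_t utf8ToLatin1(const char* in, char* out, size_t outSize) {
    assert(in != nullptr && out != nullptr && outSize > 0);
    size_t n = 0;
    while (*in != '\0' && n + 1 < outSize) {
        const uint8_t b0 = static_cast<uint8_t>(*in++);
        if (b0 < 0x80) {
            // Plain ASCII is identical in both encodings.
            out[n++] = static_cast<char>(b0);
        } else if ((b0 == 0xC2 || b0 == 0xC3) &&
                   (static_cast<uint8_t>(*in) & 0xC0) == 0x80) {
            // Two-byte sequence covering U+0080..U+00FF.
            const uint8_t b1 = static_cast<uint8_t>(*in++);
            out[n++] = static_cast<char>(((b0 & 0x03) << 6) | (b1 & 0x3F));
        } else {
            // Longer or invalid sequence: skip its continuation bytes.
            while ((static_cast<uint8_t>(*in) & 0xC0) == 0x80) ++in;
            out[n++] = '?';
        }
    }
    out[n] = '\0';
    return n;
}
```

With a buffer such as `char buf[32];`, a call like `utf8ToLatin1("à", buf, sizeof buf)` in a UTF-8 source file would leave the single byte 224 in buf, which could then be passed to the existing print call. The font still has to contain the glyph, so a full font such as u8x8_font_pressstart2p_f remains necessary.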

Herwig9820 commented 1 month ago

OK, thanks for the quick reply.