Chinese character can't be displayed with printFixed()

cwl769 commented 3 months ago

Describe the bug: When I use the function printFixed() to display Chinese characters, the display shows different, incorrect characters instead. [image: IMG_20240804_184149]

To Reproduce: Steps to reproduce the behavior:

1. Create a font containing Chinese characters.
2. Try to print characters on your LCD, for example printFixed("你好");.

Expected behavior: "你好" appears on my LCD. [image: IMG_20240804_174925]

Please complete the following information:

library version: make after commit 4a34c0a
LCD display type: ssd1306 I2C
OS: linux
Platform: raspberryPi
IDE: none

cwl769 commented 3 months ago

I have found the cause: the function uint16_t NanoFont::unicode16FromUtf8(uint8_t), defined in src/canvas/font.cpp, only supports UTF-8 sequences of at most 2 bytes (i.e. U+0000 to U+07FF). Chinese characters lie in the range U+4E00 to U+9FA5, which takes 3 bytes in UTF-8, so they can't be displayed properly.
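To make the byte counts concrete, here is a minimal sketch of the UTF-8 length rule. The helper name utf8Length is hypothetical and is not part of lcdgfx; it only illustrates why code points above U+07FF cannot fit in the 2-byte form.

```cpp
#include <cstdint>

// Number of bytes UTF-8 uses to encode a code point (0 if out of range).
// Hypothetical helper for illustration only; surrogates are not treated specially.
constexpr int utf8Length(uint32_t cp)
{
    return cp <= 0x7F     ? 1   // ASCII
         : cp <= 0x7FF    ? 2   // upper limit of the 2-byte form
         : cp <= 0xFFFF   ? 3   // rest of the Basic Multilingual Plane
         : cp <= 0x10FFFF ? 4   // supplementary planes
                          : 0;
}

static_assert(utf8Length(0x07FF) == 2, "last code point that fits in two bytes");
static_assert(utf8Length(0x4F60) == 3, "U+4F60 needs three bytes");
static_assert(utf8Length(0x9FA5) == 3, "end of the range quoted above");
```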

lexus2k commented 3 months ago

Could you please share the font you use in your example? How did you generate it? The UTF-8 standard supports up to 4 bytes per character, and the library itself supports 16-bit Unicode code points.

lexus2k commented 3 months ago

This is the function that reads Unicode codes in the font-processing part:

static const uint8_t *ssd1306_readUnicodeRecord(SUnicodeBlockRecord *r, const uint8_t *p)
{
    r->start_code = ((static_cast<uint16_t>(pgm_read_byte(&p[0])) << 8)) | static_cast<uint16_t>(pgm_read_byte(&p[1]));
    r->count = pgm_read_byte(&p[2]);
    return (r->count > 0) ? (&p[3]) : nullptr;
}

There is no limitation except that the Unicode code point must fit in 16 bits.
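As a rough illustration of that record layout, the sketch below parses the same 3-byte header from ordinary memory. BlockHeader and readBlockHeader are hypothetical names, and plain array reads stand in for pgm_read_byte, so this is not the library's code.

```cpp
#include <cstdint>

// Hypothetical stand-in for SUnicodeBlockRecord with only the two fields read above.
struct BlockHeader
{
    uint16_t start_code; // first code point covered by the block
    uint8_t count;       // number of glyphs in the block
};

// Reads a 3-byte header: start code high byte, start code low byte, glyph count.
BlockHeader readBlockHeader(const uint8_t *p)
{
    BlockHeader h;
    h.start_code = static_cast<uint16_t>((p[0] << 8) | p[1]); // big-endian 16-bit value
    h.count = p[2];
    return h;
}
```

A header of 0x4E, 0x00, 0x02 would therefore describe a block of two glyphs starting at U+4E00, which is well within the 16-bit limit.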

And here is the function that converts UTF-8 to Unicode code points:

uint16_t NanoFont::unicode16FromUtf8(uint8_t ch)
{
#ifdef CONFIG_SSD1306_UNICODE_ENABLE
    static uint16_t unicode = 0;
    ch &= 0x00FF;
    if ( !unicode )
    {
        if ( ch >= 0xc0 )
        {
            unicode = ch;
            return SSD1306_MORE_CHARS_REQUIRED;
        }
        return ch;
    }
    uint16_t code = ((unicode & 0x1f) << 6) | (ch & 0x3f);
    unicode = 0;
    return code;
#else
    return ch;
#endif
}
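To see what this version does with a 3-byte character, the sketch below copies its logic into a free function and feeds it the UTF-8 bytes of "你" (0xE4 0xBD 0xA0). The names oldUnicode16FromUtf8, decodeWithOld and kMoreCharsRequired are hypothetical, and the sentinel value 0xFFFF is an assumption standing in for SSD1306_MORE_CHARS_REQUIRED.

```cpp
#include <cstdint>
#include <vector>

// Assumption: stand-in for SSD1306_MORE_CHARS_REQUIRED.
constexpr uint16_t kMoreCharsRequired = 0xFFFF;

// Same logic as the function above, as a free function for experimenting.
uint16_t oldUnicode16FromUtf8(uint8_t ch)
{
    static uint16_t unicode = 0;
    if (!unicode)
    {
        if (ch >= 0xc0)
        {
            unicode = ch; // remember the lead byte
            return kMoreCharsRequired;
        }
        return ch;
    }
    // Always treats the sequence as 2 bytes long.
    uint16_t code = ((unicode & 0x1f) << 6) | (ch & 0x3f);
    unicode = 0;
    return code;
}

// Feeds every byte to the decoder and collects each completed value.
std::vector<uint16_t> decodeWithOld(const std::vector<uint8_t> &bytes)
{
    std::vector<uint16_t> out;
    for (uint8_t b : bytes)
    {
        uint16_t code = oldUnicode16FromUtf8(b);
        if (code != kMoreCharsRequired)
        {
            out.push_back(code);
        }
    }
    return out;
}
```

With the input 0xE4 0xBD 0xA0, the first two bytes are combined into 0x013D and the third byte is passed through as 0x00A0, so the display is asked for two unrelated glyphs instead of U+4F60.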

This is the only place that needs to be updated. It can be changed like this:

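uint16_t NanoFont::unicode16FromUtf8(uint8_t ch)
{
#ifdef CONFIG_SSD1306_UNICODE_ENABLE
    static uint16_t unicode = 0;     // code point being assembled
    static uint8_t ucode_bytes = 0;  // number of continuation bytes expected
    static uint8_t ucode_index = 0;  // 0 when idle, otherwise bytes consumed so far
    ch &= 0x00FF;
    if (ucode_index == 0)
    {
        // Not inside a sequence: anything below 0xc0 is returned unchanged
        if ( ch < 0xc0 )
        {
            return ch;
        }
        ucode_index++;
        if ( ch < 0xe0 )
        {
            // 110xxxxx: 2-byte sequence
            ucode_bytes = 1;
            unicode = ch & 0x1f;
        }
        else if ( ch < 0xf0 )
        {
            // 1110xxxx: 3-byte sequence
            ucode_bytes = 2;
            unicode = ch & 0x0f;
        }
        else
        {
            // 11110xxx: 4-byte sequence (the result does not fit in 16 bits)
            ucode_bytes = 3;
            unicode = ch & 0x07;
        }
    }
    else
    {
        // Continuation byte: append its low 6 bits
        unicode = (unicode << 6) | (ch & 0x3f);
        if (ucode_index == ucode_bytes)
        {
            ucode_index = 0;
            return unicode;
        }
        ucode_index++;
    }
    return SSD1306_MORE_CHARS_REQUIRED;
#else
    return ch;
#endif
}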
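As a quick check of the proposed change, the sketch below copies its logic into a free function and decodes the UTF-8 bytes of "你好". The names newUnicode16FromUtf8, decodeWithNew and kMoreCharsRequired are hypothetical, and 0xFFFF is an assumed stand-in for SSD1306_MORE_CHARS_REQUIRED.

```cpp
#include <cstdint>
#include <vector>

// Assumption: stand-in for SSD1306_MORE_CHARS_REQUIRED.
constexpr uint16_t kMoreCharsRequired = 0xFFFF;

// Same logic as the updated function above, as a free function for experimenting.
uint16_t newUnicode16FromUtf8(uint8_t ch)
{
    static uint16_t unicode = 0;
    static uint8_t ucode_bytes = 0;
    static uint8_t ucode_index = 0;
    if (ucode_index == 0)
    {
        if (ch < 0xc0)
        {
            return ch;
        }
        ucode_index++;
        if (ch < 0xe0)      { ucode_bytes = 1; unicode = ch & 0x1f; }
        else if (ch < 0xf0) { ucode_bytes = 2; unicode = ch & 0x0f; }
        else                { ucode_bytes = 3; unicode = ch & 0x07; }
    }
    else
    {
        unicode = (unicode << 6) | (ch & 0x3f);
        if (ucode_index == ucode_bytes)
        {
            ucode_index = 0;
            return unicode;
        }
        ucode_index++;
    }
    return kMoreCharsRequired;
}

// Feeds every byte to the decoder and collects each completed code point.
std::vector<uint16_t> decodeWithNew(const std::vector<uint8_t> &bytes)
{
    std::vector<uint16_t> out;
    for (uint8_t b : bytes)
    {
        uint16_t code = newUnicode16FromUtf8(b);
        if (code != kMoreCharsRequired)
        {
            out.push_back(code);
        }
    }
    return out;
}
```

Feeding 0xE4 0xBD 0xA0 0xE5 0xA5 0xBD yields U+4F60 and U+597D, the code points of "你" and "好", while 1-byte and 2-byte input decodes as before.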

cwl769 commented 3 months ago

I have put my font here, and explained how I made it in the README file.

cwl769 commented 3 months ago

The font-processing part works well, but the old version of the conversion function can never return a value greater than 0x07ff.

I also updated that function; my implementation is:

uint16_t NanoFont::unicode16FromUtf8(uint8_t ch)
{
#ifdef CONFIG_SSD1306_UNICODE_ENABLE
    static uint16_t unicode = 0;
    static uint8_t rest = 0;
    if(ch & 0x80)
    {
        if(ch & 0x40)
        {
            uint8_t mask = 0x1f;
            rest = 1;
            while( ((~mask) & ch) == ((~mask) & 0xff) ) mask >>= 1, ++rest;
            unicode = ch & mask;
            return SSD1306_MORE_CHARS_REQUIRED;
        }
        else
        {
            unicode = (unicode << 6) | (ch & 0x3f);
            return (--rest) ? SSD1306_MORE_CHARS_REQUIRED : unicode;
        }
    }
    return ch;
#else
    return ch;
#endif
}

I ran this code on my Raspberry Pi, and it works as expected.
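Both updated versions assume well-formed input. In the version just above, for example, a stray continuation byte received while rest is 0 makes --rest wrap around to 255, and a 4-byte sequence overflows the 16-bit accumulator in either version. One possible hardening, offered only as a sketch, is shown below. Utf8Decoder16, kMoreCharsRequired and kReplacement are hypothetical names, 0xFFFF is an assumed stand-in for SSD1306_MORE_CHARS_REQUIRED, and returning U+FFFD for bad input is a design choice of this example, not library behavior.

```cpp
#include <cstdint>

constexpr uint16_t kMoreCharsRequired = 0xFFFF; // assumption: stand-in for SSD1306_MORE_CHARS_REQUIRED
constexpr uint16_t kReplacement = 0xFFFD;       // U+FFFD REPLACEMENT CHARACTER

// Decoder that keeps its state in an object instead of static variables.
struct Utf8Decoder16
{
    uint32_t value = 0;    // code point being assembled
    uint8_t remaining = 0; // continuation bytes still expected

    uint16_t feed(uint8_t ch)
    {
        if (remaining == 0)
        {
            if (ch < 0x80) return ch; // ASCII
            if (ch < 0xC0 || ch >= 0xF8) return kReplacement; // stray continuation or invalid lead byte
            if (ch < 0xE0)      { remaining = 1; value = ch & 0x1F; }
            else if (ch < 0xF0) { remaining = 2; value = ch & 0x0F; }
            else                { remaining = 3; value = ch & 0x07; }
            return kMoreCharsRequired;
        }
        if ((ch & 0xC0) != 0x80)
        {
            // Expected a continuation byte; drop the partial sequence and this byte.
            remaining = 0;
            return kReplacement;
        }
        value = (value << 6) | (ch & 0x3F);
        if (--remaining != 0) return kMoreCharsRequired;
        // Anything that does not fit below the sentinel is reported as U+FFFD.
        return value <= 0xFFFE ? static_cast<uint16_t>(value) : kReplacement;
    }
};
```

Keeping the state in an object also lets two text streams be decoded independently, which static variables do not allow.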

cwl769 commented 3 months ago

By the way, I don't understand the purpose of this line:

ch &= 0x00FF;

Please advise.

(I'm sorry. I'm a beginner, and the question may seem stupid.)

lexus2k commented 3 months ago

Hi, you don't have to say sorry. The line has no effect and can be removed.
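The reason the mask is redundant is that a uint8_t can only hold values from 0x00 to 0xFF, so clearing the bits above the low eight changes nothing. A mask like this is only useful when the value arrives in a wider or signed type. The small check below, with the hypothetical name maskIsNoOp, walks through every possible byte value.

```cpp
#include <cstdint>

// Returns true when masking with 0x00FF leaves every possible uint8_t value unchanged.
bool maskIsNoOp()
{
    for (int v = 0; v <= 0xFF; ++v)
    {
        uint8_t ch = static_cast<uint8_t>(v);
        uint8_t masked = ch;
        masked &= 0x00FF; // the line in question
        if (masked != ch)
        {
            return false;
        }
    }
    return true;
}
```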

lexus2k / lcdgfx

Chinese character can't be displayed with printFixed() #122