VincentFoulon80 / console_engine

A simple terminal framework to draw things and manage user input
https://crates.io/crates/console_engine
MIT License
220 stars 7 forks source link

Emojis causing trouble #8

Closed 123marvin123 closed 3 years ago

123marvin123 commented 3 years ago

Describe the bug Using emojis in the console output will cause a shift when drawing shapes. I think this problem applies to all characters that use more than 1 bytes?

To Reproduce

use console_engine::screen::Screen;
use console_engine::pixel;

fn main() {
    // create a screen of 20x11 characters
    let mut scr = Screen::new(20,11);

    // draw some shapes and prints some text
    scr.rect(0,0, 19,10,pixel::pxl('#'));
    scr.fill_circle(5,5, 3, pixel::pxl('*'));
    scr.print(11,4, "💥");
    scr.print(11,5, "World!");

    // print the screen to the terminal
    scr.draw();
}

Expected behavior

####################
#                  #
#   ***            #
#  *****           #
# *******  💥      #
# *******  World!  #
# *******          #
#  *****           #
#   ***            #
#                  #
####################

Actual behavior

####################
#                  #
#   ***            #
#  *****           #
# *******  💥       #
# *******  World!  #
# *******          #
#  *****           #
#   ***            #
#                  #
####################

Additional context Running on macOS.

VincentFoulon80 commented 3 years ago

Thanks for reporting, it's definitely a bug, though it may be difficult to get rid of ... If you got any idea of how to fix it I would be thankful 😅

VincentFoulon80 commented 3 years ago

Looking at the result, I think that's because emojis occupy two characters instead of one. that would mean we need to skip one char after the emoji, and that is where issues can happen. If you print an emoji at (11, 4), it shows up on the 11th and 12th characters. What should happen if you then print something at (12,4) ? 🤔

123marvin123 commented 3 years ago

Well, it occupies the space of two characters, but will still be displayed as one. You can print the emoji at (11,4) and print something else at (12,4) and it will appear too.

image

Moving away from counting chars means that we would need to save unicode graphemes instead of individual u8 chars per Pixel. I think TermWiz does this too: https://crates.io/crates/termwiz

VincentFoulon80 commented 3 years ago

Well, it occupies the space of two characters, but will still be displayed as one. You can print the emoji at (11,4) and print something else at (12,4) and it will appear too.

I meant this if the bug was fixed somehow

Moving away from counting chars means that we would need to save unicode graphemes instead of individual u8 chars per Pixel.

I agree with you, doing this may fix the issue, but we will need to keep an eye on characters that need additional bytes but still has one character length.

Maybe the unicode-width crate could help find a way to fix this ? Edit: looking at termwiz I also found out that unicode-segmentation can be useful too

123marvin123 commented 3 years ago

Well, it occupies the space of two characters, but will still be displayed as one. You can print the emoji at (11,4) and print something else at (12,4) and it will appear too.

I meant this if the bug was fixed somehow

Oh sorry, it was a misunderstand then.

Moving away from counting chars means that we would need to save unicode graphemes instead of individual u8 chars per Pixel.

I agree with you, doing this may fix the issue, but we will need to keep an eye on characters that need additional bytes but still has one character length.

Maybe the unicode-width crate could help find a way to fix this ? Edit: looking at termwiz I also found out that unicode-segmentation can be useful too

I already tried using unicode-segmentation and did some modifications. Sadly, it doesn't work and I don't know why, maybe you can take a look at my fork (https://github.com/123marvin123/console_engine/commit/2303c15fccc9133132a06b3f272bf68dc54c05b9#diff-74659ea3e6d931d656bc58a408a09b8b548e4e29a04ecb3c6a06163f18677a3bR179)? I had to do a lot of cloning because I needed to change the Pixel struct to use strings instead of u8 chars which can be easily copyable in contrast to strings...

VincentFoulon80 commented 3 years ago

It seems that

"💥".graphemes(false).count();

returns 1. I tried to switch the parameter but I still get 1...

123marvin123 commented 3 years ago

Ok, I think we don't really need unicode-segmentation. Unicode-width is what we need I believe. Printing the width of the 💥 emoji returns 2. I tried going into the screen::print() function and move the terminal cursor to the left if we have a character that has a display width of > 2 but somehow the whole character will disappear?

    pub fn draw(&self) {
        let mut output = std::io::stdout();
        crossterm::terminal::enable_raw_mode().unwrap();
        for i in 0..self.width * self.height {
            let pixel = &self.screen[i as usize];
            let char_width = (UnicodeWidthChar::width(pixel.chr).unwrap() - 1) as u16;
            execute!(
                output,
                style::SetForegroundColor(pixel.fg),
                style::SetBackgroundColor(pixel.bg),
                style::Print(pixel.chr),
                crossterm::cursor::MoveLeft(char_width)
            )
            .unwrap();

            if i != self.width * self.height - 1 && i % self.width == self.width - 1 {
                execute!(output, style::Print("\r\n")).unwrap();
            }
        }
        crossterm::terminal::disable_raw_mode().unwrap();
    }
VincentFoulon80 commented 3 years ago

Hmm.. maybe try skipping characters when encountering character with width > 1 ? Skipping only one should do the trick, and will avoid the issue with some emojis having width of 4 instead of 2 edit: by skipping I mean ignoring the next character

123marvin123 commented 3 years ago

Hmm.. maybe try skipping characters when encountering character with width > 1 ? Skipping only one should do the trick, and will avoid the issue with some emojis having width of 4 instead of 2 edit: by skipping I mean ignoring the next character

What exactly do you mean by ignoring the next character? I checked the rust documentation and it says that every character in rust is 4 bytes long. Which means, Emojis and all unicode characters will fit in a rust character, but the terminal still prints them at a bigger width. So I can't really "skip" rust characters, I can only move the cursor.

VincentFoulon80 commented 3 years ago

What I meant was something like this:

let mut skip_next = false;
for i in 0..self.width * self.height {
    if skip_next { 
        skip_next = false;
        continue; 
    }
    if UnicodeWidthChar::width(pixel.chr).unwrap() > 1 {
        skip_next = true;
    }
    // ...
}
VincentFoulon80 commented 3 years ago

Here's a fix for this issue : compare bugfix/8-emojis with master Can you try it out and tell me if it works for you ? Is this solution acceptable ?

Side note: Some terminals that don't support emojis will have the opposite issue (but not all): image

123marvin123 commented 3 years ago

Here's a fix for this issue : compare bugfix/8-emojis with master Can you try it out and tell me if it works for you ? Is this solution acceptable ?

Side note: Some terminals that don't support emojis will have the opposite issue (but not all): image

Thanks for your time investment, I haven't tried to implement your solution myself. I cloned the bugfix branch and it works great :) You could add the emoji code branch as a feature, so if somebody only wants to use ascii, he/she can turn unicode support off & on.

VincentFoulon80 commented 3 years ago

Thinking about it, is there really a need for a separate feature in order to display unicode characters ? I mean, either you use a unicode character and get a normal result but with some terminals not handling them correctly, or you don't use them and this behavior never happen. 🤔

VincentFoulon80 commented 3 years ago

Resolved in v1.5.1