biomejs / biome

A toolchain for web projects, aimed to provide functionalities to maintain them. Biome offers formatter and linter, usable via CLI and LSP.
https://biomejs.dev
Apache License 2.0

🐛 biome breaks characters in stdin mode #2988

Open ton1517 opened 2 months ago

ton1517 commented 2 months ago

Environment information

CLI:
  Version:                      1.7.3
  Color support:                true

Platform:
  CPU Architecture:             aarch64
  OS:                           macos

Environment:
  BIOME_LOG_DIR:                unset
  NO_COLOR:                     unset
  TERM:                         "screen-256color"
  JS_RUNTIME_VERSION:           "v20.12.2"
  JS_RUNTIME_NAME:              "node"
  NODE_PACKAGE_MANAGER:         "yarn/4.1.0"

Biome Configuration:
  Status:                       Loaded successfully
  Formatter disabled:           false
  Linter disabled:              false
  Organize imports disabled:    false
  VCS disabled:                 false

Workspace:
  Open Documents:               0

What happened?

When using stdin mode in combination with "--colors off", some characters in the output are broken.

image

I'm not a rustacean but have investigated the code.

The broken characters are combining character sequences: a single displayed character that is represented by multiple code points.

fn main() {
    println!("{}", "⚠️".escape_unicode()); // \u{26a0}\u{fe0f}
    println!("{}", "â".escape_unicode()); // \u{61}\u{302}
    println!("{}", "ｶﾞ".escape_unicode()); // \u{ff76}\u{ff9e}
    println!("{}", "👨🏻‍🦱".escape_unicode()); // \u{1f468}\u{1f3fb}\u{200d}\u{1f9b1}
}

Passing --colors off, or not writing to a terminal, results in ColorChoice::Never. https://github.com/biomejs/biome/blob/fec262f1593c53e4d6c46f6934e3e2ebc2144edc/crates/biome_console/src/lib.rs#L94-L114

If you run biome format in stdin mode, the entire code is output here. https://github.com/biomejs/biome/blob/fec262f1593c53e4d6c46f6934e3e2ebc2144edc/crates/biome_cli/src/execute/std_in.rs#L66-L68

Finally, this write_str is called. https://github.com/biomejs/biome/blob/fec262f1593c53e4d6c46f6934e3e2ebc2144edc/crates/biome_console/src/write/termcolor.rs#L146

The write_str method converts characters when running on Windows or when the color choice is ColorChoice::Never. Because it uses grapheme.chars().nth(0), only the first code point of every combining character sequence is output. https://github.com/biomejs/biome/blob/fec262f1593c53e4d6c46f6934e3e2ebc2144edc/crates/biome_console/src/write/termcolor.rs#L167-L173
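To make the truncation concrete, here is a minimal sketch of the effect. It is not Biome's actual code: the helper names first_code_point_only and keep_whole_graphemes are hypothetical, and the grapheme clusters are split by hand so that only the standard library is needed (real code would presumably use a grapheme segmentation library).

```rust
// Hypothetical helpers illustrating the truncation; not Biome's actual code.

/// Mimics the lossy path: keep only the first code point of each grapheme cluster.
fn first_code_point_only(graphemes: &[&str]) -> String {
    graphemes
        .iter()
        .filter_map(|grapheme| grapheme.chars().nth(0))
        .collect()
}

/// Lossless alternative: write every grapheme cluster back unchanged.
fn keep_whole_graphemes(graphemes: &[&str]) -> String {
    graphemes.concat()
}

fn main() {
    // Assumption: clusters are pre-split by hand here instead of being
    // produced by a segmentation library.
    let clusters = ["\u{26a0}\u{fe0f}", "a\u{302}", "\u{ff76}\u{ff9e}"];

    let lossy = first_code_point_only(&clusters);
    let lossless = keep_whole_graphemes(&clusters);

    // Print the code points so the dropped ones are visible.
    println!("{}", lossy.escape_unicode());
    println!("{}", lossless.escape_unicode());

    // The variation selector, combining accent, and voiced sound mark are lost.
    assert_eq!(lossy, "\u{26a0}a\u{ff76}");
    assert_eq!(lossless, "\u{26a0}\u{fe0f}a\u{302}\u{ff76}\u{ff9e}");
}
```

The lossy variant drops every code point after the first in each cluster, which matches the broken output described above. Writing the whole cluster back keeps the text intact.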

codesandbox

https://codesandbox.io/p/devbox/serene-swirles-6z4jfd?embed=1&file=%2Fsrc%2Findex.js%3A5%2C1&workspaceId=6833e942-7d98-4ed2-8f40-9a265aa14215

Expected result

The code should be output as-is, without any character conversion.

ematipico commented 2 months ago

Your investigation will be VERY USEFUL! Thank you!

As a workaround, you can use --colors=force.

ton1517 commented 2 months ago

@ematipico I usually run biome check --apply --stdin-file-path $FILENAME in vim. When I add --colors force, the characters are not broken, but terminal formatting codes are inserted at the beginning and end of the file.

vim