dandavison / delta

A syntax-highlighting pager for git, diff, grep, and blame output
https://dandavison.github.io/delta/
MIT License

🚀 diff chinese words better #1574

Open Freed-Wu opened 7 months ago

Freed-Wu commented 7 months ago

Currently:

[Image: Screenshot from 2023-11-29 13-00-24]

The left side is 图片的地址大小; the right side is 图片的地址.

Chinese words are not separated by spaces. In fact, 图片的地址大小 breaks down as:

图片 (picture) 的 ('s) 地址 (address) 大小 (size)

So, going by the meaning of the sentence, only 大小 should be highlighted on the left, and nothing should be highlighted on the right.
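
To illustrate the expected result, here is a minimal sketch in Rust. It assumes the text has already been segmented into words (the token lists would be written by hand or produced by some segmenter), and it only strips the common prefix of the two token lists instead of running a full diff. The function name `diff_after_common_prefix` is hypothetical and is not part of delta.

```rust
/// Returns the tokens of `left` and `right` that remain once their common
/// leading tokens are removed. This is a simplification for illustration,
/// not a general diff algorithm and not delta's actual implementation.
fn diff_after_common_prefix<'a>(
    left: &[&'a str],
    right: &[&'a str],
) -> (Vec<&'a str>, Vec<&'a str>) {
    // Count how many leading tokens are identical on both sides.
    let common = left
        .iter()
        .zip(right.iter())
        .take_while(|(a, b)| a == b)
        .count();
    // Everything after the shared prefix is what should be highlighted.
    (left[common..].to_vec(), right[common..].to_vec())
}
```

Called with the hand-segmented tokens `["图片", "的", "地址", "大小"]` and `["图片", "的", "地址"]`, the left remainder is the single token 大小 and the right remainder is empty, which matches the highlighting described above.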

dandavison commented 7 months ago

Thanks, this is an interesting problem. I guess the question is: what is the standard technique in a modern Rust application for performing this segmentation?

Freed-Wu commented 7 months ago

In other languages, I know of some libraries for segmentation:

I guess Rust also has similar libraries, although I haven't searched.

zhaihao commented 1 month ago

> Thanks, this is an interesting problem. I guess the question is: what is the standard technique in a modern Rust application for performing this segmentation?

In fact, Chinese does not need segmentation; character-level highlighting is sufficient.
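
As a rough sketch of what character-level comparison could look like in Rust, the function below walks both strings one `char` (Unicode scalar value) at a time and returns whatever follows their common prefix. The name `char_level_suffixes` is hypothetical, and this is prefix stripping only, not a full diff and not how delta is implemented.

```rust
/// Splits `left` and `right` at the end of their common prefix, comparing
/// `char`s rather than bytes or whitespace-delimited words.
fn char_level_suffixes<'a>(left: &'a str, right: &'a str) -> (&'a str, &'a str) {
    // Sum the UTF-8 byte lengths of the matching leading characters.
    // Matching characters are identical, so the byte offset is valid
    // for slicing both strings.
    let prefix_len: usize = left
        .chars()
        .zip(right.chars())
        .take_while(|(a, b)| a == b)
        .map(|(a, _)| a.len_utf8())
        .sum();
    (&left[prefix_len..], &right[prefix_len..])
}
```

For the strings from the original report, 图片的地址大小 and 图片的地址, this leaves 大小 on the left and an empty string on the right, with no word segmentation involved. Whether character-level and word-level results agree for other inputs is not something this sketch establishes.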