yzhang-gh / vscode-markdown

Markdown All in One
https://marketplace.visualstudio.com/items?itemName=yzhang.markdown-all-in-one
MIT License
2.91k stars, 322 forks

Poor table alignment for some characters #1220

Open 1284685832 opened 1 year ago

1284685832 commented 1 year ago

What's the problem

[Screenshot] You can see that the table is not aligned.

What's the expected result

According to the "Punctuation marks in Chinese" section of Requirements for Chinese Text Layout, I think some full-width characters are not included in the width calculation.

They are:

—    2014    EM DASH
‘    2018    LEFT SINGLE QUOTATION MARK     // Half-width; full-width in Chinese fonts
’    2019    RIGHT SINGLE QUOTATION MARK    // Half-width; full-width in Chinese fonts
“    201C    LEFT DOUBLE QUOTATION MARK     // Half-width; full-width in Chinese fonts
”    201D    RIGHT DOUBLE QUOTATION MARK    // Half-width; full-width in Chinese fonts
…    2026    HORIZONTAL ELLIPSIS            // Full-width
‼    203C    DOUBLE EXCLAMATION MARK
⁇    2047    DOUBLE QUESTION MARK
⋯    22EF    MIDLINE HORIZONTAL ELLIPSIS    // Full-width
・    30FB    KATAKANA MIDDLE DOT            // Half-width
﹏    FE4F    WAVY LOW LINE                  // Full-width
●    25CF    BLACK CIRCLE
⸺    2E3A    TWO-EM DASH                    // Twice the width of a full-width character
⸻    2E3B    THREE-EM DASH                  // Three times the width of a full-width character

How to reproduce

Consider more cases when calculating width.

Other information

This is a test table:

| Example            | Length                                       | Total length  | Name                     | String example                                                                                        |
| ------------------ | -------------------------------------------- | ------------- | ------------------------ | ----------------------------------------------------------------------------------------------------- |
| "By the way......" | 1 (narrow)                                   | 15 En         | East_Asian_Halfwidth(H)  | !@#$%^&*()_+-=:>?qwertyuiop[]\asdfghjkl;'zxcvbnm./                                                    |
| 「顺带提一嘴——」   | 2 (wide)                                     | 18 En         | East_Asian_Full_width(F) | !@#¥%……&*()——+-=:《》?qwertyuiop[]\asdfghjkl;zxcvbnm,./ |
| “顺带提一嘴……”     | ? (default narrow & wide in East Asian Font) | 16 En / 18 En | East_Asian_Ambiguous(A)  | ”“‘’                                                                                                  |
yzhang-gh commented 1 year ago

Thanks for the feedback and information.

We currently use this regex to detect CJK characters:

https://github.com/yzhang-gh/vscode-markdown/blob/c5f6f672293e1b1e3c7397c91d9816d19862ee08/src/tableFormatter.ts#L138-L139

Guess it is easy to add those punctuation marks. Would you be willing to open a PR for it? 😉
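
For readers following along, here is a minimal sketch of how a regex-based width calculation could be extended with the punctuation reported above. The ranges in `baseWide` and the names `baseWide`, `extraWide`, and `displayWidth` are assumptions made for illustration; they are not the pattern or functions at the linked lines of `tableFormatter.ts`. Only the characters the issue marks as full-width are added, since the quotation marks depend on the font.

```typescript
// Illustrative base ranges (CJK symbols and punctuation, kana, unified
// ideographs, full-width forms). An assumption, NOT the extension's real regex.
const baseWide = /[\u3000-\u30FF\u4E00-\u9FFF\uFF01-\uFF60\uFFE0-\uFFE6]/;

// Punctuation the issue reports as rendering full-width:
// EM DASH, HORIZONTAL ELLIPSIS, MIDLINE HORIZONTAL ELLIPSIS, WAVY LOW LINE.
const extraWide = /[\u2014\u2026\u22EF\uFE4F]/;

// Count 2 columns for every character matched by either regex, 1 otherwise.
// Neither regex has the `g` flag, so `test` keeps no state between calls.
function displayWidth(text: string): number {
  let width = 0;
  for (const ch of text) {
    width += baseWide.test(ch) || extraWide.test(ch) ? 2 : 1;
  }
  return width;
}

// The second row of the test table: corner brackets, five ideographs, two em dashes.
const sample = "\u300C\u987A\u5E26\u63D0\u4E00\u5634\u2014\u2014\u300D";
console.log(displayWidth(sample)); // 18, matching the "18 En" in the table
```

Keeping the extra punctuation in a separate regex makes it easy to see which characters were added for this issue, at the cost of a second `test` call per character.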

1284685832 commented 1 year ago

I'm not quite sure whether relying on cjkRegex can handle extra-wide characters like ⸺ (2E3A TWO-EM DASH) and ⸻ (2E3B THREE-EM DASH); it may need more modifications.

yzhang-gh commented 1 year ago

Of course you can introduce cjkRegex2 and cjkRegex3 if needed.
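
One way the tiered-regex idea might look is sketched below. The names `cjkRegex2` and `cjkRegex3` follow the suggestion in this thread, but their contents, the `cjkRegex` ranges, and the helper functions `charWidth` and `textWidth` are hypothetical. The 4- and 6-column widths come from the issue's description of the two dashes and may not match every monospace font.

```typescript
// Illustrative ranges only; NOT the extension's actual definitions.
const cjkRegex = /[\u3000-\u30FF\u4E00-\u9FFF\uFF01-\uFF60\uFFE0-\uFFE6]/; // 2 columns
const cjkRegex2 = /\u2E3A/; // TWO-EM DASH: assumed 2x a full-width char (4 columns)
const cjkRegex3 = /\u2E3B/; // THREE-EM DASH: assumed 3x a full-width char (6 columns)

// Check the widest tier first so each character gets exactly one width.
function charWidth(ch: string): number {
  if (cjkRegex3.test(ch)) return 6;
  if (cjkRegex2.test(ch)) return 4;
  if (cjkRegex.test(ch)) return 2;
  return 1;
}

// Sum the per-character widths to get the column count of a table cell.
function textWidth(text: string): number {
  let total = 0;
  for (const ch of text) {
    total += charWidth(ch);
  }
  return total;
}

console.log(textWidth("a\u2E3A")); // 5: one narrow column plus four for the two-em dash
```

A lookup table keyed by code point would scale better if more width tiers appear, but separate regexes stay closer to the existing approach described in the thread.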