vkocubinsky / SublimeTableEditor

This package is no longer supported. I moved to vim.

Wide characters support! #17

Closed zealic closed 11 years ago

zealic commented 11 years ago

Table alignment does not work on text containing wide characters.

So I added wide character support (CJK: Chinese, Japanese, Korean).

vkocubinsky commented 11 years ago

Thanks for the code!

zealic commented 11 years ago

Please merge it, Thanks.

tjdoc commented 11 years ago

Another vote here. Please add this feature. Thanks!

vkocubinsky commented 11 years ago

I have 2 concerns:

1) Table Editor is distributed under the GPL license. The file widechar_support.py in your patch references http://svn.edgewall.org/repos/trac/trunk/trac/util/text.py, so if I take code from the patch I have to include the Trac license notice, "Copyright (C) 2003-2013 Edgewall Software".

2) I took a quick look at the problem and found that some Unicode characters have the width of 3 or even 4 normal characters. Ideally I would like to ask Sublime to calculate the width, since Sublime knows the real character width; maybe this information will be available in Sublime 3. Another approach is to find some public algorithm for calculating the width, an analog of your _widecharsupport.wlen function.

Thanks!
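
One publicly documented way to approximate the second point is the East Asian Width property from the Unicode standard, which Python exposes through the standard `unicodedata` module. The sketch below is not the code from the patch; `display_width` is a hypothetical name, and it assumes a monospaced font in which Wide and Fullwidth characters take exactly two cells, which is the very assumption questioned in this thread.

```python
import unicodedata


def display_width(text):
    """Return the number of monospace cells `text` is assumed to occupy.

    Assumption: East Asian Wide ('W') and Fullwidth ('F') characters take
    two cells, combining marks take none, everything else takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


print(display_width("abc"))
print(display_width("素晴"))
```

With a font that does not render CJK glyphs at exactly twice the ASCII advance, this count will not match what the editor draws.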

zealic commented 11 years ago

1) The breakable character ranges are defined by the Unicode standard. They are just ranges of numbers.

We can change the reference to unicode.org.

2) CJK characters always have a fixed width of two characters. See here


Please merge it, we need it.

vkocubinsky commented 11 years ago

I will merge

vkocubinsky commented 11 years ago

I tried the patch. The problem is that the width of a wide character is not equal to the width of 2 normal characters; in my Sublime it is 5/3 of a normal character. For example, the two lines in each of the following pairs have the same width in my Sublime 2 and Sublime 3 (OS X 10.8.4):

素晴素晴素晴
1234567890

之后本片也多次发行了家用
12345678901234567890
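
The 5/3 figure can be checked from the two pairs directly: if both lines of a pair render at the same width, the width of one CJK glyph in ASCII-character units is the ratio of the character counts. This is only a worked check of the arithmetic; `observed_ratio` is a hypothetical helper name.

```python
from fractions import Fraction

# The two line pairs quoted in this comment: (CJK line, ASCII line).
pairs = [
    ("素晴素晴素晴", "1234567890"),
    ("之后本片也多次发行了家用", "12345678901234567890"),
]


def observed_ratio(cjk_line, ascii_line):
    """Width of one CJK glyph in ASCII characters, given equal rendered widths."""
    return Fraction(len(ascii_line), len(cjk_line))


ratios = [observed_ratio(cjk, ascii_) for cjk, ascii_ in pairs]
print(ratios)
```

Both pairs give the same ratio, while a fixed two-cell model would need 12 and 24 ASCII characters to match these lines.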

zealic commented 11 years ago

Thank you for your work.

I think you have not set the Sublime font to a monospaced font. Check your font setting, then tell me the result.
The default Chinese font on OS X is Heiti SC, which is not a monospaced font.

You can use this open source font: wqy-zenhei

Change your Sublime font to WenQuanYi Zen Hei Mono and you will see that the text width is correct.

vkocubinsky commented 11 years ago

By default OS X uses monospaced fonts. I checked this by typing the 'w' character. I tried different built-in monospaced fonts on Mac OS: Monaco, Courier, Menlo, Osaka, Andale Mono, Courier New. The proportion can differ a bit between fonts, but I did not observe a wide character having the width of exactly 2 normal characters. Probably WenQuanYi Zen Hei Mono works; right now I don't know. I will check the situation on Windows and Linux.

zealic commented 11 years ago

'Monaco, Courier, Menlo, Osaka, Andale Mono, Courier New' do not include CJK characters, so OS X will fall back to the default font, Heiti SC.

WenQuanYi Zen Hei Mono works on my Linux and Windows machines. I do not have an OS X machine, so I cannot check it there.

vkocubinsky commented 11 years ago

I made a branch, wchar, where your patch is applied to the pandoc, emacs, simple, and reStructuredText syntaxes. You can try it. I don't know whether it works or not.

I tried to install WenQuanYi Zen Hei Mono on my Windows 7 computer. I copied wqy-zenhei.ttc into C:\Windows\Fonts. I saw 2 new fonts in this folder: WenQuanYi Zen Hei Medium, WenQuanYi Zen Hei Mono Medium. It is a bit strange that I saw only 'medium'.

In Microsoft Word I saw the expected result with WenQuanYi Zen Hei Mono, but in Sublime I saw that the width of a CJK character is not equal to the width of 2 normal characters. It looks like Sublime doesn't see WenQuanYi Zen Hei Mono and falls back to the default font.

tjdoc commented 11 years ago

Hi,

I tested your branch on my mac with Korean. Works well for Korean in general but grid table format seems to be broken (for both English and Korean). See the linked file for the test results.

https://dl.dropboxusercontent.com/u/1615626/TableEditor_sublime_wchar_test.7z

Cheers!

zealic commented 11 years ago

Thanks @tjdoc. I also tested my pandoc build from Markdown to PDF. It works properly.

@vkocubinsky I tested the wchar branch on my Windows machine using the WenQuanYi Micro Hei Mono font, and it does not work.

But the fonts below work:

Windows allows multiple fonts to work in parallel; see the Font Linking section HERE.

So on a Chinese-version Windows, choosing an English font applies the default Chinese font to CJK text (Windows 7 uses Microsoft YaHei), but the ASCII part of Microsoft YaHei is not monospaced. It is similar on an English-version Windows.

tjdoc commented 11 years ago

I realised that I was not using the right syntax for the grid_table. After correcting the syntax, everything is working as expected (for both Korean and English). No issues.

Sorry for the confusion and thanks for the great work!

vkocubinsky commented 11 years ago

I successfully formatted a table with the fonts from @tjdoc. Probably "WenQuanYi Micro Hei Mono" also works, because yesterday I set the font face in Sublime incorrectly. I committed changes for the textile syntax, and it looks like formatting works correctly for all syntaxes.

Navigation (the Prev and Next commands) works incorrectly. The reason is the TableDriver.get_cursor method. I will move TableDriver.get_cursor into table_plugin and rewrite it.
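
The thread does not say what exactly went wrong in TableDriver.get_cursor. One plausible cause, stated here purely as an assumption, is mixing display columns (where a CJK character counts as two cells) with character offsets (where it counts as one). The sketch below shows the conversion between the two; `cell_width` and `char_offset_for_column` are hypothetical names, not part of Table Editor.

```python
import unicodedata


def cell_width(ch):
    """Assumed width of one character in monospace cells (2 for wide/fullwidth)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def char_offset_for_column(text, column):
    """Return the index of the character that covers display column `column`.

    Returns len(text) when the column lies past the end of the line.
    """
    pos = 0  # display column where the current character starts
    for index, ch in enumerate(text):
        width = cell_width(ch)
        if column < pos + width:
            return index
        pos += width
    return len(text)


row = "| 한글 | ab |"
print(char_offset_for_column(row, 9))
```

In this row the letter 'a' sits at display column 9 but at character offset 7, so a cursor computed in one unit and applied in the other lands two characters off.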

vkocubinsky commented 11 years ago

I fixed navigation; please try the changes. For me it looks like Table Editor works for CJK.

tjdoc commented 11 years ago

Navigation feature seems to work fine for Korean.

It doesn't work properly when the last character in the table cell has a square, which surrounds the character. This indicates that the character input is incomplete. You can force character input completion by

Usually, even when the last character has a square surrounding it, pressing tab should force character completion and move on. However, in Sublime Text, the tab key deletes the last character without inserting the tab. I don't think this is a Table Editor problem but rather an issue with Sublime Text itself. I'll submit a bug report to Sublime Text.

Thanks for your work. Cheers!

vkocubinsky commented 11 years ago

As we discussed before, we have to get rid of the reference to http://svn.edgewall.org/repos/trac/trunk/trac/util/text.py. I tried to build breakable_char_ranges from the wiki page http://en.wikipedia.org/wiki/Han_unification and to match the wiki against widechar_support.py. Some char ranges in widechar_support.py do not exist in the wiki: Hiragana, Katakana, Hangul Compatibility Jamo, Kanbun, Hangul Jamo. Some char ranges in the wiki do not exist in widechar_support.py. Can we use only the char ranges from the wiki? If not, is there a full list of char ranges on the internet?

Thanks!

|                         widechar                         |                       wiki                       |
+----------------------------------------------------------+--------------------------------------------------+
| (0x1100, 0x11FF),   # Hangul Jamo                        |                                                  |
| (0x2E80, 0x2EFF),   # CJK Radicals Supplement            | CJK Radicals Supplement (2E80–2EFF)              |
| (0x3000, 0x303F),   # CJK Symbols and Punctuation        | CJK Symbols and Punctuation (3000–303F) (chart)  |
| (0x3040, 0x309F),   # Hiragana                           |                                                  |
| (0x30A0, 0x30FF),   # Katakana                           |                                                  |
| (0x3130, 0x318F),   # Hangul Compatibility Jamo          |                                                  |
| (0x3190, 0x319F),   # Kanbun                             |                                                  |
| (0x31C0, 0x31EF),   # CJK Strokes                        | CJK Strokes (31C0–31EF)                          |
| (0x3200, 0x32FF),   # Enclosed CJK Letters and Months    | Enclosed CJK Letters and Months (3200–32FF)      |
| (0x3300, 0x33FF),   # CJK Compatibility                  | CJK Compatibility (3300–33FF) (chart)            |
| (0x3400, 0x4DBF),   # CJK Unified Ideographs Extension A | CJK Unified Ideographs Extension A (3400–4DBF)   |
| (0x4E00, 0x9FFF),   # CJK Unified Ideographs             | CJK Unified Ideographs (4E00–9FFF)               |
| (0xA960, 0xA97F),   # Hangul Jamo Extended-A             |                                                  |
| (0xAC00, 0xD7AF),   # Hangul Syllables                   |                                                  |
| (0xD7B0, 0xD7FF),   # Hangul Jamo Extended-B             |                                                  |
| (0xF900, 0xFAFF),   # CJK Compatibility Ideographs       | CJK Compatibility Ideographs (F900–FAFF)         |
| (0xFE30, 0xFE4F),   # CJK Compatibility Forms            | CJK Compatibility Forms (FE30–FE4F)              |
| (0xFF00, 0xFFEF),   # Halfwidth and Fullwidth Forms      |                                                  |
|                                                          | CJK Unified Ideographs Extension B (20000–2A6DF) |
|                                                          | CJK Unified Ideographs Extension C (2A700–2B73F) |
|                                                          | CJK Unified Ideographs Extension D (2B740–2B81F) |
|                                                          | CJK Compatibility Ideographs (2F800–2FA1F)       |
|                                                          | Ideographic Description Characters (2FF0–2FFF)   |
|                                                          | Kangxi Radicals (2F00–2FDF)                      |
|                                                          |                                                  |
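
A range table like the widechar column above can be turned into a width function with a sorted lookup. The sketch below copies the ranges from that column verbatim; it is not the code from widechar_support.py, and `is_wide` and `range_based_width` are hypothetical names. It assumes every character inside a listed range takes two cells.

```python
from bisect import bisect_right

# Ranges copied from the widechar column of the table above (inclusive bounds).
WIDE_RANGES = [
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x2E80, 0x2EFF),  # CJK Radicals Supplement
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0x3190, 0x319F),  # Kanbun
    (0x31C0, 0x31EF),  # CJK Strokes
    (0x3200, 0x32FF),  # Enclosed CJK Letters and Months
    (0x3300, 0x33FF),  # CJK Compatibility
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xA960, 0xA97F),  # Hangul Jamo Extended-A
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0xD7B0, 0xD7FF),  # Hangul Jamo Extended-B
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
    (0xFE30, 0xFE4F),  # CJK Compatibility Forms
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms (also holds halfwidth characters)
]
_STARTS = [start for start, _end in WIDE_RANGES]  # sorted, so bisect applies


def is_wide(ch):
    """Return True if `ch` falls inside one of the listed ranges."""
    code = ord(ch)
    i = bisect_right(_STARTS, code) - 1  # last range starting at or before code
    return i >= 0 and code <= WIDE_RANGES[i][1]


def range_based_width(text):
    """Count two cells for characters in WIDE_RANGES and one for the rest."""
    return sum(2 if is_wide(ch) else 1 for ch in text)


print(range_based_width("한글 ab"))
```

The ranges listed only in the wiki column (Extension B and later) are left out here; adding them is a matter of appending tuples in sorted order.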

zealic commented 11 years ago

No. Hangul is the Korean script.

For the full Unicode charts, see the pages below:

vkocubinsky commented 11 years ago

I released version 1.7.2 with wide character support. I tested the basic cases. Please report any bugs you find.