Open ShufaW opened 3 months ago
It is definitely possible to support Chinese parentheses. (I didn't even realize they were different!) Do you have a list of punctuation equivalents, other than parentheses, that need to be supported? Here's the list so far.
（
）
I see a list of full-width punctuation equivalents here:
https://en.wikipedia.org/wiki/Chinese_punctuation
Is that a good list of Unicode symbols to define as equivalents?
Yes, I think this is a professional list.
Great, thanks for confirming! Do you have any interest in helping implement this? This issue isn't too difficult; it mainly involves adding some extra lines to Tokenizer.ts. If that sounds like a fun task, I can guide you on the PR process.
Yes sure! I can implement this.
The key place to look is Tokenizer.ts. There you'll see a list of token definitions, such as the ones defined for Sym.EvalOpen and Sym.EvalClose. Each entry gives a string or regular expression that identifies a particular token type. You'll want to create additional entries in this list, likely right next to the corresponding tokens (order matters), so that the new Unicode full-width versions also count as the same token types. The pull request should also add test cases to Tokenizer.test.ts to ensure that they tokenize correctly, at least one for each added symbol.
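To make the idea concrete, here is a minimal, self-contained sketch of an ordered pattern list in which full-width parentheses map to the same token types as their ASCII counterparts. It is not the project's actual code: only the names Sym.EvalOpen and Sym.EvalClose come from the discussion above, while the TokenPattern type, the patterns array, the other Sym members, and the tokenize function are hypothetical stand-ins, and the real list in Tokenizer.ts may be shaped differently.

```typescript
// Hypothetical token kinds; only EvalOpen and EvalClose are named in the thread.
enum Sym {
    EvalOpen = "EvalOpen",
    EvalClose = "EvalClose",
    Name = "Name",
}

// Hypothetical shape of one token definition: a literal string or a regex.
type TokenPattern = { pattern: string | RegExp; types: Sym[] };
type Token = { text: string; types: Sym[] };

// Order matters: the first pattern that matches wins.
// Each full-width entry sits right next to its ASCII equivalent.
const patterns: TokenPattern[] = [
    { pattern: "(", types: [Sym.EvalOpen] },
    { pattern: "\uFF08", types: [Sym.EvalOpen] }, // full-width left parenthesis
    { pattern: ")", types: [Sym.EvalClose] },
    { pattern: "\uFF09", types: [Sym.EvalClose] }, // full-width right parenthesis
    // Fallback: any run of characters that is not whitespace or a parenthesis.
    { pattern: /^[^\s()\uFF08\uFF09]+/, types: [Sym.Name] },
];

function tokenize(source: string): Token[] {
    const tokens: Token[] = [];
    let rest = source;
    while (rest.length > 0) {
        // Skip leading whitespace between tokens.
        const space = rest.match(/^\s+/);
        if (space) {
            rest = rest.slice(space[0].length);
            continue;
        }
        let matched = false;
        for (const { pattern, types } of patterns) {
            const text =
                typeof pattern === "string"
                    ? rest.startsWith(pattern)
                        ? pattern
                        : undefined
                    : rest.match(pattern)?.[0];
            if (text !== undefined && text.length > 0) {
                tokens.push({ text, types });
                rest = rest.slice(text.length);
                matched = true;
                break;
            }
        }
        if (!matched) throw new Error(`Unrecognized input: ${rest}`);
    }
    return tokens;
}
```

A test in the spirit of the ones requested for Tokenizer.test.ts would then check, for each added symbol, that the full-width form yields the same token type as the ASCII form, for example that tokenize("舞台（1m）") and tokenize("Stage(1m)") produce the same sequence of types.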
Hi Amy @amyjko, I would like to be assigned to this issue and will follow the instructions given above to fix this locale problem.
Wonderful, it's yours! See the pointers above, I'm happy to elaborate.
Oh Amy, I guess you assigned the wrong guy 🤣 @amyjko
Oops, fixed.
What are you trying to do that you can't?
For example, when I try to type "Stage([Shape(Circle(1m 0m 0m))])" in Chinese, it becomes "舞台([形状(圆圈(1m 0m 0m))])". However, I must use the English punctuation marks "(" and ")" instead of the Chinese versions "（" and "）", because the Chinese punctuation marks are not recognized. This makes it challenging for users in non-English environments, as they need to spend considerable time switching punctuation marks between languages.
What is your idea?
Is it possible to improve software localization so that it recognizes and interprets different punctuation marks and symbols based on the user's language settings? This would allow users to use their native symbols without issues. Alternatively, could there be more shortcuts for frequently used symbols?