Closed msftrncs closed 1 year ago
@msftrncs Glad to see someone who knows their regex!
I've been working on a cleaned-up version of this repo over here if you want to refer to it. Eventually I'll merge it into this one.
Sadly, block_comment and inline_comment are both needed, and they are different. Just try removing one and you'll probably be able to see the pattern. The inline comment should be faster than the block comment, but it only works if the comment fits on one line. Block is the only one capable of matching multi-line comments, but because of that it can't be embedded into other single-line patterns, like the std_space pattern.
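To make the one-line limitation concrete, here is a rough Ruby sketch. It assumes TextMate's usual behaviour of applying a match pattern to one line at a time, and approximates that by testing each line separately. The helper name and the sample strings are made up for illustration and are not part of the grammar or the library.

```ruby
# The inline comment regex as changed in this thread.
INLINE_COMMENT = /(\/\*)(?:[^*]++|\*+(?!\/))*+(\*\/)/

# Hypothetical helper: approximate line-at-a-time matching by
# testing every line of the input on its own.
def inline_comment_on_each_line(text)
  text.lines.map { |line| INLINE_COMMENT.match?(line) }
end

ONE_LINE   = "int x; /* fits on one line */\n"
MULTI_LINE = "/* starts here\n   ends here */\n"

p inline_comment_on_each_line(ONE_LINE)    # => [true]
p inline_comment_on_each_line(MULTI_LINE)  # => [false, false]
```

Neither line of the multi-line comment contains both the opener and the closer, so a single-regex pattern never fires on it. That is the gap the begin/end style block_comment fills.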
I'll have to spend a bit more time looking at the improved version. Only numbered capture groups are allowed in TextMate, so the library has to know about capture groups in order to line them up with the correct TextMate scope when they're embedded into other patterns. This means it might be hard to get that kind of low-level optimization, and the 0th capture group ($0) might not be present when the pattern is embedded in another pattern.
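A small Ruby sketch of the renumbering problem, in case it helps. The outer DECLARATION pattern is invented purely for illustration, and this is plain regex interpolation, not how the library actually tracks groups.

```ruby
# The inline comment regex as changed in this thread.
INLINE_COMMENT = /(\/\*)(?:[^*]++|\*+(?!\/))*+(\*\/)/

# Hypothetical outer pattern that embeds the comment pattern
# between two captured words.
DECLARATION = /(\w+)\s*#{INLINE_COMMENT.source}\s*(\w+)/

STANDALONE = INLINE_COMMENT.match("/* note */")
EMBEDDED   = DECLARATION.match("int /* note */ x")

p STANDALONE.captures  # => ["/*", "*/"]
p EMBEDDED.captures    # => ["int", "/*", "*/", "x"]

# Group 0 is always the whole match, so once embedded it no longer
# refers to the comment alone.
p STANDALONE[0]  # => "/* note */"
p EMBEDDED[0]    # => "int /* note */ x"
```

Standalone, the comment delimiters are groups 1 and 2. Embedded, they become groups 2 and 3, so any scope assigned by group number has to be shifted, and anything that relied on group 0 has nothing to attach to.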
I'm not exactly sure what you mean by "Inter construct inline comment capturing", or what you're recommending. If those inline comments are not injected 377 times, the grammar can't parse correctly, as far as I understand. TextMate is limited, and aggressively copy/pasting patterns everywhere is often the only way to get correct parsing. I created the Ruby library specifically to do that kind of aggressive copy-pasting in a maintainable way.
I changed the inline comment regex to your version 👍
Checklist
"C_Cpp.enhancedColorization": "Disabled"
The Criticism
The concern here is: (only speaking for the resulting tmLanguage file)
block_comment and inline_comment are effectively identical and thus redundant, and block_comment could serve for all of the /**/ comment needs. inline_comment otherwise has room for performance improvement.
(\/\*)((?:[^\*]|(?:\*)++[^\/])*+((?:\*)++\/))
(\/\*)(?:[^*]++|\*+(?!\/))*+(\*\/)
/****** this is a ****** comment ******/
there is a 300% step reduction just on the match.
inline_comment
inline_comment, yet three original captures are further scoped in the construct's captures, literally identically to inline_comment, a total of 18 lines in the tmLanguage file per instance, and there appear to be approximately 377 instances, or nearly 6800 total lines of a 20,000 line document.
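For anyone who wants to sanity-check the two regexes above, here is a rough Ruby sketch that runs both against a few inputs and compares the matched text. Ruby's engine accepts the same possessive quantifiers, but it does not expose step counts, so this only checks that the two patterns agree on these samples. It says nothing about speed. All samples except the first are my own.

```ruby
# The longer regex and the shorter rewrite, both quoted in this issue.
ORIGINAL = /(\/\*)((?:[^\*]|(?:\*)++[^\/])*+((?:\*)++\/))/
IMPROVED = /(\/\*)(?:[^*]++|\*+(?!\/))*+(\*\/)/

SAMPLES = [
  "/****** this is a ****** comment ******/", # test string from the issue
  "/**/",
  "/***/",
  "x = 1; /* a * b */ y = 2;",
  "/* never closed",
].freeze

# Returns the matched text, or nil when the regex does not match.
def matched_text(regex, input)
  match = regex.match(input)
  match && match[0]
end

SAMPLES.each do |sample|
  same = matched_text(ORIGINAL, sample) == matched_text(IMPROVED, sample)
  puts "#{sample.inspect}: #{same ? 'same match' : 'DIFFERENT'}"
end
```

One difference to keep in mind when swapping them: the longer regex has three capture groups and the rewrite has two, so any captures keyed by group number need adjusting.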