jesse99 / Linguist

Generalized syntax highlighter addin for Visual Studio 2010
MIT License

Multiline Regex #2

Open mbdavid opened 12 years ago

mbdavid commented 12 years ago

Hello!

I'm trying to write a multiline regex, but it's not working. My expression should match anything between @{ and @} (not including the @{ and @} themselves).

I tried:

Header2: (?<=@{)(?:.|\n|\r)*?(?=@})

or

Header2: (?<=@{)[\s\S]*?(?=@})

Neither works. Do you have any idea?

Thanks

jesse99 commented 12 years ago

It's tricky because you want to match @}} using a non-greedy repetition operator, but as soon as the engine sees @} it will stop matching to satisfy the non-greedy criteria. I expect you could address it by making the match non-greedy and using balancing group definitions (which allow the engine to match nested expressions). See http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition.

mbdavid commented 12 years ago

Hi Jesse, thanks for your answer. I tested many multi-line regexes and none of them work, including the C/C++/C# multi-line comment pattern included in the language folder: /* (?: . | \r | \n)? \/

When I debug the code with a breakpoint at line 70 of Language.cs (MatchCollection matches = m_regex.Matches(text);), the "text" variable always holds only one line. While editing, it is the line being edited; when a file is opened, the whole source is passed in line by line. I'm using the latest source code (v0.4).
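The per-line behaviour described above can be reproduced outside Visual Studio. The following is a minimal sketch, not Linguist's code: it uses Python's re module in place of the .NET regex engine (an assumption made only for illustration), and the names match_whole and match_per_line and the sample text are hypothetical. The pattern is the second one from the first comment, with the braces escaped for Python.

```python
import re

# Second pattern from the thread: lazily match everything between "@{" and "@}".
# Assumption: Python's re stands in for .NET's Regex; braces are escaped as literals.
BLOCK = re.compile(r"(?<=@\{)[\s\S]*?(?=@\})")


def match_whole(text):
    """Run the pattern once over the entire buffer."""
    return BLOCK.findall(text)


def match_per_line(text):
    """Run the pattern on each line separately, as observed in the debugger."""
    found = []
    for line in text.splitlines(keepends=True):
        found.extend(BLOCK.findall(line))
    return found


# Hypothetical sample: the block opens on one line and closes on a later one.
source = "before @{\nfirst\nsecond\n@} after\n"

print(match_whole(source))     # the block body is found
print(match_per_line(source))  # nothing is found: no single line holds both delimiters
```

Under these assumptions the pattern itself is fine when it sees the whole text. It only fails when each call receives a single line, because the lookbehind for @{ and the lookahead for @} can never both be satisfied within one line.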

jesse99 commented 12 years ago

Ahh, I had forgotten about this. It's a fundamental limitation of Linguist. Linguist is a Microsoft.VisualStudio.Text.Classification.IClassifier, which is one of the simpler ways to extend Visual Studio, but in normal operation Linguist is called with spans of text covering one line at a time. And, as far as I can tell, there is no way for a classifier to get all of the text.