Gruntfuggly / todo-tree

Use ripgrep to find TODO tags and display the results in a tree view

Multiline regex highlighting entire file #644

Open lane-hogan opened 2 years ago

lane-hogan commented 2 years ago

I am trying to emulate the highlighting style of the JetBrains IDEs, where any comment line directly below a tag that is indented by 2 spaces is highlighted together with it:

// TODO: Do the stuff
//  and the other stuff

I've tried the regex supplied in the docs ("todo-tree.regex.regex": "(//)\\s*($TAGS).*(\\n\\s*//\\s{2,}.*)*"), but rather than behaving as advertised, it highlights everything below the // TAG: line, not just the comment lines indented by 2 spaces.

I have tested this in a C++ project on Windows 11. I've also tried other variations of the regex without success, so I'm not sure whether I am doing something wrong or whether a bug was introduced that broke the regex.

Logix-Dev commented 1 year ago

I'm also having this issue, and have also tried modifying the RegEx in all sorts of ways, to no avail.

This is a C++ project.

I've tried the one suggested in this issue and the ones suggested in the readme, unsuccessfully.

It highlights the entire file (the same happens when using /* */ block comments with the corresponding RegEx).


Gruntfuggly commented 1 year ago

Please can you try using a tool like https://regex101.com/ to test your regex?

lane-hogan commented 1 year ago

I've confirmed on my side that it works on tools such as https://regex101.com/ and it shows correctly on there, just not on this VSCode extension.
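The check described above can also be reproduced in a few lines of Python. This is only a sketch: it assumes $TAGS is replaced by the configured tags joined with | (TODO|FIXME here), and Python's re module is not the engine the extension uses, so it shows the intended match rather than what todo-tree does.

```python
import re

# Assumption: todo-tree substitutes $TAGS with the configured tags joined by "|".
TAGS = "TODO|FIXME"

# The regex from the docs after JSON unescaping (each \\ becomes a single \).
pattern = re.compile(r"(//)\s*(" + TAGS + r").*(\n\s*//\s{2,}.*)*")

source = (
    "// TODO: Do the stuff\n"
    "//  and the other stuff\n"
    "// an unrelated comment\n"
    "int x = 0;\n"
)

match = pattern.search(source)
matched_lines = match.group(0).split("\n")

# Only the tag line and the comment line with 2+ spaces after // are matched.
print(matched_lines)
```

In this engine the match stops before the unrelated comment, which is the behaviour the docs describe.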

ADHDSquir commented 1 year ago

Here's an example of a simple multi-line regex from my settings.json that matches Python comments, but it should work for many other languages if you change the comment character from # to whatever you need. This does not require setting enableMultiLine to true, as it explicitly defines line breaks within the expression. This format is from the JSON, so \ characters are escaped as \\. If using the settings GUI, replace \\ with a single \.

"todo-tree.regex.regex": "^[ \\t]*# ($TAGS)[^\\n]*(\\n[ \\t]*# [^\\n]*)*",

My RegEx skills are lacking but here's an explanation of what this is doing (I think 🙂):

^ : Start the match at the beginning of a line.
[ \\t]* : Match any amount of leading spaces and/or tabs.
"# " : Match the Python comment character followed by a single space.
($TAGS) : Match a todo-tree tag.
[^\\n]* : Match any characters other than a new line. This is the secret to making this whole thing work! Typically you would use . to match all characters except newline, but that seems to cause all kinds of issues with todo-tree even though the official examples use . in their code.
(\\n[ \\t]*# [^\\n]*)* : Finally, match multiple lines (if any) using the same logic as before, just without the ($TAGS) and starting with a new line. The group matches \\n (a new line), [ \\t]* (any amount of tabs/spaces), # followed by a single space, and [^\\n]* (any characters other than a new line). The trailing * lets the entire group match zero or more times in immediate succession.
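The breakdown above can be tried out with a short Python sketch. The assumptions are that $TAGS expands to TODO|FIXME and that Python's re module is close enough to illustrate the pattern. The last two searches show one general reason [^\n] is a safer choice than . : when an engine runs with a dot-all flag, . also matches line breaks. Whether todo-tree enables such a flag is not established in this thread.

```python
import re

# Assumption: $TAGS is replaced by the configured tags joined by "|".
TAGS = "TODO|FIXME"

# The regex from settings.json after JSON unescaping.
pattern = re.compile(
    r"^[ \t]*# (" + TAGS + r")[^\n]*(\n[ \t]*# [^\n]*)*",
    re.MULTILINE,
)

source = (
    "x = 1\n"
    "    # TODO: first line\n"
    "    # second line\n"
    "    y = 2\n"
    "# plain comment\n"
)

match = pattern.search(source)
matched_lines = match.group(0).split("\n")
print(matched_lines)

# With a dot-all flag, ".*" runs to the end of the input,
# while "[^\n]*" still stops at the end of the line.
dot_all = re.search(r"# (" + TAGS + r").*", source, re.DOTALL).group(0)
no_newline = re.search(r"# (" + TAGS + r")[^\n]*", source, re.DOTALL).group(0)
print(repr(dot_all))
print(repr(no_newline))
```

The multi-line match covers only the tag line and the comment line directly below it. The dot-all comparison illustrates how a single . can turn a one-line match into one that reaches the end of the file.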

CAESIUS-TIM commented 1 year ago

@jshtz4 Thank you very much! This is a significant improvement. But there is still a bug.

What it needs to be: [image: right]

What happens instead: [image: bug]

Lines that contain ($TAGS) should not be matched as continuation lines, but the Rust regex crate has no look-around feature. I do not know how to solve it. ☹
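The problem can be reproduced in Python, again assuming $TAGS expands to TODO|FIXME. The second pattern is a hypothetical look-around-free variant that was not proposed in this thread: it requires at least two spaces after the comment character on continuation lines, in the spirit of the JetBrains-style indentation from the original report. It is a sketch of one possible workaround, not a confirmed fix for todo-tree.

```python
import re

# Assumption: $TAGS is replaced by the configured tags joined by "|".
TAGS = "TODO|FIXME"

# Regex from the earlier comment: every following "# " line counts as a continuation.
greedy = re.compile(
    r"^[ \t]*# (" + TAGS + r")[^\n]*(\n[ \t]*# [^\n]*)*",
    re.MULTILINE,
)

# Hypothetical variant: continuation lines need two spaces after "#".
indented = re.compile(
    r"^[ \t]*# (" + TAGS + r")[^\n]*(\n[ \t]*#  [^\n]*)*",
    re.MULTILINE,
)

source = (
    "# TODO: first item\n"
    "#   continues here\n"
    "# FIXME: second item\n"
)


def matches(pattern):
    """Return the text of every match of pattern in source."""
    return [m.group(0) for m in pattern.finditer(source)]


print(matches(greedy))    # the FIXME line is swallowed by the TODO match
print(matches(indented))  # the FIXME line becomes its own match
```

The trade-off is that a tag line written with two spaces after the comment character would still be absorbed as a continuation, so this only helps when tags are consistently written with a single space.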

Rust crate: regex

This crate provides routines for searching strings for matches of a regular expression (aka “regex”). The regex syntax supported by this crate is similar to other regex engines, but it lacks several features that are not known how to implement efficiently. This includes, but is not limited to, look-around and backreferences. In exchange, all regex searches in this crate have worst case O(m * n) time complexity, where m is proportional to the size of the regex and n is proportional to the size of the string being searched.

CAESIUS-TIM commented 1 year ago

    "todo-tree.regex.enableMultiLine": true,
    "todo-tree.regex.regex": "(//|#|;|%|REM|^[ \\t]*\\*+|--|^|^[ \\t]*(-|\\d+.))\\s*($TAGS)|(\"\"\"|'''|<!--|/\\*+|--\\[\\[)\\s*($TAGS)[\\s\\S]*?(\"\"\"|'''|-->|\\*/|\\]\\])",
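The second alternative in this regex uses a lazy [\s\S]*? to span line breaks until the first closing delimiter. The Python sketch below is a reduced version that keeps only the /* ... */ pair and assumes $TAGS expands to TODO|FIXME. It is not the full setting and does not use the extension's engine, so it only illustrates why the quantifier is lazy.

```python
import re

# Assumption: $TAGS is replaced by the configured tags joined by "|".
TAGS = "TODO|FIXME"

# Reduced form of the block-comment alternative: lazy "[\s\S]*?" before the closer.
lazy = re.compile(r"/\*+\s*(" + TAGS + r")[\s\S]*?\*/")

# Same pattern with a greedy quantifier, for comparison only.
greedy = re.compile(r"/\*+\s*(" + TAGS + r")[\s\S]*\*/")

source = (
    "/* TODO: refactor\n"
    "   this function */\n"
    "int a; /* unrelated */\n"
)

lazy_match = lazy.search(source).group(0)
greedy_match = greedy.search(source).group(0)

print(repr(lazy_match))    # stops at the first "*/"
print(repr(greedy_match))  # runs on to the last "*/" in the input
```

The lazy form ends at the comment that contains the tag, while the greedy form would extend the match through every later block comment.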
