Closed dehero closed 3 years ago
Thanks, @dehero. Forgive my ignorance with Cyrillic languages! Before I accept your pull request, can you test tags, too (e.g. tag:value), as those regex patterns also begin and end with the \b token? I'd like to fix those at the same time, if needed. Looking at my code, I can't remember why I thought I needed the word boundary in the pattern, but there may be some edge cases I need to consider.
I have now fixed tags too.
(Screenshots: before and after.)
> Forgive my ignorance with Cyrillic languages!
It would be more accurate to say that Cyrillic languages were initially ignored by the developers of regular expressions.
> I'm looking at my code, I can't remember why I think I needed the word boundary in the pattern but there may be some edge cases I need to consider.
Regarding boundary tokens: by removing them, we allow not only non-Latin letters but also any other non-whitespace characters, so these become valid:
+pro,ject, // project
@c*(n)tex: // context
\:$ // tag
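To make the effect concrete, here is a minimal sketch. The three patterns are hypothetical stand-ins that simply accept any run of non-whitespace characters; they are assumed for illustration and are not copied from the todotxt-mode source.

```typescript
// Hypothetical patterns with no trailing \b; assumed for illustration only,
// not taken from the todotxt-mode source.
const looseProject = new RegExp("\\+\\S+"); // +project
const looseContext = new RegExp("@\\S+");   // @context
const looseTag = new RegExp("\\S+:\\S+");   // key:value

// Returns the first match of `pattern` in `text`, or null if there is none.
function matchLoose(pattern: RegExp, text: string): string | null {
  const m = pattern.exec(text);
  return m === null ? null : m[0];
}

console.log(matchLoose(looseProject, "Buy milk +pro,ject, now")); // prints +pro,ject,
console.log(matchLoose(looseContext, "Buy milk @c*(n)tex: now")); // prints @c*(n)tex:
console.log(matchLoose(looseTag, "Buy milk \\:$ now"));           // prints \:$
```

Punctuation such as the trailing comma or colon ends up inside the highlighted token, which is the trade-off described above.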
Though the todo.txt format has a specification of sorts, I cannot find any details there on which symbols are allowed or disallowed. Each editor or highlighter acts on its own.
I generally think that todotxt-mode token parsing needs some more refinement for non-letter symbols. But for now we are just fixing a more significant issue: it's obvious that non-Latin letters should be allowed.
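One possible direction for that refinement, sketched under the assumption that a project name consists of letters, digits, underscores, and hyphens in any script. Both the pattern and the allowed character set are assumptions for illustration; neither the todo.txt specification nor todotxt-mode prescribes them.

```typescript
// Assumed rule (hypothetical): a project name is letters, digits, "_" or "-"
// in any script. The "u" flag enables Unicode property escapes such as \p{L}.
const strictProject = new RegExp("\\+[\\p{L}\\p{N}_-]+", "u");

// Returns the first project-like token in `text`, or null if there is none.
function matchStrict(text: string): string | null {
  const m = strictProject.exec(text);
  return m === null ? null : m[0];
}

console.log(matchStrict("Позвонить +семья сегодня")); // prints +семья
console.log(matchStrict("Buy milk +pro,ject, now"));  // prints +pro
```

With this variant the Cyrillic name is still matched in full, while the match stops at the first punctuation character instead of swallowing it.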
Hello.
The \b token at the end of the regex search pattern prevented Cyrillic and other non-Latin names for projects and contexts from being found and highlighted. Removing it solves the problem, though the regex can now consume more symbols than was initially expected. I suppose it's worth it.
(Screenshots: before and after.)
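The failure can be reproduced in a few lines, assuming a JavaScript regex engine. There, \b is defined in terms of \w, which covers only ASCII letters, digits, and the underscore, so no word boundary exists between a Cyrillic letter and the space that follows it. The patterns below are simplified stand-ins, assumed for illustration rather than copied from the extension.

```typescript
// Hypothetical project patterns (not the extension's actual ones).
const withBoundary = new RegExp("\\+\\S+\\b"); // requires a trailing word boundary
const withoutBoundary = new RegExp("\\+\\S+"); // trailing \b removed

const latinTask = "Call mom +family today";
const cyrillicTask = "Позвонить маме +семья сегодня";

// Returns the first match of `pattern` in `text`, or null if there is none.
function firstMatch(pattern: RegExp, text: string): string | null {
  const m = pattern.exec(text);
  return m === null ? null : m[0];
}

console.log(firstMatch(withBoundary, latinTask));       // prints +family
console.log(firstMatch(withBoundary, cyrillicTask));    // prints null
console.log(firstMatch(withoutBoundary, cyrillicTask)); // prints +семья
```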