sublimehq / sublime_text

Issue tracker for Sublime Text
https://www.sublimetext.com

Context with a lookahead and `clear_scopes` removes scopes in other contexts #3870

Open imbrish opened 3 years ago

imbrish commented 3 years ago

Description

When clear_scopes: true is used in a context with a lookahead in match, the scopes will also be cleared for text matched in other contexts, if the matched text has no explicit scope assigned.

Steps to reproduce

  1. Create Syntax Minimal.sublime-syntax in the User package:
    
    %YAML 1.2
    ---
    name: 'Syntax Minimal'
    scope: invalid.illegal

    contexts:
      main:

  1. Apply Syntax Minimal to Test Minimal.txt.

Expected behavior

image

Actual behavior

image

  1. Once the issue is triggered, the example can be modified freely and scopes around [this] will always be cleared.
  2. If the example is modified to a state that would not trigger the issue in the first place, re-saving the unmodified syntax fixes it.
  3. Now the changes in the example can be undone and the highlighting will be correct!
  4. The problem occurs for all matched text that does not have an additional scope assigned, so removing captures will cause invalid.illegal to be cleared from the whole line.
  5. Replacing match: '^\s*(?=#)' with match: '^(?=\s*#)', i.e. moving the whole match into the lookahead, fixes the issue.
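For reference, the two variants from the last note can be put side by side. This is only a sketch: the full syntax file is not reproduced here, so the context name `directive-body` and its contents are hypothetical placeholders, not the original definition.

```yaml
contexts:
  main:
    # Variant reported to trigger the issue: leading whitespace is consumed
    # by the match, only '#' sits inside the lookahead.
    - match: '^\s*(?=#)'
      push: directive-body   # hypothetical context name

  main-workaround:
    # Variant reported to avoid the issue: the whole pattern is a lookahead,
    # so the match consumes no text at all.
    - match: '^(?=\s*#)'
      push: directive-body   # hypothetical context name

  directive-body:
    # Assumed body: clears all scopes until the end of the line.
    - clear_scopes: true
    - match: '$'
      pop: true
```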

Environment

deathaxe commented 3 years ago

That's expected behavior.

The clear_scopes command is, so to speak, the counterpart of meta_scope and behaves like it with regard to boundaries. That means it clears meta scopes from the current context, including the region captured by the initial match that pushed the context onto the stack.

You need an intermediate context to solve that issue:

  preprocessor:
    - match: '^(?=\s*#)'
      push:
        - match: '#'
          set:
            - clear_scopes: true
            - meta_content_scope: support.function
            - match: '(?=\s*$)'
              pop: true

What we would need to solve this kind of issue is a clear_content_scopes as the counterpart of meta_content_scope, which would support your use case. I have already had one or two situations where I wished for something like that, too.

The main purpose of clear_scopes is to support things like string interpolation by removing the string scope within the whole interpolation region, which normally includes opening and closing punctuation.
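A minimal sketch of that interpolation use case, assuming a made-up language with `"..."` strings and `${...}` interpolation (all scope names, context names, and the `source.example` base scope are illustrative, not taken from any shipped syntax):

```yaml
%YAML 1.2
---
name: Interpolation Sketch
scope: source.example

contexts:
  main:
    - match: '"'
      scope: punctuation.definition.string.begin.example
      push: string

  string:
    - meta_scope: string.quoted.double.example
    - match: '"'
      scope: punctuation.definition.string.end.example
      pop: true
    - match: '\$\{'
      scope: punctuation.section.interpolation.begin.example
      push: interpolation

  interpolation:
    # Remove the innermost scope name (the string scope) for this context.
    # Per the description above, this also covers the '${' that pushed it.
    - clear_scopes: 1
    - meta_scope: meta.interpolation.example
    - match: '\}'
      scope: punctuation.section.interpolation.end.example
      pop: true
```

Using `clear_scopes: 1` rather than `true` keeps the base `source.example` scope and drops only the string scope, which is usually what is wanted inside an interpolation.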

imbrish commented 3 years ago

I understand your point. The match that clears the scopes is included in the main context, so it clears the scopes of the main context. But when the pushed context pops, it should no longer apply, right?

But even if that's the expected behavior, then it is still inconsistent. My point is that it affects further matches in a weird way:

  1. If text is matched and assigned an additional scope, the main scope is not cleared. If text is matched without adding a scope, the main scope does get cleared.
  2. Line 3 in the example gets cleared, but not line 2, where nothing is matched and the parser just moves past it.
  3. If the example does not contain # in the first line when the syntax file is saved, the scopes from the other lines will never be cleared, even if # is added back. If # is present when the syntax is saved, the scopes will always be cleared, even if # is deleted.

deathaxe commented 3 years ago

I see what you mean.

The clear_scopes command is applied to the consuming part of \s*(?=#) even though the pattern does not fully match, and thus the context which contains clear_scopes is not pushed onto the stack. That command should therefore have no effect at all on lines other than line 1.

That's a bug.

I can still reproduce it on ST 4094, even with the sublime-syntax set to version: 2.
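For anyone retrying this, opting into version 2 is a one-line change in the syntax header. The sketch below shows only the header from the report with that key added; the contexts are left out and assumed unchanged.

```yaml
%YAML 1.2
---
name: 'Syntax Minimal'
scope: invalid.illegal
version: 2   # opt in to the version 2 syntax engine behavior (ST4)

# contexts: unchanged from the original report
```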

imbrish commented 3 years ago

I think the issue is even more involved than that. Note that what we match is actually ^\s*(?=#), with ^ in front, so it should only ever match at the start of a line. Yet it does not clear scopes only on the leading spaces; it clears scopes on anything that is subsequently matched, as long as no new scopes are added. You can try this by replacing \s* with -*: it still has the exact same effect.