Interpretation of incomplete selectors is erratic

Let's define an incomplete selector as a CSS selector that the user is still typing and that isn't yet followed by a { … } block.

This is the current behavior of tree-sitter-css. In these examples, assume we’re in an otherwise empty file:

Bare tag name

div

stylesheet [0, 0] - [1, 0]
  ERROR [0, 0] - [0, 3]
    attribute_name [0, 0] - [0, 3]

Tag name with pseudoclass

div:foo

stylesheet [0, 0] - [1, 0]
  declaration [0, 0] - [0, 7]
    property_name [0, 0] - [0, 3]
    plain_value [0, 4] - [0, 7]
    MISSING ; [0, 7] - [0, 7]

Pseudoclass by itself

:foo

stylesheet [0, 0] - [1, 0]
  ERROR [0, 0] - [0, 4]
    pseudo_class_selector [0, 0] - [0, 4]
      class_name [0, 1] - [0, 4]

Attribute selector

div[foo]

stylesheet [0, 0] - [1, 0]
  ERROR [0, 0] - [0, 8]
    attribute_selector [0, 0] - [0, 8]
      tag_name [0, 0] - [0, 3]
      attribute_name [0, 4] - [0, 7]

For the most part, there’s a logic to these examples that I can reason through:

The last two examples are properly interpreted as selectors because they’re unambiguous in their context.
In the second example, I understand the ambiguity in theory, but I don’t think the parser should choose “property-value pair” as its first interpretation of div:foo. It makes more sense to me for the parser to interpret it as a selector until it has more information.
The first example’s output doesn’t make sense to me at all.

What’s tricky about these examples is that they all behave differently in an empty file — or at the very end of a file — than they do if the incomplete selector is being typed in the middle of an existing file…

div

div {}

…in which case the parser joins it to the following selector and parses it less ambiguously.

I found all this while trying to adapt Pulsar’s autocomplete-css, which has never worked properly with the Tree-sitter CSS grammar. If a user is typing a selector in a brand-new CSS file, we can’t accurately highlight certain selectors until the user types {. That’s not ideal, but understandable.

But it also means that autocompletion suggestions will be either missing or flat-out wrong. My wish is that tree-sitter-css be able to parse these incomplete selectors well enough for us to be able to use that output to offer accurate contextual completions.
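
To make the completion problem concrete, here is a minimal sketch of how an editor might pick a completion category from the node under the cursor. It is not taken from autocomplete-css; the function name and the category strings are hypothetical, and only the node type names come from the parse trees shown above.

```python
from typing import Optional

# Hypothetical lookup: which kind of completion to offer for a node type.
# The node type names match the tree-sitter-css output quoted in this issue;
# the category strings are made up for illustration.
_KIND_BY_NODE_TYPE = {
    "tag_name": "tag",
    "attribute_name": "attribute",
    "property_name": "property",
    "plain_value": "property-value",
}


def completion_kind(node_type: str, parent_type: str) -> Optional[str]:
    """Return a completion category for the node under the cursor, if any."""
    # class_name appears inside pseudo_class_selector in the trees above,
    # so the parent decides what it means here.
    if node_type == "class_name" and parent_type == "pseudo_class_selector":
        return "pseudo-class"
    return _KIND_BY_NODE_TYPE.get(node_type)


# Current output for a bare `div`: attribute_name inside ERROR.
print(completion_kind("attribute_name", "ERROR"))
# Current output for the `foo` in `div:foo`: plain_value inside declaration.
print(completion_kind("plain_value", "declaration"))
```

With the current trees, a lookup like this would offer attribute names after `div` and property values after `div:`, when the user is most likely looking for tag names and pseudo-classes. The same lookup would return "tag" for a tag_name node and "pseudo-class" for a class_name inside a pseudo_class_selector, which is why the shape of the error tree matters.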

My suggested fixes for the first two examples would be something like this:

Bare tag name

div

stylesheet [0, 0] - [1, 0]
  ERROR [0, 0] - [0, 3]
    tag_name [0, 0] - [0, 3]

Tag name with pseudoclass

div:foo

stylesheet [0, 0] - [1, 0]
  ERROR [0, 0] - [0, 7]
    pseudo_class_selector [0, 0] - [0, 7]
      tag_name [0, 0] - [0, 3]
      class_name [0, 4] - [0, 7]

In other words, the parser should assume that arbitrary text at the root of a document is a selector, even when it isn’t yet valid, until that text is unambiguously something else.

In the example of a user typing div:foo, I’d want the parser to assume it’s dealing with a CSS selector after every keystroke. If the user instead typed div: foo, I’d expect the parser to change its mind after the space character, and no earlier.
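
The rule I have in mind can be sketched outside the grammar as a plain string heuristic. This is only an illustration of the desired behavior, not how tree-sitter-css works or how a fix would have to be implemented; the function names and the regular expression are my own assumptions, and it ignores edge cases such as a semicolon inside a quoted attribute value.

```python
import re

# Assumed rule: an identifier, a colon, then whitespace can only be the start
# of a declaration, because a pseudo-class must follow its colon directly.
_DECLARATION_START = re.compile(r"^\s*-{0,2}[A-Za-z_][\w-]*\s*:\s")


def classify_root_text(text: str) -> str:
    """Guess how incomplete text at the stylesheet root should be read."""
    # A semicolon cannot appear in an (unquoted) selector, so treat it as
    # proof that we are looking at a declaration.
    if ";" in text or _DECLARATION_START.match(text):
        return "declaration"
    # Default: keep reading the text as a selector until proven otherwise.
    return "selector"


def classify_keystrokes(text: str) -> list:
    """Classify every prefix of `text`, as if it were typed one key at a time."""
    return [(text[:i], classify_root_text(text[:i])) for i in range(1, len(text) + 1)]


for prefix, kind in classify_keystrokes("div: foo"):
    print(repr(prefix), kind)
```

Run on `div: foo`, every prefix up to and including `div:` is classified as a selector, and the classification flips to a declaration at `div: `, the first prefix that includes the space. `div:foo` stays a selector throughout, and `display: block;` is still read as a declaration.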

I think this behavior could be introduced without breaking any existing CSS parsing — even in, for example, a case where someone wants to highlight

display: block;

as CSS on a blog post without having to place it inside a selector.

I might try to contribute a PR for this one day if I get more comfortable working with parsers.

tree-sitter / tree-sitter-css