ionide / ionide-vscode-fsharp

VS Code plugin for F# development
http://ionide.io
MIT License
860 stars 279 forks source link

Non consistent syntax highlighting when deconstructing a record in a function argument #1527

Open MangelMaxime opened 3 years ago

MangelMaxime commented 3 years ago

Describe the bug

When deconstructing a record in a function argument, the color of {, }, ; and = are not consistent.

image

Code for reproduction

type Repro =
    {
        PropA : int
    }

let func1 ({ PropA = propA } : Repro) = 
    printfn "%A" propA

Machine info

open-collective-bot[bot] commented 3 years ago

Hey @MangelMaxime :wave:,

Thanks for backing our project. If possible, We will handle your issue with priority support. To make sure we don't forget how special you are, we added a backer label to your issue.

Thanks again for backing us :tada:!

baronfel commented 3 years ago

To be 100% clear for future-selves, the complaint is that the braces and semicolon are light blue here inside the function parameter deconstruction instead of the same dark blue that they have in the record declaration?

MangelMaxime commented 3 years ago

Yes, you got it right.

On the next picture, I underlined in red the symbol I am referring to.

116430661-565cd180-a847-11eb-8905-17e2d13988fa

baronfel commented 3 years ago

This is because of the textmate/semantic highlighting tokens reported in each case.

For the definition, there is no semantic token for the brace characters, and so the textmate scope for keyword applies. For the deconstruction, there is a semantic token of variable applied here, which then applies the brighter token.

Should look into further into what's going on here.

The scopes returned from FSAC (decoded) for your sample code are

Tokens! ``` [|({ Start = { Line = 0 Character = 0 } End = { Line = 0 Character = 0 } }, Namespace, 0); ({ Start = { Line = 0 Character = 0 } End = { Line = 0 Character = 0 } }, Namespace, 0); ({ Start = { Line = 0 Character = 5 } End = { Line = 0 Character = 10 } }, Type, 0); ({ Start = { Line = 2 Character = 8 } End = { Line = 2 Character = 13 } }, Property, Readonly); ({ Start = { Line = 2 Character = 16 } End = { Line = 2 Character = 19 } }, Struct, 0); ({ Start = { Line = 5 Character = 4 } End = { Line = 5 Character = 9 } }, Function, 0); ({ Start = { Line = 5 Character = 11 } End = { Line = 5 Character = 13 } }, Variable, 0); ({ Start = { Line = 5 Character = 13 } End = { Line = 5 Character = 18 } }, Property, Readonly); ({ Start = { Line = 5 Character = 18 } End = { Line = 5 Character = 21 } }, Variable, 0); ({ Start = { Line = 5 Character = 21 } End = { Line = 5 Character = 26 } }, Variable, 0); ({ Start = { Line = 5 Character = 26 } End = { Line = 5 Character = 28 } }, Variable, 0); ({ Start = { Line = 5 Character = 31 } End = { Line = 5 Character = 36 } }, Type, 0); ({ Start = { Line = 6 Character = 4 } End = { Line = 6 Character = 11 } }, Function, 0); ({ Start = { Line = 6 Character = 13 } End = { Line = 6 Character = 15 } }, Regexp, 0); ({ Start = { Line = 6 Character = 17 } End = { Line = 6 Character = 22 } }, Variable, 0)|] ```

The root of the problem is that the FCS range for this specific area is:

(6,11-6,28) "script.fsx" -> "{ PropA = propA }"

and it's tagged with the type 'value'. This is perhaps a bit broad, but we don't really get anything in terms of data structures to modify any logic with. We may need to change the ranges reported by FCS to include more things like keywords, or to be more precise for the case of deconstructed records in general.

MangelMaxime commented 3 years ago

Thinking aloud

Hum, but if the whole range is "{ PropA = propA }" I wonder why we don't have everything the same color.

Here we have some light blue, blue, yellow.

I didn't dig into how semantic works yet, I will need to learn more about it and it's relation with the textmate grammar.

baronfel commented 3 years ago

I wonder why we don't have everything the same color.

This is because of the additional processing I mentioned as step 2 in this comment. A bit of background on LSP semantic highlighting:

It relies on

The LSP spec is picky here: technically the spec allows for clients to support token spans with overlapping or multiline ranges, which would be what would happen if FSAC naively mapped and returned the ranges that we get for this code directly from FCS, namely we get the 'full' range of the record deconstruction in (6,11-6,28), but we also get ranges for the property name (6,13-6,18) and the variable mapping. This is the case in the list of tokens I provided in the example above.

The LSP spec allows for a client to say that they support these ranges, but VSCode does not. Therefore, FSAC reverts to the lowest-common-denominator: we attempt to deduplicate these ranges so that there are no nesting or overlaps. We do this through an algorithm that walks each range and finds the other ranges that this 'parent' range would contain partially or entirely within itself, and we slice out the overlapping portions from the parent range, so that each parent range becomes 1 or more sub-ranges. After this operation there should be no overlaps.

This is what you're seeing above returned from FSAC for the overall variable deconstruction:

 ({ Start = { Line = 5
                 Character = 11 }
       End = { Line = 5
               Character = 13 } }, Variable, 0);
    ({ Start = { Line = 5
                 Character = 13 }
       End = { Line = 5
               Character = 18 } }, Property, Readonly);
    ({ Start = { Line = 5
                 Character = 18 }
       End = { Line = 5
               Character = 21 } }, Variable, 0);
    ({ Start = { Line = 5
                 Character = 21 }
       End = { Line = 5
               Character = 26 } }, Variable, 0);
    ({ Start = { Line = 5
                 Character = 26 }
       End = { Line = 5
               Character = 28 } }, Variable, 0);

These ranges were derived from taking the entire deconstruction range, which was tagged as a Variable, and chunking out the overlapping sections of it.

This is why solving this will be hard: the parent range reported by FCS includes the brace syntax inside of it, but we aren't also given inner/child ranges that tag that syntax. The structure API overall doesn't really return ranges for keywords/syntax tokens like that that we could use to make it a 100% replacement for the textmate grammar, which is why we still have both. It could definitely be expanded to do that, though!