atom / language-csharp

C# language support for Atom
Other
62 stars 52 forks source link

Use PCRE-compatible CSON grammar #116

Closed worldbeater closed 6 years ago

worldbeater commented 6 years ago

We finally did this! This CSON grammar is PCRE compatible and so can be used both in Atom and in GitHub web interfaces. An old JSON-based grammar is less readable and less maintainable, so I decided to switch to CSON. Sources were taken from here: https://github.com/dotnet/csharp-tmLanguage/tree/master/grammars

GitHub's highlighter was not highlighting source code correctly while PCRE does not support special characters in named group's names, such as a hyphen. So I got rid of them all using a simple Python script.

This pull request closes these issues while GitHub highlighting now works as expected: https://github.com/atom/language-csharp/issues/114 https://github.com/atom/language-csharp/issues/113 https://github.com/atom/language-csharp/issues/112

Special thanks to @hacklex for a great help with figuring out what was causing RegEx compatibility issues. It would be almost impossible to implement all these things without his assistance.

Thanks!

worldbeater commented 6 years ago

Here is how the highlighting works now. All bugs are gone, types and method calls get highlighted properly. Maybe some color tweaks will be needed, but anyway it looks great now!

image

damieng commented 6 years ago

That's great! Thanks for this.

I need a day or two to figure out how to get the atom and github versions to co-exist nicely in this repo - right now what you did will make github.com work but will break our Atom text editor as it isn't PCRE based and also uses this repo.

worldbeater commented 6 years ago

@damieng I suppose this grammar will also play well with Atom editor. The conversion we've made can be described as follows:

  1. JSON grammar was replaced with CSON one to improve readability. CSON supports line breaks, spacing and is less verbose. Javascript grammar uses CSON, so both GitHub and Atom should be fine with it https://github.com/atom/language-javascript
  2. Hyphens in named groups were completely removed. For example this: (?<type-args>\\s*<(?:[^<>]|\\g<type-args>)+>\\s*)? Became this: (?<typeargs>\\s*<(?:[^<>]|\\g<typeargs>)+>\\s*)?

And it worked. Perhaps, Oniguruma should also support this grammar, as named groups syntax is the same, but just without special characters in group names.

worldbeater commented 6 years ago

Just tested the new grammar in Atom editor by using atom --dev and apm develop language-csharp. I simply replaced an old JSON grammar with a newer one that does not contain hyphens in group names and that uses CSON format. This is the output:

image

This means an old grammar can be simply replaced with a newer one from this pull request and both GitHub and Atom highlighters would work as expected, no need for extra repositories!

damieng commented 6 years ago

Awesome! Thanks for this.

On Tue, Apr 17, 2018 at 9:57 AM Artyom notifications@github.com wrote:

Just tested the new grammar in Atom editor by using atom --dev and apm develop language-csharp. I simply replaced an old JSON grammar with a newer one that does not contain hyphens in group names and that uses CSON format. This is the output:

[image: image] https://user-images.githubusercontent.com/6759207/38884785-4b0cf120-4279-11e8-88da-e13737b51c33.png

This means an old grammar can be simply replaced with a newer one from this pull request and both GitHub and Atom highlighters would work as expected, no need for extra repositories!

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/atom/language-csharp/pull/116#issuecomment-382066567, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHQp_iSFkeiE_472CD6fow9u4LxLW_cks5tph7ygaJpZM4TXP8c .

ghost commented 6 years ago

@worldbeater, thank you! Would it make sense that we add this script to upstream if @DustinCampbell and @damieng agree as well, so there is one source of truth or point of failure? :)

damieng commented 6 years ago

I was going to suggest to Dustin that perhaps we just run it one-time upstream to convert the names to not have dashes and maintain it like that going forward so it continues to work for github.com as well as all the other editors.

The regex named groups are purely internal so it shouldn't have any affect on anyone else.

damieng commented 6 years ago

I'm still seeing some issues with linguist - specifically things that should be entity names are coming through as keywords. Did you base the grammar on the one here in the repo or one upstream?

For comparison:

Linguist

image

Atom

image

damieng commented 6 years ago

Okay, I guess you took from upstream - I had to apply our two lines of changes we need on top to get the entity name highlighting working.

Lightshow now showing correctly :)

image

worldbeater commented 6 years ago

@damieng very nice! The only thing left that bothers me is that for function or constructor parameters type names and variable names are highlighted using the same color now.

image

Is there anything we can do with this? Maybe for parameter names black color can be used, not sure exactly. But also I don't think this is very important :)

damieng commented 6 years ago

I'll take a look at that tonight. I think I know what's causing it and it will be another tweak on our side.

worldbeater commented 6 years ago

Maybe this can help. There are four lines in this grammar that are responsible for parameters highlighting. Each line uses entity.name.variable.parameter.cs name: method parameters, named arguments, anonymous method expressions and lambda parameters. In language-java repository they use variable.parameter.java, so we can use variable.parameter.cs. I'll test it to see what will happen, please hold on.

worldbeater commented 6 years ago

Well, if we use language-java approach, we will get this fancy highlighting. I'll make a PR for this soon, hope this could help you. Thanks :) image UPD: Opened a PR. So the only thing left is making fields highlighting using black color or so. UPD2: Now it uses light orange color for variables, arguments and private fields. UPD3: Finally we decided to use black color for vars and fields.