emacs-csharp / csharp-mode

A major-mode for editing C# in emacs
GNU General Public License v3.0
155 stars 48 forks source link

Using tree-sitter in csharp-mode? #201

Closed theothornhill closed 3 years ago

theothornhill commented 3 years ago

I have been following emacs-tree-sitter development for a while, and after watching https://emacsconf.org/2020/talks/23/, I feel we at least should talk about this feature being incorporated into csharp-mode after https://github.com/emacs-csharp/csharp-mode/issues/200 and numerous other issues.

What would be the benefits?

  1. No reliance on CC Mode
  2. Highlighting would be easier to configure
  3. Indentation could be provided (I think?)
  4. More?

This would be yet another rewrite - yay! However, csharp-mode is in a kind of good place, in that what it provides is mostly indentation and highlighting, leaving most of the smarts to lsp-mode. This is obviously a wishlist feature.

CC @ubolonton (Just letting you know this issue exists)

josteink commented 3 years ago

There has been similar discussions for typescript-mode too, although nobody has stepped up to do the actual work :+1:

https://github.com/emacs-typescript/typescript.el/issues/4

theothornhill commented 3 years ago

There has been similar discussions for typescript-mode too, although nobody has stepped up to do the actual work ๐Ÿ‘

emacs-typescript/typescript.el#4

I'll look i to it ๐Ÿ˜„

theothornhill commented 3 years ago

I've started to look into this, and have created a branch - tree-sitter to develop this feature. I based it on https://github.com/ubolonton/emacs-tree-sitter/issues/70.

What I've done so far:

To try this you need to use my fork of emacs-tree-sitter or wait until https://github.com/ubolonton/emacs-tree-sitter/pull/79 is merged

I think the simplest transition would be to keep the indentation provided by CC Mode until we have some tree-sitter-indentation implementation to rely on.

Yet another rework! :smile:

For context: image

Edit: This is fun :smile: image

theothornhill commented 3 years ago

More progress: image

josteink commented 3 years ago

I just tried running the branch, but it seems I'm doing something wrong.

When trying to activate it, I'm not getting any highlighting at all, I just instead the following in my logs:

File mode specification error: (error No language registered for major mode โ€˜csharp-modeโ€™)

Probably an easy fix, or a quick one-liner I'm missing, but it's not currently runnable OOB.

Edit: After reading some docs and some of your PRs against emacs-tree-sitter I found the following to work:

;;;###autoload
(add-to-list 'tree-sitter-major-mode-language-alist '(csharp-mode . c_sharp))

Should I add it to our branch for now, or should we just assume it will be merged upstream soon enough?

josteink commented 3 years ago

After playing slightly with this code, I'm very much sold on this.

All we have to do is tell Emacs how we want that language to look inside our major-mode. And that's really all we should be focusing on. Perfect!

For instance, I took note that attributes are rendered differently with and without arguments:

image

I looked at the (to me!) completely alien major-mode definition, found this:

image

Noticed and fixed an inconsistency:

image

And voila!

image

This really couldn't be easier.

Compared to any other Emacs major-mode I have worked with, this was by far the simplest fix. I'm all in awe!

So:

This almost has me rooting for a similar solution for typescript-mode!

My only concern/regret about such an approach is the dependency on native executables, and what (few) platforms they currently exist for, compared to the wide array of platforms Emacs itself supports.

But again, I guess that can be solved upstream, and if the use of tree-sitter-mode in major-modes start increasing, there will be a natural push, by more people than just me, for better platform support (*BSD, ARM64, etc).

theothornhill commented 3 years ago

Great! I'm so glad to hear that!

My only concern/regret about such an approach is the dependency on native executables, and what (few) platforms they currently exist for, compared to the wide array of platforms Emacs itself supports.

But again, I guess that can be solved upstream, and if the use of tree-sitter-mode in major-modes start increasing, there will be a natural push, by more people than just me, for better platform support (*BSD, ARM64, etc).

Yeah, this is an issue, and we need to find a good solution for this.

Right now I've just hurled everything into one spot, as you can see. I think we should factor this one out in separate variables and then just ,@ everything into the proper variable. It should be super easy for people to find inconsistencies (like you did - I'm sure there are lots of them) and make simple prs. No more relying on monkey patches etc.

Yeah, tree sitter is amazing. I'm looking into tree-sitter-indent as we speak as well :)

josteink commented 3 years ago

Just sent in 2 patches to your branch! Attributes and namespaces.

This shit is simple!

theothornhill commented 3 years ago

Just fire away! We can extract stuff later.

Some things I've noted:

This is intentional. Also:

theothornhill commented 3 years ago

I opened an issue here: https://codeberg.org/FelipeLema/tree-sitter-indent.el/issues/1 in this repo: https://codeberg.org/FelipeLema/tree-sitter-indent.el/

I feel like I'm missing something here, but I have a proof of concept for indentation also ๐ŸŽ‰ . I hope @FelipeLema can help us out here. If we can iron out some bugs (or teach me how to read) I think we can ditch CC Mode pretty soon(tm)

theothornhill commented 3 years ago

Right now this code indents correctly when typing it out manually


namespace Foo
{
    public class Bar
    {
        public Baz Quux()
        {
            // ...
            var test = new Generic();
            var text = new Thing.Thang
            {
                a,
                b,
                c
            };
            yield return new InnerA.InnerB
            {
                Boo = "foo",
                May = "Yay"
            };

        }
    }
}

Marking the whole thing and pressing TAB yields some strange behavior.

However, note that the yield part indents correctly! It doesn't on master, and required huge hacks (if I recall correctly) from before rework. So we are already harvesting benefits here. From very limited late night hacking ๐Ÿช“

theothornhill commented 3 years ago

@josteink If you want, you can test the branch now both for indentation and coloring, and add things as you see fit. Most of the infrastructure should be usable, and we have the same "config-based" rules for indentation!

One small tip: If something doesn't indent properly, try making it syntactically valid first I.E add a semi after a brace. See the issue over at codeberg for more details ๐Ÿ‘

Most of the code in indentation-tests indents correctly as you type. It's still rough around the edges and needs to be cleaned up of course :) Just letting you know the status

josteink commented 3 years ago

Found this gem here when doing a fat finger:

image

Hopefully not too difficult to figure out :)

theothornhill commented 3 years ago

oOo!

My guess is that this is pretty hard :P We have no logic for dealing with that, now at least. I'll look into this :)

Side note: Lots of changes coming :)

theothornhill commented 3 years ago

I added back your interface code. It was a hassle with the defcustoms, much easier to iterate on highlighting when everything is in one place. Sorry for the back and forth. We will factor it out when we are satisfied in the end.

theothornhill commented 3 years ago

Good news! I think we have feature parity, both with indentation and highlighting! ๐Ÿคž

Caveat: You need https://codeberg.org/theo/tree-sitter-indent.el/ (fork of FelipeLemas thing) for the latest changes. They haven't been merged yet. However, it should be a simple eval-buffer to get that.

Now indentation-tests doesn't alter buffer when I use C-x h TAB. I think that is a huge step forward!

One thing to note, which is a little weird:

Consider this code:

namespace Foo
{
|                                     // This is how things are opened - point is '|'
}

If I start typing something, then we get the extra indentation. I'm thinking of ways to get tree-sitter-indent to do this immediately, but haven't figured it out yet. However, if you type something, then press RET we have correct indentation, curtesy of electric-indent

Why don't you test it a little? I'm a little eager to get this merged into csharp-mode, documenting how to adjust highlighting etc so that people can start to contribute. I do want it to be functional though.

(I'm also thinking of stealing the indent part - it's only a couple hundred LOC and making it C# specific. However, I think the community could need some debugging on that thing to make it super robust ๐Ÿ˜„ )

theothornhill commented 3 years ago

Found this gem here when doing a fat finger:

image

Hopefully not too difficult to figure out :)

I don't think this is solvable: https://github.com/tree-sitter/tree-sitter-c-sharp/blob/master/grammar.js#L1471

Roslyn doesn't seem to acknowledge scandinavia, I'm afraid

josteink commented 3 years ago

Given how that is legal, compilable code (even though bad form), that seems a bit odd. Oh well.

Good researching ๐Ÿ™‚

theothornhill commented 3 years ago

Ok I stole the tree-sitter-indent. Now it works OOTB. Will still contribute upstream, but it is faster to iterate this way :)

intrasight commented 3 years ago

Question: does this require installing tree-sitter?

https://ubolonton.github.io/emacs-tree-sitter/installation/

josteink commented 3 years ago

Question: does this require installing tree-sitter?

https://ubolonton.github.io/emacs-tree-sitter/installation/

I believe our dependency on Emacs-tree-sitter should ensure we get a bundled version of the actual tree-sitter binary for all "common" platforms.

What OS are you using?

theothornhill commented 3 years ago

You donโ€™t need any external tree sitter related packages. Everything should be included with the tree sitter branch.

However, we depend on ubolontons mode as well

theothornhill commented 3 years ago

Closing after 3cff337

seagle0128 commented 3 years ago

Sorry guys! I got this error with the latest version if I didn't install tree-sitter.

Error loading autoloads: (void-variable tree-sitter-major-mode-language-alist)
theothornhill commented 3 years ago

Do you get the error if you manually install tree-sitter first? If not, Iโ€™ll update readme to be clearer about this behavior. For now I donโ€™t want to introduce dependencies to csharp mode.

jcs090218 commented 3 years ago

@seagle0128 Are you sure you are installing csharp-mode with the latest version? I have opened this issue at #208 and it should resolved by #209. You should probably see the following error log if you don't have tree-sitter installed.

Error (bytecomp): Cannot open load file: No such file or directory, tree-sitter
seagle0128 commented 3 years ago

Thanks, @theothornhill @jcs090218 .

Now I got this message while installing csharp-mode.

Cannot open load file: No such file or directory, tree-sitter
jcs090218 commented 3 years ago

@seagle0128 Notice that tree-sitter support is experimental and this isn't an official yet. You may have to manually install package tree-sitter to resolve this issue.

By seeing csharp-mode.el it does not have Package-Requires specify.

seagle0128 commented 3 years ago

Gocha, glad to know. Thanks, @jcs090218 .