beancount / beancount-mode

Emacs major-mode to work with Beancount ledger files
GNU General Public License v3.0
108 stars 32 forks source link

Feature Request - tree-sitter support #33

Open peterhoeg opened 1 year ago

peterhoeg commented 1 year ago

There is already tree-sitter grammar for beancount, so it would be great to get support for that. Whether or not that will be a supplementary mode to the existing regexp based mode doesn't really matter that much, but there seems to be a lot of momentum behind tree-sitter including in emacs upstream.

Ref: https://github.com/polarmutex/tree-sitter-beancount

dnicolodi commented 1 year ago

Why is this a feature request? Which feature would be enabled by switching the implementation of beancount-mode to use tree-sitter? Emacs with native support for tree-sitter has not been released yet and kinks in the tree-sitter support are still been ironed out. From the look of it so far, a beancount-mode based on tree-sitter would be implemented alongside the regexp based one as making the two implementation coexist would be too complex for very little gain.

A while ago, I had a look at Beancount grammar for tree-sitter you link to, and I was not positively impressed, see posts on the Beancount mailing list: the main issues are that it lacks support for understanding indentation, thus it cannot correctly parse the Beancount syntax, and the structure of the generated syntax tree is too close to the Yacc grammar it was derived from instead than offering logical nodes.

Before knowing of that effort I also wrote a tree-sitter implementation of the Beancount grammar. If there is going to be work in this repository to write an Emacs beancount-mode based on tree-sitter, it will almost certainly be based on that.

peterhoeg commented 1 year ago

Which feature would be enabled by switching the implementation of beancount-mode to use tree-sitter?

It isn’t really about specific features per se - at least not at this point.

It’s more about the fact that this is the direction in which emacs is moving and all the tooling around it due to increased semantic understanding.

It also means we would be getting folding support as well as text/movement operations for free (collapse transaction, next transaction, delete transaction).

And of course all the hard work that goes into the beancount grammar regardless of the editor of choice, means that we get to share with everyone else.

Emacs with native support for tree-sitter has not been released yet and kinks in the tree-sitter support are still been ironed out.

Sure, upstream emacs doesn’t support it yet, but there is absolutely already support for tree-sitter: https://github.com/emacs-tree-sitter/elisp-tree-sitter

From the look of it so far, a beancount-mode based on tree-sitter would be implemented alongside the regexp based one as making the two implementation coexist would be too complex for very little gain.

This is the same direction taken by the main modes that will be distributed with emacs 29+ - that makes perfect sense.

A while ago, I had a look at Beancount grammar for tree-sitter you link to, and I was not positively impressed, see posts on the Beancount mailing list: the main issues are that it lacks support for understanding indentation, thus it cannot correctly parse the Beancount syntax, and the structure of the generated syntax tree is too close to the Yacc grammar it was derived from instead than offering logical nodes.

I had a look through the list - thanks for the pointer.

Before knowing of that effort I also wrote a tree-sitter implementation of the Beancount grammar. If there is going to be work in this repository to write an Emacs beancount-mode based on tree-sitter, it will almost certainly be based on that.

Sure. I understand that there is some friction between the 2 competing implementations but I hope that can be worked out. The community will definitely benefit from having a single, officially “blessed” implementation.

dnicolodi commented 1 year ago

Do you realize that this sounds like you are asking people to do "hard work" (your words) for free because you decided that it is cool to use tree-sitter?

It also means we would be getting folding support as well as text/movement operations for free

You would get it for free only because someone puts in the hard work of coding it for you. All these functionalities are already available in the current beancount mode, and they are equivalently free to you as someone else already coded them.

Sure, upstream emacs doesn’t support it yet, but there is absolutely already support for tree-sitter:

So this hypothetical someone rewriting beancount-mode using a tree-sitter grammar, not only needs to do the work once targeting the emacs in-tree support for tree-sitter, but actually twice, also supporting the out-of-tree library.

I understand that there is some friction between the 2 competing implementations but I hope that can be worked out.

I don't understand why you describe providing constructive criticism (to which there has been no reply in the merit of the technical deficiencies highlighted) as "friction".

The community will definitely benefit from having a single, officially “blessed” implementation.

Why?

peterhoeg commented 1 year ago

Do you realize that this sounds like you are asking people to do “hard work” (your words) for free

Well, I know what I meant and that is definitely not it. I’m neither asking nor telling anyone what to do.

because you decided that it is cool to use tree-sitter?

While I fully admit that I find the work around recovering from syntax errors in TS very fascinating, “cool” has very little to do with this particular request.

It also means we would be getting folding support as well as text/movement operations for free

You would get it for free only because someone puts in the hard work of coding it for you.

Nobody is doing it “for me”.

When I say we get something for free, I mean that we (as a community) benefit from the overall improvement in tooling that can happen as a consequence of moving from a more simplistic understanding of the source (syntax only) to a semantic understanding. Linters and formatters are being built right now that also base themselves on TS. Source forges are building syntax highlighting based on TS.

It’s similar to the situation with LSP, where a single implementation is enough and every editor doesn’t have to reinvent the wheel.

All these functionalities are already available in the current beancount mode, and they are equivalently free to you as someone else already coded them.

Sure, but there are also existing issues presumably as a consequence of the existing regex based solution: https://github.com/beancount/beancount-mode/issues/3

I am not saying this will magically go away with a TS-based mode of course, but we (all users of editors with TS support) will benefit from improvements made to the beancount grammar.

Sure, upstream emacs doesn’t support it yet, but there is absolutely already support for tree-sitter:

So this hypothetical someone rewriting beancount-mode using a tree-sitter grammar, not only needs to do the work once targeting the emacs in-tree support for tree-sitter, but actually twice, also supporting the out-of-tree library.

First of all, nobody “needs” to do anything.

I have not looked at both implementations so I cannot comment on the amount of work that would go into supporting both. There may be a large overlap and there may not be. But it is of course also perfectly valid to say, “as a project we don’t have the resources to accommodate both, so we wait for emacs 29 to be released before committing to that (or anything for that matter)”.

I don’t understand why you describe providing constructive criticism (to which there has been no reply in the merit of the technical deficiencies highlighted) as “friction”.

I’m not a native English speaker so that may have been a poor choice of words. What I intended to communicate with “friction” was exactly that it seemed to have gone beyond merely a technical discussion and into personal territory.

The community will definitely benefit from having a single, officially “blessed” implementation.

Why?

To cut down on the duplicated efforts. One of the ideas behind pushing for TS is that a single piece of work is usable with huge range of editors and supplementary tools. Duplicating efforts doesn’t seem like it would benefit anyone.

But maybe I should have phrased this initial request as a question instead: What are the plans (if any) for benefiting from TS in beancount-mode? Is this even a goal and if so does anyone have plan for getting there?

dnicolodi commented 1 year ago

Well, I know what I meant and that is definitely not it. I’m neither asking nor telling anyone what to do.

You decided to title this issue "Feature Request". Any dictionary would tell you that a request is "the act or an instance of asking for something". Therefore, you are asking the maintainer of beancount-mode to do some work, and nowhere you offered to contribute to that work.

What I intended to communicate with “friction” was exactly that it seemed to have gone beyond merely a technical discussion and into personal territory.

Can you please provide a pointer to what you are referring to exactly? I don't remember any element of the discussion that was not purely on the technical merit of the two implementations. Maybe I missed it.

dnicolodi commented 1 year ago

Sure, but there are also existing issues presumably as a consequence of the existing regex based solution: #3

This is an issue in some code that is kept only because @blais, the original author of beancount-mode, may be still relying on it. That old code indeed uses some simple regular expressions that do not handle that well some corner cases. The functionalities that all other users of beancount-mode should use have no know issues of that kind. I've had on my todo list to rewrite these old functions in terms of the improved ones since a long time.

dnicolodi commented 1 year ago

While I fully admit that I find the work around recovering from syntax errors in TS very fascinating, “cool” has very little to do with this particular request.

As someone that does not contribute to the development of beancount-mode, why do you care if it is implemented using regular expressions, or a tree-sitter grammar, or anything else? I already asked you why this is issue is formulated as a feature request and you answered

It isn’t really about specific features per se - at least not at this point. It’s more about the fact that this is the direction in which emacs is moving and all the tooling around it due to increased semantic understanding.

So, it appears that the only motivation is that tree-sitter is "the direction in which emacs is moving", or in other words this is what is the current trend in Emacs, or in only one word, this is what is "cool" in the Emacs world right now. Again, a dictionary would tell you that the only meaning of "cool" that makes sense in this context is "conforming to the custom, fashion, or established mode".

dustinfarris commented 1 year ago

This is one ugly thread 👎

blais commented 2 months ago

I don't have an opinion that matters much on this topic (I try to spend the minimum amount of time on beancount-mode honestly), but I think what could be done is that if someone really wants this, sending an implementation based on an LSP that we could add would be welcome here. It's a large change, and I think a good starting point would be a separate file in this repo.