purcell / emacs-reformatter

Define commands which run reformatters on the current Emacs buffer
259 stars 21 forks source link

Perform a less invasive buffer content replacement operation #24

Open jimeh opened 4 years ago

jimeh commented 4 years ago

Borrowed from go-mode's gofmt and releated functions: https://github.com/dominikh/go-mode.el/blob/734d5232455ffde088021ea5908849ac570e890f/go-mode.el#L1850-L1956

Hence I'm not the original author of most this code, but I have used a number of borrowed variations of it over the years for hand-rolled formatting solutions.

It works by passing the current buffer content to diff via STDIN to compare it against the target file on disk producing a RCS formatted diff that is easy to parse. It then iterates through the said diff only modifying relevant lines to make the buffer be identical to the target file.

In my experience it does a much better job of maintaining cursor position compared to insert-file-contents, as it only modifies lines that got corrected, and leaves all other lines in the buffer untouched.

Performance is also excellent and near instant. On my late-2016 13-inch MacBook Pro with a Dual-Core i7 at 3.3Ghz, I can't really tell the difference between this solution and insert-file-contents.

I did also test it against the Haskell example from: https://github.com/purcell/reformatter.el/issues/19

Replacing buffer content of Req.hs with that of Req-reformatted.hs is most of the time just as fast as insert-file-contents (around 50ms), and only sometimes a little slower of around 100-150ms based on my totally unscientific measurement technique of guesstimating it. I find that rather impressive as the RCS-formatted diff itself is 17KB.

I also tested the old replace-buffer-contents-based variant to see how it performs on my machine. It took around 15 seconds.

purcell commented 4 years ago

Thanks! Yes, I've seen various incarnations of this RCS diff code in my elisp travels.

Any idea why this is so much faster than replace-buffer-contents? There must be a catch, presumably related to its ability to retain buffer context.

I'm not averse to applying something like this, but I'm also not super keen to vouch for the code's correctness or maintain it in perpetuity. 😃

wyuenho commented 4 years ago

Any idea why this is so much faster than replace-buffer-contents?

Maybe because the major mode doesn't have to reapply font lock to the entire file, imenu doesn't have to reindex and a whole bunch of other file-wide operations can be skipped?

purcell commented 4 years ago

@wyuenho I suspect from this comment that you are unfamiliar with replace-buffer-contents, which is a new function that explicitly aims to minimise such unnecessary modifications in the buffer while replacing its contents.

jimeh commented 4 years ago

@purcell I'm afraid I don't why replace-buffer-contents is so slow, I did have a quick look at the source code, but I know almost no C. If I had to guess though, maybe replace-buffer-contents tries to do as small modifications as possible, leading to tens of thousands of operations for Req.hs. While the RCS-diff approach ends up just operating on whole lines, yielding a lot fewer operations.

I obviously didn't write it, but I've used a variant of this in my own rubocopfmt.el for over three years now without any real issues, except the cursor being moved to the beginning of the current line if the current line is corrected. However it looks like that was fixed in go-mode two years ago, and is hence included in this PR.

Obviously a "works for me" isn't exactly the greatest vote of confidence ever, but it's better than nothing, and hopefully counts for something :)

As for maintaining it moving forward, I'd say it's probably fine to just re-borrow it from go-mode if anything changes over there. That's ignoring or assuming there's no licensing issues though.

Stability wise it looks pretty good too. Doing a git blame, the relevant functions have remained mostly unchanged for 7 years, with a minor fix 2 years ago:

purcell commented 4 years ago

Hey, so I've been spending a bit of time doing some design/planning for reformatter.el, and I'm revisiting this PR in that context. Here's what I'm thinking:

I feel like this route would give us the definitely-useful additional functionality needed for programs that output diffs, via the first step, while making it easy to add and test the second part. What do you think?

jimeh commented 4 years ago

@purcell Both of those changes sound good to me. Let me know if there's anything I can help with :)

lassik commented 3 years ago

We need the same thing in https://github.com/lassik/emacs-format-all-the-code/pull/124. What to do? It doesn't make sense for lots of Emacs packages to all carry their own RCS-patching and buffer-replacing code.

format-all and reformatter could possibly be refactored so that format-all definitions are written using reformatter, but that's a much bigger project. I think format-all supports some formatter quirks that are not yet in reformatter.

purcell commented 3 years ago

I'm still keen to move forward with this, but I don't have a personal use case and I've been very busy recently, so my time for open source has been extremely limited.

@lassik It's possible that making reformatter--do-region more of a public end point with stable and well-documented options could fit the bill for implementing format-all on top of reformatter.

lassik commented 3 years ago

See also https://github.com/raxod502/apheleia

purcell commented 3 years ago

I feel like a good plan might be to extract the rcs patch code from apheleia into a separate package we could all use. Perhaps @raxod502 would be up for splitting it out, to avoid us needing to do it unilaterally?

lassik commented 3 years ago

From a conceptual standpoint, shouldn't reformatter do what apheleia does, and then format-all be written on top of the new reformatter? To be sure, the practicalities take a bit of work.

lassik commented 3 years ago

live-preview has some code to replace a buffer's contents that has nothing to do with code formatters. So we should have a generic replace-buffer package.

On top of that, it would be great if apheleia and reformatter could be merged, and format-all could be based on that.

raxod502 commented 3 years ago

No objection to merging these various formatting packages, or to extracting common code into a library. I do not think I have the bandwidth to do so myself, but will certainly advise any such effort and will be happy to merge pull requests to Apheleia that may be necessary to support the work.

raxod502 commented 2 years ago

No objection to merging these various formatting packages

See also https://github.com/lassik/emacs-format-all-the-code/issues/170, by the way. (I have more time these days, and am revisiting the idea.)