lassik / emacs-format-all-the-code

Auto-format source code in many languages with one command
https://melpa.org/#/format-all
MIT License
601 stars 105 forks source link

Meta-issue about project-specific / buffer-local formatters #33

Open lassik opened 5 years ago

lassik commented 5 years ago

This issue keeps appearing in many different guises (#25, #29, #30) and also in other Emacs packages. There is clearly a lot of demand among Emacs users for for project-local executables and settings.

So Emacs finds executables using the exec-path variable, which is set to be the same as the PATH environment variable at Emacs startup. It can be changed later as often as you like.

The problem is, exec-path is a global variable. So if you have 200 buffers open in Emacs, they all must share the same PATH. This means that they will all use the same versions of code formatters and other tools (e.g. the command prettier will find the same version of Prettier in all buffers).

To switch paths and other settings when moving to a new project, there are environment managers for Unix shells. These have names like rbenv, pyenv and phpenv. A new tool, direnv, was made as a generic, language-agnostic version of the same thing.

There is an Emacs package, emacs-direnv, to set the exec-path and process-environment variables from the results of running direnv in a subshell. There are two problems with this. First, many people still use other environment managers like rbenv, pyenv and phpenv, as the direnv homepage acknowledges, so we have to support those as well. Second, though emacs-direnv does its best to keep exec-path and process-environment, up to date, those are global variables so they are going to get out of sync unless direnv is run every time you switch to a new buffer (i.e. basically all the time, often several times per minute). Other similar tools (emacs-rbenv, emacs-pyenv etc. -- do those exist?) would encounter this same problem as emacs-direnv.

I think that to solve the problem neatly we would need buffer-local versions of the exec-path and process-environment variables. They could be called e.g. local-exec-path and local-process-environment. (The existing global variables could also be made buffer-local and keep the same names they now have. The problem is, they have been global variables for decades, so something may break if they are made local. An Emacs guru, someone with much more experience than me, would have to think about this.) Then emacs-direnv, emacs-rbenv and the others could run just once when opening a file, and wouldn't need to run at every buffer switch. And they wouldn't accidentally disturb the settings of other buffers if you forget to run them.

I emailed Steve Purcell (who is a prominent Emacs guru, responsible for many/most MELPA code reviews) about this, but it seems he is taking a well-deserved vacation from Emacs. If you have ideas about other Emacs gurus to contact, please tell. Maybe we should just ask on the Emacs mailing list. But I'd rather hand off the project to someone else since I already have quite a lot on my plate and don't have enough Emacs experience to solve it well.


EDIT: For reference, here are links to all the aforementioned projects:

direnv

rbenv

pyenv

phpenv

wbolster commented 5 years ago

fwiw, direnv-mode takes care of most of these issues, without making variables buffer local.

(emacs-direnv author here)

lassik commented 5 years ago

Thanks for stopping by, I didn't realize you wrote it :) I read the code and you are doing a good job under the constraints set by Emacs, but you had to add a post-command-hook in order to make it work. It's a lot of processing to fork a sub-shell for each Emacs command. Plus we can expect that some people still want to keep using other environment managers. For these reasons it would be good if Emacs came with a standard way to set buffer-local paths and other envars. emacs-direnv, rbenv, pyenv, etc. could then hook into that system.

wbolster commented 5 years ago

i think there's a misunderstanding here. yes, emacs-direnv can optionally automatically update the environment all the time. it's called direnv-mode, and indeed, it uses a post-command-hook.

however, it will only run direnv (by spawning a subprocess) if the directory of the currently visited buffer changes. in practice, that means this will only happen when changing buffers, which is not even close to ‘forking a sub-shell for each emacs command’, as you wrote. so in real-world scenarios the overhead is negligble except when switching buffers. conceptually it's exactly the same as typing cd in a shell with direnv enabled (outside emacs).

also, direnv works well with things like pyenv. a tool like pyenv can manage multiple python installations on a single system. direnv can simply configure which one to use, which is orthogonal, and not overlapping behaviour.

lassik commented 5 years ago

I stand corrected. Sorry that I misrepresented your project. I looked though the code in too much of a hurry and did not realize that it only runs a shell when the directory changes.

It's great that direnv can work together with pyenv and rbenv. However, my point was that if people want to use only one of those then I (as an author of a general purpose Emacs package) am not really in a position to tell them to use something else (no matter what their reasons are for using whatever they use - often the reason is simple familiarity and overwhelm from too many choices, and it's a very understandable part of modern life). I don't think we disagree here, we were just talking about different things :)

What would be great is an Emacs-wide solution that works for all of direnv, pyenv, rbenv as well as unrelated packages like format-all that just want to run commands.

wbolster commented 5 years ago

basically you would want some "pre-exec" hook to change the environment (mostly PATH).

direnv comes close but is not exactly the same. i argue it's the better option though: the price for figuring out the env is paid once per buffer switch, not once per executed program.

lassik commented 5 years ago

The thing I'm trying to get at is that there are two different kinds of actors here:

You could call these producers and consumers of environments. The thing would be to invent some kind of meeting point in the middle - some kind of variable that the producers would write and the consumers would read. People could then mix whichever producers and consumers they like.

lassik commented 5 years ago

Hey, there was a long thread about this on the official emacs-devel mailing list in July 2016: Managing environments (Python venv, guix environment, etc.)

lassik commented 5 years ago

That thread was very informative but never came to a resolution (I asked one of the Emacs devs and no concrete solution appears to be in the works currently). I'll start a new thread on emacs-devel about it as soon as I can find the time (swamped with other Lisp/Scheme tasks currently). Unless someone beats me to it - once again I'm no better qualified than anyone else to do it :)

strayer commented 5 years ago

FWIW I can say that poetry (and possibly pipenv) works great in combination with direnv.

I followed this documentation to make direnv set the virtualenv managed by poetry automatically: https://github.com/direnv/direnv/wiki/Python#poetry

Basic steps:

  1. add the layout function mentioned in the link above to ~/.direnvrc
  2. echo layout_poetry > .envrc in the project dir (& run direnv allow, obviously)
  3. restart emacs to get updated buffers
  4. enjoy black-formatted Python files with locally installed black without any hacks! 🎉
lassik commented 5 years ago

@purcell mentioned that he is experimenting with some lightweight Emacs-wide solution to project-local executables but I don't know how far along it is.

purcell commented 5 years ago

Yup, I've been on vacation and have just started a new job, so it's been slow progress, but the good news is that I'll be working a lot with Nix, which is going to motivate me to work on this.

lassik commented 5 years ago

Take as much time as you need Steve. Emacs maintenance and development largely rests on the efforts of people who are more or less fatigued by the work after a while. It's important that we stay well rested and have plenty of time for other things.

purcell commented 5 years ago

Thanks, and yes, that's very true. I've largely come to peace with the fact that I'm constantly a few weeks behind with MELPA work, for example, and it has been helpful to find that people are mostly quite understanding.

lassik commented 5 years ago

It's the old maxim: good, fast, cheap - pick any two. Since we pick good and cheap, things are slow :) Some open source projects pick fast and cheap, and that's a decent trade-off for many people. It's not possible to have all three without either that push of initial excitement about a new project or a professionally staffed organization. Maybe it's best that we accept that editors are tools for getting other work done, and it's a sign of good health that most of us eventually focus on that other work. There are always going to be a small handful of people who desire to work mostly on editors all their life but their numbers are too small to sustain a community.

venikx commented 3 years ago

Linking to a comment of mine on the doom-emacs repository, which is hopefully somewhat helpful: https://github.com/hlissner/doom-emacs/issues/3900#issuecomment-769474189.

It seems that when using https://github.com/purcell/envrc to load a nodejs nix environment format-all thrown errors, because it can't find "node". So I'm wondering when and in what context the installation (see snippet below) of eslint happens, since it doesn't seem to respect environment which gets loaded by the direnv, while using prettier-js seems to work fine.

https://github.com/lassik/emacs-format-all-the-code/blob/498cf73ddcf9f3c8b9400b16a5ee1272fb95e326/format-all.el#L745

purcell commented 3 years ago

The doom issue is probably related to https://github.com/purcell/envrc/issues/12

Essentially, with envrc-mode you have buffer-local values for process-environment and exec-path that reflect the user's preferences, but I expect that format-all uses call-process in a fresh buffer created by with-temp-buffer or similar, and those buffer-local values will not be inherited by the new buffer. I found this to be quite a counter-intuitive behaviour on the part of Emacs.

The ideal solution is for format-all to explicitly propagate the user's desired buffer-local values. I arranged for magit to be patched in this manner, for example: see magit--with-temp-process-buffer. This isn't particularly specific to envrc, by the way: it's quite common in other libraries to have to explicitly copy such buffer-local variables into new buffers so they can take effect as expected, and I've patched a couple of other libraries accordingly (intero for haskell comes to mind).

In the Doom issue I mentioned a user-level workaround, but obviously that's less ideal than tweaking format-all itself.

purcell commented 3 years ago

(I'd have to debug things to know whether my guess above is actually what's going wrong.)

lassik commented 3 years ago

I expect that format-all uses call-process in a fresh buffer created by with-temp-buffer or similar

Yes, that's exactly what it does - call-process-region specifically. Code link

Essentially, with envrc-mode you have buffer-local values for process-environment and exec-path that reflect the user's preferences [...]

The ideal solution is for format-all to explicitly propagate the user's desired buffer-local values. I arranged for magit to be patched in this manner, for example: see magit--with-temp-process-buffer.

I haven't patched process-environment and exec-path because Emacs doesn't make them buffer-local variables in the default install, so though it seems to work fine, I assumed something arcane might break in unexpected ways.

If magit overrides them, that's a strong endorsement that it works well, and I'm willing to patch them in format-all as well.

This isn't particularly specific to envrc, by the way: it's quite common in other libraries to have to explicitly copy such buffer-local variables into new buffers so they can take effect as expected, and I've patched a couple of other libraries accordingly (intero for haskell comes to mind).

If it's this common, should we have a small utility library that implements these features correctly? Called localenv or something. It could provide localenv--with-temp-buffer and localenv--call-process and the like, which are drop-in replacements for those standard ELisp functions.

lassik commented 3 years ago

@venikx format-all doesn't install any formatters itself; the installation commands are just hints shown to users in case a formatter is not found.

venikx commented 3 years ago

@lassik Oh, that's why I couldn't really figure out where it was calling the installation. I can work around it for now, thanks to the snippet from @purcell I don't need to rely on prettier-js and can just use format-all.

purcell commented 3 years ago

If it's this common, should we have a small utility library that implements these features correctly? Called localenv or something. It could provide localenv--with-temp-buffer and localenv--call-process and the like, which are drop-in replacements for those standard ELisp functions.

That's probably overfactoring, but potentially yes, for with-temp-buffer at least. Note that localenv--call-process is not something you could provide, because call-process is often called from a fresh buffer, so by that point you've already lost any connection to the original buffer.

lassik commented 3 years ago

Good point.

If process-environment and exec-path are the most important variables to inherit from the source buffer into the temp buffer, I can make format-all do that. If you or someone else ever gets inspired to make a helper package to handle this stuff, I can switch to it.

purcell commented 3 years ago

If you or someone else ever gets inspired to make a helper package to handle this stuff, I can switch to it.

I ended up writing inheritenv for this, should be on MELPA shortly. https://github.com/purcell/inheritenv

lassik commented 3 years ago

@purcell Thank you for getting to it! format-all now uses it.

A related issue is local vs remote processes (I believe Tramp does some magic to automatically call remote processes from buffers that edit remote files, which may or may not be what one wants).