DevShell native pre commit hooks

blaggacao commented 4 years ago

My first intuition was to claim a tight integration with pre-commit would be the way to go. Sleeping over it I think differently, hence this very issue comes to be. It proposes to consider the possibility of a shortcut and aspire a clear cut, simple and clean re-implementation.

pre-commit essentially does three things:

package tools
implement git hooks
manage workdir (thanks to @roberth for pointing this out)

Alarm! Doesn't nix package tools? — And is arguably better at doing so?!? Since, we are left with git hooks and workdir management. A typical pre-commit pre-push hook is relatively straight forward:

pre-push

```python #!/nix/store/n8nviwmllwqv0fjsar8v8k8gjap1vhcw-python3-3.7.6/bin/python3 """File generated by pre-commit: https://pre-commit.com""" from __future__ import print_function import distutils.spawn import os import subprocess import sys # work around https://github.com/Homebrew/homebrew-core/issues/30445 os.environ.pop('__PYVENV_LAUNCHER__', None) HERE = os.path.dirname(os.path.abspath(__file__)) Z40 = '0' * 40 ID_HASH = '138fd403232d2ddd5efb44317e38bf03' # start templated CONFIG = '.pre-commit-config.yaml' HOOK_TYPE = 'pre-push' INSTALL_PYTHON = '/nix/store/n8nviwmllwqv0fjsar8v8k8gjap1vhcw-python3-3.7.6/bin/python3.7' SKIP_ON_MISSING_CONFIG = False # end templated class EarlyExit(RuntimeError): pass class FatalError(RuntimeError): pass def _norm_exe(exe): """Necessary for shebang support on windows. roughly lifted from `identify.identify.parse_shebang` """ with open(exe, 'rb') as f: if f.read(2) != b'#!': return () try: first_line = f.readline().decode('UTF-8') except UnicodeDecodeError: return () cmd = first_line.split() if cmd[0] == '/usr/bin/env': del cmd[0] return tuple(cmd) def _run_legacy(): if __file__.endswith('.legacy'): raise SystemExit( "bug: pre-commit's script is installed in migration mode\n" 'run `pre-commit install -f --hook-type {}` to fix this\n\n' 'Please report this bug at ' 'https://github.com/pre-commit/pre-commit/issues'.format( HOOK_TYPE, ), ) if HOOK_TYPE == 'pre-push': stdin = getattr(sys.stdin, 'buffer', sys.stdin).read() else: stdin = None legacy_hook = os.path.join(HERE, '{}.legacy'.format(HOOK_TYPE)) if os.access(legacy_hook, os.X_OK): cmd = _norm_exe(legacy_hook) + (legacy_hook,) + tuple(sys.argv[1:]) proc = subprocess.Popen(cmd, stdin=subprocess.PIPE if stdin else None) proc.communicate(stdin) return proc.returncode, stdin else: return 0, stdin def _validate_config(): cmd = ('git', 'rev-parse', '--show-toplevel') top_level = subprocess.check_output(cmd).decode('UTF-8').strip() cfg = os.path.join(top_level, CONFIG) if os.path.isfile(cfg): pass elif SKIP_ON_MISSING_CONFIG or os.getenv('PRE_COMMIT_ALLOW_NO_CONFIG'): print( '`{}` config file not found. ' 'Skipping `pre-commit`.'.format(CONFIG), ) raise EarlyExit() else: raise FatalError( 'No {} file was found\n' '- To temporarily silence this, run ' '`PRE_COMMIT_ALLOW_NO_CONFIG=1 git ...`\n' '- To permanently silence this, install pre-commit with the ' '--allow-missing-config option\n' '- To uninstall pre-commit run ' '`pre-commit uninstall`'.format(CONFIG), ) def _exe(): with open(os.devnull, 'wb') as devnull: for exe in (INSTALL_PYTHON, sys.executable): try: if not subprocess.call( (exe, '-c', 'import pre_commit.main'), stdout=devnull, stderr=devnull, ): return (exe, '-m', 'pre_commit.main', 'run') except OSError: pass if os.path.isfile('/nix/store/1kfw43by83rd2ri483vqbd32srm4v45d-pre-commit-1.21.0/bin/pre-commit') and os.access('/nix/store/1kfw43by83rd2ri483vqbd32srm4v45d-pre-commit-1.21.0/bin/pre-commit', os.X_OK): return ('/nix/store/1kfw43by83rd2ri483vqbd32srm4v45d-pre-commit-1.21.0/bin/pre-commit', 'run') if distutils.spawn.find_executable('pre-commit'): return ('pre-commit', 'run') raise FatalError( '`pre-commit` not found. Did you forget to activate your virtualenv?', ) def _rev_exists(rev): return not subprocess.call(('git', 'rev-list', '--quiet', rev)) def _pre_push(stdin): remote = sys.argv[1] opts = () for line in stdin.decode('UTF-8').splitlines(): _, local_sha, _, remote_sha = line.split() if local_sha == Z40: continue elif remote_sha != Z40 and _rev_exists(remote_sha): opts = ('--origin', local_sha, '--source', remote_sha) else: # ancestors not found in remote ancestors = subprocess.check_output(( 'git', 'rev-list', local_sha, '--topo-order', '--reverse', '--not', '--remotes={}'.format(remote), )).decode().strip() if not ancestors: continue else: first_ancestor = ancestors.splitlines()[0] cmd = ('git', 'rev-list', '--max-parents=0', local_sha) roots = set(subprocess.check_output(cmd).decode().splitlines()) if first_ancestor in roots: # pushing the whole tree including root commit opts = ('--all-files',) else: cmd = ('git', 'rev-parse', '{}^'.format(first_ancestor)) source = subprocess.check_output(cmd).decode().strip() opts = ('--origin', local_sha, '--source', source) if opts: return opts else: # An attempt to push an empty changeset raise EarlyExit() def _opts(stdin): fns = { 'prepare-commit-msg': lambda _: ('--commit-msg-filename', sys.argv[1]), 'commit-msg': lambda _: ('--commit-msg-filename', sys.argv[1]), 'pre-merge-commit': lambda _: (), 'pre-commit': lambda _: (), 'pre-push': _pre_push, } stage = HOOK_TYPE.replace('pre-', '') return ('--config', CONFIG, '--hook-stage', stage) + fns[HOOK_TYPE](stdin) if sys.version_info < (3, 7): # https://bugs.python.org/issue25942 def _subprocess_call(cmd): # this is the python 2.7 implementation return subprocess.Popen(cmd).wait() else: _subprocess_call = subprocess.call def main(): retv, stdin = _run_legacy() try: _validate_config() return retv | _subprocess_call(_exe() + _opts(stdin)) except EarlyExit: return retv except FatalError as e: print(e.args[0]) return 1 except KeyboardInterrupt: return 1 if __name__ == '__main__': exit(main()) ```

It corresponds to a per-hook-configuration-interface roughly in the lines of:

entry: /nix/store/s4vwm64km5xc6glq69fa2dn6s0fl57dv-nixpkgs-fmt-0.6.1/bin/nixpkgs-fmt
exclude: ^$
files: \\.nix$  # include filter regex - include when a certain git diff touches one of those files
id: nixpkgs-fmt
name: nixpkgs-fmt
pass_filenames: true  # whether to pass touched filenames as argument to the tool
types:  # include filter using identity lib - this is interesting and convenient
 - file

In addition to this, one can selectively skip hooks by passing SKIP="nixpkgs-fmt,...".

It is also a handy feature to manually execute those configurations against all files (irrespective of them being touched by a certain commit range, but respecting the include / exclude filters) in order to do ad-hoc or general cleanups. The CLI interface for this task is bad, though: devshell cold attempt to do better.

[pre-commit]
[[nixpkgs-fmt]]
entrypoint = "$DEVSHELL_DIR/bin/nixpkgs-fmt ..."
exclude = ["^.ignore.me.nix$"]
include = ["(\\w|\\/)+?\\.nix$"]
types = [ "files", "nix" ]
pass_filenames = true

[pre-push]
[[go-lint]]  # name
entrypoint = "$DEVSHELL_DIR/bin/go-lint? ..."
types = [ "go" ]

### 🔨 Welcome to mkDevShell ####

# Commands

devshell-menu - print this menu
devshell-root - change directory to root
pre-commit.nixpkgs-fmt   - used to format Nix code
pre-push.go-lint   - used to format Go code

roberth commented 4 years ago

I'd recommend to use pre-commit.com because it is a mature tool that does a lot of things for you, managing the worktree before and after the hook invocations. Maybe I'm not reading your motivation correctly as to which parts you consider to be bad, but I agree that pre-commit-hooks.nix is doing too much, specifically pinning, packaging and some unnecessary wiring. Suppose I factor out a clean core of it into its own repo pre-commit.nix. Would that help you?

By all means feel free to experiment and create something new, but for a head start this is my recommendation.

blaggacao commented 4 years ago

managing the worktree before and after the hook invocations.

That might indeed be an argument and the only not-so trivial part.

The things I don't like are rather related with pre-comit itself:

the standard install does pre-commit hooks — more than one colleague rebuked the tool for it
setting it to other hooks is very verbose — global stage vs. hook level stage, wtf?
hooks have to be installed manually — each stage inidividually
pre-commit.nix has found a way to work around it's greedy nature as a packaging tool, but it's rather a hack around a system that was only conceived for this use case for local development. pre-commit, by nature, still aims to be a packaging tool. We could ask the maintainer to factor out the workdir managing logic. I might imagine the answer would be: go away!
the way to manually run hooks is extremely verbose: pre-commit run --hook-stage manual --all-files <hookid>
and you can't pass any arguments to manual runs, for example a specific file (this is really annoying - especially if the formatting or linting command has a non-trivial entrypoint one can't remember easily) — manual runs acquire task runner characteristics by their very nature. The author is opposed to this idea.

blaggacao commented 4 years ago

It is also hard to work around it's not-cli-friendliness.

Cutting through all the clutter arranges for circumstances which prevent https://github.com/cachix/pre-commit-hooks.nix/pull/52 — which is actually ready — reasonably from being merged.

blaggacao commented 4 years ago

Re/ workdir management, it looks at first sight, we are looking at this reasonably short piece of code: https://github.com/pre-commit/pre-commit/blob/master/pre_commit/staged_files_only.py

roberth commented 4 years ago

Well I'm happy I was able to keep my usage of pre-commit simple. I haven't desired to use the features you're having issues with. I wonder whether those are desired by other users of devshell.

the standard install does pre-commit hooks

You could use it merely as a back-end and apply your own opinion here

setting it to other hooks is very verbose

Not in your frontend I suppose

pre-commit.nix has found a way to work around it's greedy nature as a packaging tool, but it's rather a hack around a system that was only conceived for this use case for local development

It works. I agree splitting pre-commit.com doesn't seem feasible, but I don't think it's necessary.

the way to manually run hooks is extremely verbose

Not with your frontend. You don't need to recommend the pre-commit CLI. Possibly not even required to expose it.

and you can't pass any arguments to manual runs

same

Cutting through all the clutter arranges for circumstances which prevent cachix/pre-commit-hooks.nix#52 — which is actually ready — reasonably from being merged.

A more pragmatic approach might have cut that short by quite a bit and indeed the real problem has been solved. It's only stuck because it's trying really hard to control everything for everyone, monorepo style. That can be improved by factoring out the core.

blaggacao commented 4 years ago

You could use it merely as a back-end

That is true, though for devshell's ambitions that also might be risky given how upstream pre-commit is "architectured". And from my experience, I don't believe in much collaboration for the backend use case from upstream.

roberth commented 4 years ago

Alright, sounds like you'll be building a lightweight hook manager. Exciting! I hope it will be usable by itself, without having to adopt all of devshell. That way I can use it from projects that are plain Nix and I'll be able to contribute.

blaggacao commented 4 years ago

Quick research material:

zimbatm commented 4 years ago

In my experience, by the time the commit happens, it's already too late as I am ready to move onto the next thing. Another issue is that it often adds 1-2 second lags which is annoying in that context, especially during git-rebase. That's why I am not a big fan of git hooks.

I think a better approach would be to define an interface that can be hooked into the $EDITOR. For example, there could be a fmt <PATH> command that formats the edited file. Calling this on save should be much faster because there is only a single file to handle.

Building this fmt project is a bit of work so I have been holding off to it. It would have similar features as pre-commit run, with similar needs in terms of path to tool mapping. I would like to write it in Go or Rust to make it fast by default.

The nice thing about fmt is that it's orthogonal to the devshell. All the devshell needs to do is bring it in.

zimbatm commented 4 years ago

It's all about interfaces. Once you have fmt, the git hook is also possible.

blaggacao commented 4 years ago

In my experience, by the time the commit happens, it's already too late as I am ready to move onto the next thing. Another issue is that it often adds 1-2 second lags which is annoying in that context, especially during git-rebase. That's why I am not a big fan of git hooks.

Just to support the point: this is why more than one collegue rebuked pre-commit's default. I'm working around by vetoing pre-commit and doing all checks (and also more expensive ones) on pre-push. That's a solid middle ground to catch oversights before exposing one's work remotely.

blaggacao commented 4 years ago

I think a better approach would be to define an interface that can be hooked into the $EDITOR. For example, there could be a fmt command that formats the edited file. Calling this on save should be much faster because there is only a single file to handle.

This indeed solves about 70% of the use cases better!

In addition to that some might want enforce a go test ... or droneci run ... before pushing as part of their shift left effort.

So the tagword is shift left. Not shift leftest as @zimbatm is correctly pointing out with the comment on the pre-commit hook.

Therefore I am going to close this issue and reopen under shift left since a more concisely defined use case will probably yield better solutions.

blaggacao commented 4 years ago

@roberth I think with #18, we can safely narrow the requirement down to pre-push hooks and an ortogonal task runner.

It seems to me as if in the context of the (superior) editor hook formatting approach, the need for workdir management (clear text: stash away dirty workdir for applying transformations) can be relatively safely elided.

This is even true for code generators (like used exetensively in go) since I do not know of a case where a code generator does not exclusively target files that are entirely owned by this code generators. That means that workdir conflicts either do not occur or should be always resolved in favor of the code generator (aka override).

Hence, I will going to open another issue to exclusively discuss a super leightweight integration for pre-push hooks into devshell.

roberth commented 4 years ago

Alright, seems like you have everything covered in your new plan. It's quite different from my existing setup that I'm content with, so I don't think I can offer much help from this point onward.

blaggacao commented 4 years ago

Well, in reality, all I'm trying to do is to put myself at the service of this hidden strategic imperative:

THE

I have to say I'm not using devshell yet, but I clearly see it's potential.

For the strategic imperative to succed, I also think it involves being deliberate and opinionated and question everything in existance to the bottom. I just hope I'm asking the right questions and drawing correct draft-answers 😉

blaggacao commented 4 years ago

@zimbatm re fmt.

I very much like how the author of shfmt seemed to have understood that .editorconfig was conceived to at least not discourage it's use for other configuration tooling. At the service of dotfile-decluttering.

yajo commented 2 years ago

I've found a simple way to wire pre-commit and devshell:

Provide dependencies in your devshell, as usual.
Add a .pre-commit-config.yaml file where 👉🏼 all hooks use repo: local and language: system 👈🏼.
nix-shell --command 'pre-commit install'
You get bonus points if you enable direnv.

This way, pre-commit will do no package management. Instead, devshell and nix will do that. Even pre-commit itself will also be managed by them. However, still pre-commit will handle git hooks and workdirs.

Once you have fmt, the git hook is also possible.

With all this in place, fmt would just be an alias for pre-commit run --files $file_just_saved_in_your_editor. That is easy to achieve with direnv and nixpkgs' writeScriptBin.

blaggacao commented 2 years ago

In my opinión, the pre-commit tool is just a nuisance and distractionful abstraction layer and not useful at all.

I adopted direct bash scripts, like so:

https://github.com/input-output-hk/devshell-capsules/blob/main/flake.nix

With: https://github.com/input-output-hk/devshell-capsules/blob/main/pre-commit.sh

zimbatm commented 2 years ago

Recently I also added https://github.com/numtide/treefmt/blob/master/contrib/pre-commit which is also a pure bash solution.

blaggacao commented 2 years ago

... which leads me to an idea... maybe treefmt could be overcharged with also running something akin to eclint (editorconfig linter). It's general formatting, but somehow perpendicular to language specific formatting.

numtide / devshell

DevShell native pre commit hooks #16

pre-commit essentially does three things: