kynan / nbstripout

strip output from Jupyter and IPython notebooks
Other
1.19k stars 94 forks source link

Not compatible with `pre-commit-hooks/pretty-format-json` hook #185

Open rusmux opened 11 months ago

rusmux commented 11 months ago

When using this tool with pretty-format-json hook from https://pre-commit.com/hooks.html, when they both rerun after each other regardless of the order. I suppose one way to fix this is to make nbstripout compatible with pretty-format-json, i.e. it won't run again after pretty-format-json formatting

dokempf commented 10 months ago

Hey @rusmux, I was running into the same situation and took a deep dive at what is happening. Here are the results

The root of the issue is a change in the identify library that provides pre-commit with information about what files to apply what hooks to. Recently somebody added the information that an ipynb file is also a json file: https://github.com/pre-commit/identify/commit/cfd21885b0d3f1ae8a634af5c538c64884449132 If the identify version on your system (which is versioned completely independently from pre-commit and thus we have no control over from a hook) happens to include that commit, the pretty-format-json hook will run on notebooks although it previously did not. The result is that nbstripout and pretty-format-json step on each others toes.

The "proper" way to fix this would be to align the output of the two tools (they differ in indentation mostly), but I had a hard time reading nbstripout code to find the places where the actual formatting is happening. There is a stop gap measure though, which is to disable the pretty-format-json hook for jupyter files in your configuration like this:

repos:
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout
        files: ".ipynb"

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: pretty-format-json
        exclude_types:
          - jupyter
        args:
          - --autofix
rusmux commented 10 months ago

@dokempf Great job! I will then follow your example, thank you!

kynan commented 10 months ago

@dokempf Thanks for digging into this! The reason you didn't find the code where nbstripout does the formatting is that it doesn't 😃 It uses nbformat, so if you want to lobby for a change, you need to file a bug against nbformat.

dokempf commented 10 months ago

It seems like nbformat has discarded the idea a while back: https://github.com/jupyter/nbformat/issues/228