mwouts / jupytext

Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
https://jupytext.readthedocs.io
MIT License

configure pre-commit to automatically transform all notebooks into py:percent #1078

Closed choucavalier closed 1 year ago

choucavalier commented 1 year ago

hi! thanks for this awesome project :) so much has been done since you first called me into your office (before jupytext was even born) to ask for my advice! it's really impressive

now i come for your advice!

i'm trying to set up pre-commit to have all notebooks under ./notebooks/ synced with a py:percent .py file

i ended up setting things up like this

repos:
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.3.0
    hooks:
      - id: mypy
        args: [--python-version, "3.11"]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
  - repo: local
    hooks:
      - id: jupytext-pair-notebooks
        name: jupytext-pair-notebooks
        entry: jupytext --set-formats ipynb,py:percent
        files: \.ipynb
        stages: [commit]
        language: python
        additional_dependencies:
          - jupytext==1.14.4
  - repo: https://github.com/timothycrosley/isort
    rev: 5.12.0
    hooks:
      - id: isort
  - repo: https://github.com/ambv/black
    rev: 23.3.0
    hooks:
      - id: black
  - repo: https://github.com/mwouts/jupytext
    rev: v1.14.4
    hooks:
    - id: jupytext
      args: [--from, "py:percent", --to, ipynb, --sync]
      additional_dependencies:
        - black==23.3.0
      files: ^notebooks/.*\.py$
  - repo: https://github.com/kynan/nbstripout
    rev: 0.5.0
    hooks:
      - id: nbstripout

i'm a bit confused by this setup, but basically this is what i want to achieve:

  1. all .ipynb under ./notebooks/ should have an associated .py in py:percent format
  2. the code within each .ipynb and .py should match and be reformatted (isort, ruff, black) before commit

the problem is that i sometimes have to run pre-commit multiple times, because either the .ipynb is modified and then synced, or the .py is modified and then synced... i think i'm a bit confused by my setup.

what is the right way to set this up?

mwouts commented 1 year ago

Hey @tgy , thank you for your kind words! Good to see you here!

One possible approach (the one that I use personally) is to exclude the .ipynb notebooks from version control, and to keep the files in sync locally with, e.g., a jupytext.toml config in the notebook folder, with this content: formats = "ipynb,py:percent". The advantage of that approach is that the pre-commit hooks only apply to Python files, so they are a bit simpler to configure.
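As a rough sketch of what that first approach could look like on the pre-commit side (an illustration, not a configuration taken from this thread): assuming `*.ipynb` is listed in `.gitignore` and the jupytext.toml described above handles the pairing, the .pre-commit-config.yaml might shrink to the Python-only hooks from the original post. The repos and revs are copied from the question; dropping the jupytext and nbstripout hooks is the assumption being illustrated.

```yaml
# .pre-commit-config.yaml (sketch)
# Assumes *.ipynb is git-ignored, so only the py:percent scripts
# under notebooks/ ever reach these hooks.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
  - repo: https://github.com/timothycrosley/isort
    rev: 5.12.0
    hooks:
      - id: isort
  - repo: https://github.com/ambv/black
    rev: 23.3.0
    hooks:
      - id: black
```

With this layout no hook ever touches an .ipynb file, so there is no notebook/script round trip left to trigger a second pre-commit run.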

However, you seem to have a preference for contributing the .ipynb notebooks to the project, and for keeping them in sync with the py:percent files using the pre-commit hooks rather than a config file plus Jupyter. Have you tried applying the reformatting tools and the pairing at the same time, using the --pipe option? We have a few pre-commit examples in the tests folder; maybe this one, test_pre_commit_3_sync_black_nbstripout.py, is the most similar to what you want to achieve?
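A minimal sketch of that second idea, modeled on the `--sync --pipe black` pattern shown in the Jupytext pre-commit documentation rather than copied from the test file mentioned above. It assumes the notebooks are already paired (for instance via the `--set-formats` hook from the question); the `files: ^notebooks/` filter is an assumption added here to limit the hook to the notebook folder.

```yaml
# .pre-commit-config.yaml fragment (sketch)
# One jupytext hook both reformats and syncs, so the .ipynb and .py
# files are rewritten together in a single pass.
repos:
  - repo: https://github.com/mwouts/jupytext
    rev: v1.14.4
    hooks:
      - id: jupytext
        # --pipe black runs black on the notebook's code,
        # --sync then updates every paired representation
        args: [--sync, --pipe, black]
        additional_dependencies:
          - black==23.3.0  # keep in step with the standalone black hook, if any
        files: ^notebooks/  # assumption: restrict to the notebook folder
```

Any other hook that still runs on the generated files (black, isort, ruff) has to agree with what jupytext writes, otherwise the hooks will keep modifying each other's output and pre-commit will need several runs again.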

choucavalier commented 1 year ago

thanks! indeed it does not really make sense to version the .ipynb files alongside the .py files. i was trying to accommodate people who actually like coding in notebooks

thanks for the detailed answer, and congrats again for this awesome project!