Open owenlamont opened 7 months ago
Hey, we currently don't have support for linting / formatting Python code in markdown blocks (https://github.com/astral-sh/ruff/issues/8237, https://github.com/astral-sh/ruff/issues/3792). I'll close this in favor of the markdown issue for bookkeeping purposes as I think that should solve this but correct me if I'm wrong here.
Hmm, actually it would be a bit different: for plain Markdown we wouldn't need the concatenated source code from all code blocks, but if it's a notebook converted to Markdown then I think each block should have context from the other code blocks? @owenlamont Do you think this is true?
Another solution I can think of for now is to lint / format before converting to Markdown. I'm not sure how feasible that would be, given my lack of knowledge about your setup.
Hi @dhruvmanila - yeah, it would have to have the concatenated source code. I can see Ruff still tracks which code was in which Jupyter cell when raising warnings, so if it could treat Markdown code blocks exactly as Jupyter cells are treated, that would be ideal.
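To make the "concatenate but remember the cell" idea concrete, here is a minimal sketch in plain Python. It is not how Ruff handles notebooks internally; the function name `concatenate_code_blocks` and the set of recognised fence labels (`python` and MyST's `{code-cell}`) are assumptions made for illustration only.

```python
FENCE = "`" * 3  # built this way so the example can itself sit inside a fence


def concatenate_code_blocks(markdown):
    """Join all Python code blocks into one source string.

    Returns (source, line_map), where line_map[i] is a tuple
    (cell_number, markdown_line_number) for line i of the joined source.
    Hypothetical helper, not part of Ruff or Jupytext.
    """
    source_lines, line_map = [], []
    cell_number = 0
    state = "text"  # one of "text", "code", "other"
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        is_fence = line.lstrip().startswith(FENCE)
        if state == "text":
            if is_fence:
                info = line.strip()[3:].strip()
                # Assumed labels: plain Jupytext Markdown and MyST code cells.
                if info.startswith(("python", "{code-cell}")):
                    state = "code"
                    cell_number += 1
                else:
                    state = "other"  # some other fenced block, skipped
        elif is_fence:
            state = "text"  # closing fence of either kind of block
        elif state == "code":
            source_lines.append(line)
            line_map.append((cell_number, lineno))
    return "\n".join(source_lines), line_map


sample = "\n".join([
    "# Title",
    FENCE + "python",
    "x = 1",
    FENCE,
    "Some prose.",
    FENCE + "python",
    "print(x)",
    FENCE,
])
source, line_map = concatenate_code_blocks(sample)
```

With a map like this, a diagnostic reported on line `i` of the joined source can be translated back to a cell number and a line in the Markdown file, which is the behaviour described above for Jupyter cells.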
As a work-around it could be exported to ipynb, linted and formatted, then re-exported to markdown - but that would be onerous. When working with Jupytext the notebook never gets persisted (in any permanent/visible way) as an ipynb - it gets loaded from Markdown and saved back to Markdown.
The ideal solution (from my perspective) would be to parse the YAML front matter of the Markdown, identify the file as Jupytext-generated Markdown, then recognise that the code blocks need to be concatenated and treated as notebook cells. I totally understand if this use case is too niche to justify the effort, though. I can't speak much as to how many people use this format - as a repo jupytext is relatively popular (around 6k users - I recognise some relatively prominent Jupyter developers as contributors).
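A rough sketch of the detection step, assuming the only signal needed is a `jupytext:` key somewhere in the leading front matter. The helper name `looks_like_jupytext` is hypothetical, and a real implementation would presumably use a proper YAML parser rather than line matching.

```python
def looks_like_jupytext(markdown):
    """Return True if the leading YAML front matter has a `jupytext:` key.

    Hypothetical helper: it only scans lines between the opening and closing
    `---` markers and does not parse YAML.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no front matter at all
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return False  # front matter ended without a jupytext key
        if stripped == "jupytext:":
            return True  # key may be top-level or nested under `jupyter:`
    return False  # front matter never closed


jupytext_doc = "\n".join([
    "---",
    "jupyter:",
    "  jupytext:",
    "    text_representation:",
    "      extension: .md",
    "---",
    "",
    "# A notebook",
])
plain_doc = "# Just a README\n\njupytext:\n"
```

Here `looks_like_jupytext(jupytext_doc)` is true and `looks_like_jupytext(plain_doc)` is false, so ordinary Markdown files would keep being skipped.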
There's a similar request for quarto notebooks (#6140), and generally for Python code included in Markdown code blocks (#3792).
I think I'm looking at the same issue. We also use MyST Markdown notebooks in several projects and have been using the jupytext ability to pipe code through black to apply Python formatting:
$ jupytext --pipe black docs/source/users/pmodel/c3c4model.md
[jupytext] Reading docs/source/users/pmodel/c3c4model.md in format md
[jupytext] Executing black -
reformatted -
All done! ✨ 🍰 ✨
1 file reformatted.
[jupytext] Writing docs/source/users/pmodel/c3c4model.md in format md:myst
We have that as part of a pre-commit hook to ensure that the code in our notebooks is properly formatted.
- repo: https://github.com/mwouts/jupytext
  rev: v1.16.2
  hooks:
    - id: jupytext
      args: [--pipe, black]
      files: docs/source
      additional_dependencies:
        - black==24.4.2 # Matches hook
I haven't been able to work out exactly what happens, but the jupytext.cli module provides a pipe_notebook function that is used to round trip something (I think it must be just the code cell contents?) through black.
OK - so jupytext converts the notebook to percent format, which is a Python file with the Markdown content stored as comments. black can then run on the code alone and the format can be converted back.
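For anyone unfamiliar with the percent format, the sketch below approximates what such a file looks like. It is not jupytext's own code: the `to_percent` helper and the `(cell_type, source)` tuples are assumptions for illustration, and real jupytext output also carries a metadata header that is omitted here.

```python
def to_percent(cells):
    """Render (cell_type, source) pairs as percent-format style text.

    Hypothetical helper: code cells follow a `# %%` marker, Markdown cells
    follow `# %% [markdown]` with every line turned into a comment.
    """
    chunks = []
    for cell_type, source in cells:
        if cell_type == "code":
            chunks.append("# %%\n" + source)
        else:
            commented = "\n".join(
                "# " + line if line else "#" for line in source.splitlines()
            )
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


cells = [
    ("markdown", "# Title\n\nIntro"),
    ("code", "x=1"),
]
percent_text = to_percent(cells)
```

Because the result is valid Python with the prose hidden in comments, a formatter that only understands Python can be run over it, and the cell markers allow the text to be split back into cells afterwards.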
I have a use case for Ruff and Ruff formatter that is a bit related to some of the other Markdown / Docstring feature requests but specifically I hoped to run Ruff and Ruff formatter on Jupyter notebooks that had been exported to markdown with Jupytext.
The company I'm at prefer converting notebooks to Markdown as it makes the notebook diffs much easier to read on Bitbucket (which doesn't support any notebook rendering/diffing like GitHub).
At first I noticed I could add markdown as a target file format for Ruff formatter and linter which got my hopes up that this would just work:
But when I ran Ruff I saw it failing to parse the Markdown properly - I had hoped it would just run on the Python code blocks in the same way it would parse Jupyter notebook cells and ignore all the other Markdown content, but it's obviously trying to parse all the Markdown, e.g.