mwouts / jupytext

Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
https://jupytext.readthedocs.io
MIT License
6.65k stars, 386 forks

custom format pairing #931

Open jonathan-taylor opened 2 years ago

jonathan-taylor commented 2 years ago

Wondering how hard it is to specify a new format pairing?

Can you point to code to specify, e.g. Rmd pairing?

jonathan-taylor commented 2 years ago

At a high level, I'd like to use jupytext to pair one or several executable docs (.ipynb) to one in which the emphasis is on text rather than on code. For example, for many, if they'd ever hope to write a directly publishable and executable paper they'd need the source to essentially be LaTeX (with all its quirks and users' years-long encyclopedia of macros). Such a doc would look poor in a jupyter browser but fine to many who write papers in text editors. Pairing it with an nbconvert library call to swap in the LaTeX version of outputs at build time might come close to a nice document.

So, I'm imagining something like a tex:py format in which code is appropriately escaped from the LaTeX but all the usual markdown cells of the .ipynb are just someone's quirky LaTeX code. The focus would be on having a view of the notebook that is customized to be built into a nice PDF. Having an additional py:percent pairing would allow a user to develop the code in the document. Or, even just by opening it up in a jupyter tab, one can work on the code.
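To make the tex:py idea concrete, here is a minimal sketch of the round trip it implies. It is not a jupytext format and does not use jupytext's reader/exporter classes. The `pycode` environment name, the function names `tex_to_cells` and `cells_to_tex`, and the plain-dict cells are all assumptions made for illustration. Everything outside a `pycode` environment is kept verbatim as the text of a markdown cell.

```python
import re

# ASSUMPTION: code cells are wrapped in a hypothetical `pycode` environment.
# Nothing in jupytext defines this; it only illustrates the escaping idea.
CODE_ENV = re.compile(r"\\begin\{pycode\}\n(.*?)\n?\\end\{pycode\}", re.DOTALL)


def tex_to_cells(tex):
    """Split LaTeX source into notebook-like cells (plain dicts)."""
    cells = []
    pos = 0
    for match in CODE_ENV.finditer(tex):
        # LaTeX between two code environments becomes one text cell.
        text = tex[pos:match.start()].strip("\n")
        if text:
            cells.append({"cell_type": "markdown", "source": text})
        cells.append({"cell_type": "code", "source": match.group(1)})
        pos = match.end()
    tail = tex[pos:].strip("\n")
    if tail:
        cells.append({"cell_type": "markdown", "source": tail})
    return cells


def cells_to_tex(cells):
    """Inverse of tex_to_cells: re-wrap code cells, keep LaTeX text as is."""
    parts = []
    for cell in cells:
        if cell["cell_type"] == "code":
            parts.append("\\begin{pycode}\n" + cell["source"] + "\n\\end{pycode}")
        else:
            parts.append(cell["source"])
    return "\n\n".join(parts) + "\n"


SAMPLE = r"""\section{Results}
The mean is computed below.

\begin{pycode}
x = [1, 2, 3]
print(sum(x) / len(x))
\end{pycode}

As expected, it equals $2$.
"""

cells = tex_to_cells(SAMPLE)
roundtrip = cells_to_tex(cells)
```

The sketch normalizes the blank lines between cells, so a round trip is only exact for sources that already separate text and code with one blank line. A real format would also need to store cell metadata and handle nested or verbatim environments.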

mwouts commented 2 years ago

Hi @jonathan-taylor, I am afraid that writing a new format is quite hard, and defining a new format from scratch even more so.

> Can you point to code to specify, e.g. Rmd pairing?

The Rmd converters are defined at RMarkdownCellExporter and RMarkdownCellReader.

> they'd need the source to essentially be LaTeX

So you'd like to encode a notebook in a .tex file? Something like the Sweave format from RStudio?

I am afraid this is not going to be very user friendly. I don't see that many people using TeX these days, and I also think the people behind knitr made their choice many years ago in favor of the (R) Markdown format rather than R+TeX (= Sweave), because it is much more readable.

> Such a doc would look poor in a jupyter browser but fine to many who write papers in text editors. Paired with an nbconvert library call to maybe swap in the LaTeX version of outputs at build time might make close to a nice document.

Well, if you want the final document to look nice, then the rendering of the editable document in Jupyter could also look nice, don't you think? But maybe I am asking for too much :smile:

jonathan-taylor commented 2 years ago

Hmm... AFAIK papers that get published in stats journals are usually LaTeX, and many / most (?) theses are written in LaTeX (yes you can write it in e.g. R markdown but I'm not sure that's the typical path). Yes, people have supplements that are .ipynb but the writing itself is usually LaTeX.

For example, many of the "reproducible academic publications" here: https://github.com/jupyter/jupyter/wiki#reproducible-academic-publications are papers that point to a supplementary notebook. The papers, i.e. the official docs themselves are often LaTeX.

As for looking nice in Jupyter, formats like md:myst are not rendered by jupyter but rather by something like jupyterbook. Same family, but to get all the references etc. correct it needs more than just a jupyter tab.

My real interest in this is mostly a one-off: I am involved in a project with legacy LaTeX which works with code that must appear in the final PDF for which I'd like "source" to be executable going forward, not just buildable. A format that gives me both a .tex "view" on an .ipynb file as well as e.g. a ".py" view would be fine. Not looking for a build system (i.e. not trying to remake bookdown or jupyterbook or Sweave).
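As a rough illustration of the second "view", the sketch below renders a list of cells as py:percent-style text, where code cells start with `# %%` and text cells start with `# %% [markdown]` and are commented out line by line. The `cells_to_percent` name and the plain-dict cells are assumptions for illustration. jupytext's own percent writer handles more than this (cell metadata, raw cells, magics), so treat this as a sketch of the idea rather than its implementation.

```python
def cells_to_percent(cells):
    """Render notebook-like cells (plain dicts) as py:percent-style text."""
    chunks = []
    for cell in cells:
        if cell["cell_type"] == "code":
            chunks.append("# %%\n" + cell["source"])
        else:
            # Text cells (here: LaTeX) are commented out line by line,
            # so the whole file stays a valid Python script.
            commented = "\n".join(
                "# " + line if line else "#"
                for line in cell["source"].split("\n")
            )
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


# Hypothetical cells, e.g. extracted from a legacy .tex file.
example_cells = [
    {"cell_type": "markdown", "source": "\\section{Results}\n\nWe add two numbers."},
    {"cell_type": "code", "source": "total = 1 + 1"},
]

percent_text = cells_to_percent(example_cells)
```

Because every LaTeX line ends up behind a `#`, the .py view can be run or edited as an ordinary script while the same cells still back the .tex view.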

mwouts commented 2 years ago

Hi @jonathan-taylor,

> My real interest in this is mostly a one-off: I am involved in a project with legacy LaTeX which works with code that must appear in the final PDF for which I'd like "source" to be executable going forward

Oh, that is interesting. Is there a pattern in your current project? I mean, how are "code cells" represented in these .tex files?