File extension. - Githubissues

Carreau commented 9 years ago

Change to ipymd ?

Share some code with https://github.com/rossant/ipymd ?

jankatins commented 9 years ago

Re code sharing: From a quick look at the code of ipymd, there is almost no overlap due to the way this is implemented :-) ipymd seems to parse MD and deliver a model which is then returned to the notebook to be executed and displayed (@rossant: please correct me if I'm wrong here). knitpy, on the other hand, doesn't parse markdown at all (it only uses some configurable "how to find the code chunks" and "how to embed the results" rules, which are markdown-based by default but I think could actually be LaTeX- or rst-based without much trouble) and handles the execution and display itself.
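
To illustrate what a configurable "how to find the code chunks" rule could look like, here is a minimal sketch. The pattern, the `{python ...}` chunk header and the name `find_chunks` are assumptions made for this example, not knitpy's actual implementation.

```python
import re

FENCE = "`" * 3  # built programmatically so this example does not nest fences

# Hypothetical chunk pattern: a fenced block whose info string is {python ...}.
CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{python(?P<options>[^}]*)\}\n(?P<code>.*?)^" + FENCE + r"\s*$",
    re.MULTILINE | re.DOTALL,
)


def find_chunks(text):
    """Return (options, code) pairs for every chunk found in `text`."""
    return [(m.group("options").strip(), m.group("code"))
            for m in CHUNK_RE.finditer(text)]


doc = ("Some prose.\n\n" + FENCE + "{python echo=False}\nx = 1 + 1\n"
       + FENCE + "\n\nMore prose.\n")
chunks = find_chunks(doc)
```

Because only the pattern knows about markdown, swapping in a LaTeX- or rst-flavoured pattern would leave the rest of such a pipeline unchanged.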

Re common format: The main difference to the notebook+nbconvert combo is that it's much easier to hide the code in the resulting document ("per cell") and to handle code in markdown (https://github.com/ipython/ipython/issues/2958). This "hiding of code" is included in knitpy as chunk options, and ipymd and especially the notebook itself would need to be able to handle such things (e.g. a checkbox on each cell for whether to hide or show the code). Another difference is that the notebook, and therefore ipymd, includes the results/output of the code, whereas the knitpy format contains only the code itself and no output. So the question is less about a common format with ipymd and more about one with the notebook, as currently "what knitpy needs to put into the format" is incompatible with "what the notebook needs to put into the format", no matter whether that's ipymd or ipynb.
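
As a rough sketch of how a per-chunk option can hide the code while keeping its output: the option name `echo` is borrowed from knitr, while `parse_options` and `render_chunk` are hypothetical helpers written only for this illustration.

```python
def parse_options(option_string):
    """Parse 'echo=False, results=hide' style options into a dict of strings."""
    options = {}
    for part in option_string.split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            options[key.strip()] = value.strip()
    return options


def render_chunk(code, output, option_string=""):
    """Return the text that ends up in the rendered document for one chunk."""
    options = parse_options(option_string)
    show_code = options.get("echo", "True").lower() != "false"
    parts = []
    if show_code:
        # Emit the code as an indented block in front of its output.
        parts.append("    " + code.replace("\n", "\n    "))
    parts.append(output)
    return "\n\n".join(parts)


hidden = render_chunk("print(2)", "2", "echo=False")
shown = render_chunk("print(2)", "2")
```

In the notebook, the equivalent switch would have to live in cell metadata, which is the "checkbox on each cell" mentioned above.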

Basically: knitpy = notebook + nbconvert + code-in-markdown + UI to individually hide code input + more UI to handle the formatting - output/results/plots in the file

If the notebook changed to add such things, it would actually be great to converge on a common markdown format, but I don't think the two can be completely merged due to the "output included vs. no output included" difference.

What makes sense is to use the ipymd file as a starting point to build a knitr file from it, but for that one could also add an nbconvert template, which would probably be more user-friendly :-)

rossant commented 9 years ago

Hi Jan

Cool project! (I'm ccing @odewahn who is likely to be interested too)

So, as far as ipymd is concerned:

- I only use mainstream formats, like .md, .ipynb, .py, etc. There is no special markup to specify where the code chunks are. There is no .ipymd file, because it's just normal .md. It's a matter of choosing a convention and sticking to it. But everything is configurable by extending the code of course.
  - For .md, I assume that a Python code block = an executable chunk of code. (only the Python language is currently supported)
  - For .py, I assume that a block of comments = a block of Markdown, and the rest is made of Python code cells.
- I also have support for the O'Reilly Atlas format (looks similar to your Markdown-based format).
- The code is modular, and new formats can be added. I recently implemented support for ODT format to write and edit OpenDocument texts directly in the notebook, and to convert notebooks to ODT automatically. I also plan to support LaTeX, ReST, and HTML at some point.
- Most importantly, conversion is fully bidirectional. So I can convert from any supported format to any other format: ipynb to md, md to py, odt to ipynb, etc. (This works by considering an intermediate "universal" format which is a simplified in-memory version of ipynb: just a list of code cells and Markdown cells.) This is why I needed a full Markdown parser. ipymd comes with a simple pure Python interface to write custom parser/lexer.
- The converter can be hooked within the notebook for converting on-the-fly on the notebook server. That means that you can use the regular notebook interface on any supported format.
- I'm planning to support images and plots, but I haven't done it yet. One possibility would be to automatically extract plots and save them to png files in an images/ subdirectory, and then replace the plot by Markdown link images.
- There is no UI whatsoever in ipymd: I just use normal text editors and the notebook UI. What is really useful is the ability to edit the same .md file in both a text editor and the notebook opened at the same time. When you edit the notebook, the Markdown is automatically updated in the editor. I tend to prefer text editors for writing text, and the notebook for writing and testing code.
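
The conventions described in this list can be sketched roughly as follows. This is not ipymd's code: the function names and the cell dictionaries are assumptions that only mimic the "list of code cells and Markdown cells" idea, using a naive line scanner rather than a full Markdown parser.

```python
FENCE = "`" * 3  # built programmatically so this example does not nest fences


def markdown_to_cells(text):
    """Split Markdown into a list of {'cell_type', 'source'} dicts."""
    cells, buffer, in_code = [], [], False

    def flush(cell_type):
        source = "\n".join(buffer).strip("\n")
        if source:
            cells.append({"cell_type": cell_type, "source": source})
        buffer.clear()

    for line in text.splitlines():
        if not in_code and line.strip() == FENCE + "python":
            flush("markdown")  # a Python code block starts an executable cell
            in_code = True
        elif in_code and line.strip() == FENCE:
            flush("code")
            in_code = False
        else:
            buffer.append(line)
    flush("code" if in_code else "markdown")
    return cells


def cells_to_py(cells):
    """Write cells as a .py script: Markdown cells become comment blocks."""
    blocks = []
    for cell in cells:
        if cell["cell_type"] == "markdown":
            lines = cell["source"].splitlines()
            blocks.append("\n".join(("# " + line).rstrip() for line in lines))
        else:
            blocks.append(cell["source"])
    return "\n\n".join(blocks) + "\n"


md = ("# Title\n\nSome text.\n\n" + FENCE + "python\nprint(1)\n"
      + FENCE + "\n\nMore.\n")
cells = markdown_to_cells(md)
script = cells_to_py(cells)
```

Going through the intermediate cell list is what makes any-to-any conversion possible: each format needs only one reader and one writer.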

All in all I think there is room for collaboration and code sharing. I'm open to suggestions on this front!

Cyrille

jankatins commented 9 years ago

Hi Cyrille,

I would love to share some code but I currently have doubts whether this is possible because knitpy and ipymd have very different goals which lead to very different output/filecontent.

On the one hand you have ipymd as, IMO, a drop-in for the "filewriter" of the notebook, and so ipymd needs to support the complete notebook story, including saving and loading resulting in the same UI state. ipymd therefore has to have all the information which is needed in the notebook UI in the markdown document, including the output of a code execution. The code execution happens in the notebook; in the other file formats one can only edit the text (or manually add execution results).

The knitpy (and knitr) workflow, on the other hand, is optimised for "markdown" (currently markdown, but it can be anything) + a way to embed code. knitpy (and not the notebook) then executes the code, and the resulting temporary markup document is then converted to the final document format via pandoc. There is no possibility to go back from the temporary markup document to the original one, as in some cases you can't distinguish between what's manually written and what's from the included code (see here for such a temp md file: https://github.com/JanSchulz/knitpy/blob/master/examples/knitpy_overview.html_document.md).

The following code cell is such a case where it is hard to distinguish whether the output was produced by code or manually inserted by the author.

from IPython.core.display import Markdown
# This could also be a `Markdown(tabulate(df, list(df.columns), tablefmt="simple"))`
Markdown("**strongly formatted text** and more")

It outputs markdown as the result of an execution, which is rendered in the notebook UI. To be rendered in the md file, it has to be outside the code block. But outside the code block, it would be converted to a markdown cell when it is returned to the notebook ui. Similar problems exist for html (e.g. a pandas.DataFrame, which outputs html) or plots ("is that image produced by a codecell or manually included in markdown?").

I'm unsure how this round-tripping problem can be solved in ipymd for results other than "text/plain" (which are included in code blocks). You probably need to put some information into markdown comments if you want to keep it strictly "markdown-only". But then you also need something similar in odf and py, too...
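
A minimal sketch of the "information in markdown comments" idea: rendered output is wrapped in HTML comments so that a reader can tell it apart from hand-written markdown. The marker strings and helper names are made up for this example; neither ipymd nor knitpy is known to use them.

```python
import re

# Hypothetical marker names, chosen only for this illustration.
START, END = "<!-- output:start -->", "<!-- output:end -->"

OUTPUT_RE = re.compile(
    re.escape(START) + r"\n(.*?)\n" + re.escape(END), re.DOTALL
)


def wrap_output(rendered):
    """Mark a piece of rendered output so it can be recognised later."""
    return f"{START}\n{rendered}\n{END}"


def split_outputs(markdown):
    """Return (markdown_without_outputs, list_of_outputs)."""
    outputs = OUTPUT_RE.findall(markdown)
    return OUTPUT_RE.sub("", markdown), outputs


doc = "Intro\n\n" + wrap_output("**bold** text") + "\n\nOutro"
stripped, outputs = split_outputs(doc)
```

Markdown renderers ignore HTML comments, so the marked output still displays as normal formatted text, but formats without a comment syntax would need their own mechanism.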

If you want to write economics papers, where there are only tables and some numbers in the text and (almost) no code, the problem of hiding code also has to be solved. This means extending the feature set needed in the notebook UI, maybe by some "hiding code chunks" options/comments in markdown and by putting these options into cell metadata when returned to the notebook UI.

Inline code is another problem, which I don't see as solvable in "markdown only": "text {{1+1}} more text" should output "text 2 more text" in the final document. You would write {{1+1}} in the markdown, the notebook would execute it and you would get 2 back, but then you can't re-execute it as the code is no longer there. Not sure how one could add both the code and the result in such a markdown line.

Another difference (but this time purely "stylistic") is how the notebook and knitr/knitpy each treat a "cell/chunk": in knitr/knitpy, each "compilable line(s)" of a chunk is executed on its own, and not only the whole chunk (= cell). The following would result in two outputs in knitr, but only one in the notebook:
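
The inline case can be sketched like this, assuming the `{{...}}` syntax from the example above; `render_inline` is a hypothetical helper, and `eval` is used only because the input here is trusted.

```python
import re

INLINE_RE = re.compile(r"\{\{(.+?)\}\}")


def render_inline(text, namespace=None):
    """Replace every {{expr}} with the string form of its evaluated result."""
    namespace = {} if namespace is None else namespace
    # eval() on document text is only acceptable for trusted input.
    return INLINE_RE.sub(lambda m: str(eval(m.group(1), namespace)), text)


rendered = render_inline("text {{1+1}} more text")
# The rendered line no longer contains any expression to re-execute.
has_code_left = INLINE_RE.search(rendered) is not None
```

The substitution is lossy by construction: once the result replaces the expression, nothing in the rendered line says where it came from.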

"A text"
"more text"

I think the holy grail on my side would be to get something like the O'Reilly Atlas web UI (what is seen in these animations: http://odewahn.github.io/publishing-workflows-for-jupyter/#8 and http://odewahn.github.io/publishing-workflows-for-jupyter/#11) with the knitpy format as a backend :-)

Jan

rossant commented 9 years ago

I'd like ipymd to support one-directional converters, but these won't be available in the notebook. In your case I guess you could potentially do two things with ipymd:

1. Support your pymd format in the notebook: this should be quite easy. It's just a matter of converting pymd <=> ipynb.
2. Support pymd => md/html/whatever, but that won't work in the notebook (so just the convert() function)

For (2) it is possible that there's little code overlap between ipymd and your project.

Carreau commented 9 years ago

I'll read the whole thread later, but first, sorry for the confusion if there is too much difference; I didn't dig deep into either project and thought they might just be related.

jankatins commented 9 years ago

@Carreau no problem, I like to discover such things!

@rossant I actually thought about a converter which takes a notebook and transforms it into an initial knitpy document (all code cells as visible code chunks, output will be discarded). Not sure if that's easier via nbconvert or ipymd.
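
Such a converter could be sketched without either library by reading the notebook JSON directly. The `{python}` chunk header and the function name are assumptions for this example; the input dict only mimics the `cells`, `cell_type` and `source` fields of the ipynb v4 format.

```python
FENCE = "`" * 3  # built programmatically so this example does not nest fences


def notebook_to_chunks(notebook):
    """Turn an ipynb-v4-style dict into Markdown with visible code chunks."""
    parts = []
    for cell in notebook["cells"]:
        source = cell["source"]
        if isinstance(source, list):  # ipynb files store source as a list of lines
            source = "".join(source)
        if cell["cell_type"] == "code":
            # Anything in cell["outputs"] is deliberately discarded.
            parts.append(f"{FENCE}{{python}}\n{source}\n{FENCE}")
        elif cell["cell_type"] == "markdown":
            parts.append(source)
    return "\n\n".join(parts) + "\n"


nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "text"]},
        {"cell_type": "code", "source": "print(1)",
         "outputs": [{"output_type": "stream", "text": "1\n"}]},
    ]
}
document = notebook_to_chunks(nb)
```

For a file on disk, `json.load` would give a dict of this shape; an nbconvert template would be the more conventional route for real notebooks.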

jankatins / knitpy

File extension. #5