fastai / nbprocess-old

Process and export Jupyter Notebooks fast (Jupyter not required)
https://nbprocess.fast.ai/
Apache License 2.0
52 stars 12 forks source link

NBprocess blocking quarto flags #24

Closed Isaac-Flath closed 2 years ago

Isaac-Flath commented 2 years ago

@jph00 - We've chatted about the Quarto flags so tagging you. I have an editable install of nbprocess that's up to date, and same issue after updating Quarto to the latest version.

I created a simple notebook with an import, a function, and a test. The goal was to see if nbprocess_export and nbprocess_quarto worked. nbprocess_export worked.

nbprocess_quarto mostly worked, but with a bug. The documentation page creates beautifully, but the test should be hidden due to the include or echo flag. I tried them individually and together, none hid the cell. Assert statement should be hidden in this screenshot. There is also a thing in the header that says "AUTHORS" that I don't think needs to be there - this did not appear until after I upgraded to latest version of quarto.

Screen Shot 2022-04-14 at 7 37 24 PM

If I move the notebook outside of the github repository and just render it with quarto render, then the quarto flag does exclude the cell like I would expect. This makes me believe that something in nbprocess is blocking or stripping out these flags somehow. Here we can see the same notebook did hide the cell outside of nbprocess (but of course without nbprocess styling)

Screen Shot 2022-04-14 at 7 17 35 PM

Notebook referenced in the description: https://github.com/Isaac-Flath/quarto-debug/blob/main/nbs/01_core.ipynb This repo is forked from a minimal repo that Hamel shared. I will be chatting with him tomorrow, but documenting this issue here so you can both see it depending on who works on this. For convenience, here is the image of the notebook with the flags to hide the test.

Screen Shot 2022-04-14 at 7 25 21 PM
Isaac-Flath commented 2 years ago

@hamelsmu

I tested a variety of additional flags as requested.

Screen Shot 2022-04-14 at 8 48 52 PM
jph00 commented 2 years ago

Can you please make a list of which ones don't work outside of nbprocess so we can file a bug with quarto?

Isaac-Flath commented 2 years ago

Here are the ones I can't figure out how to get to work outside of nbprocess. Would you like me to file the issue on the quarto repo?

code-line-numbers

fig-align with matplotlib graph

#| fig-align: right
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots(figsize=(2,2))
ax.plot(t, s)

ax.set(xlabel='time (s)', ylabel='voltage (mV)',
       title='About as simple as it gets, folks')
ax.grid()

fig.savefig("test.png")
plt.show()

tbl-colwidths on pandas dataframe

#| tbl-colwidths: [5,5,100,5,5]

from sklearn import datasets
iris = datasets.load_iris()
import pandas as pd
pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                     columns= iris['feature_names'] + ['target'])
jph00 commented 2 years ago

Awesome thanks! I'll forward this directly via email to quarto gurus.

jjallaire commented 2 years ago

Hi @Isaac-Flath thanks for these reports! Here is what I've discovered:

(1) For code-line-numbers unfortunately those additional options work only for revealjs output (however, our docs are not clear on this, I will update them).

(2) For tbl-colwidths we did indeed have a bug there for handling that in Jupyter. Even with this fix though you need to modify the code to produce a markdown table (as we are only able to affect the widths of things in the Pandoc AST, which in turn need to be markdown rather than HTML). Here's modified code the should work as you expect once you have our updated version (https://github.com/quarto-dev/quarto-cli/releases/tag/v0.9.257):

```{python}
#| tbl-colwidths: [5,5,100,5,5]

from IPython.display import Markdown
from sklearn import datasets
iris = datasets.load_iris()
import pandas as pd
import numpy as np
df = pd.DataFrame(data= np.c_[iris['data'], iris['target']],
             columns= iris['feature_names'] + ['target'])
Markdown(df.to_markdown())


Note that you need the **tabulate** Python package installed to create markdown output from Pandas.

(3) The `fig-align` attribute is working correctly for me in HTML output using the code you provided above:

![Screen Shot 2022-04-15 at 12 29 47 PM](https://user-images.githubusercontent.com/104391/163596283-9195f8c0-8ed5-498b-adfc-9568ffbe1aa6.png)

Were you working in HTML or another format? I am seeing it **not** work in PDF and will investigate that now.
jjallaire commented 2 years ago

Okay, I now see the problem with fig-align. The fig-align attribute was only getting included if there were other image attributes coincident with it. In my test .qmd Quarto is rendering the cells using matplotlib "retina" graphics format, which ends up including an explicit width attribute (and therefore picks up fig-align). I'm guessing you rendered in the notebook directly so no attributes were present.

This should be fixed in our next release (later on today)

hamelsmu commented 2 years ago

@jjallaire Do you know whats causing the "Authors" heading to show up?

image

I'm seeing that if I try to convert a notebook to md, I see this being injected into the front matter. However we are not doing this, is this something that could be happening on your end? You can reproduce this from this repo https://github.com/fastai/quarto-debug/tree/main

---
labels:
  abstract: Abstract
  affilations: Affiliations
  affiliations: Affilliations
  authors: Authors
  description: Description
  doi: Doi
  published: Published
pagetitle: index
quarto-version: 0.9.250
title: Index Page
toc-title: Table of contents
hamelsmu commented 2 years ago

@jjallaire as an update to the "Author" thing, It seems to happen if you have a raw cell with front matter even without any notebook filter at all. So there seems to be a situation where if front matter is already there in a notebook (whether its put there manually or by a filter), quarto will inject the authors: Authors metadata into it, which has the unfortunate side effect of rendering that onto the page

Hope that helps

hamelsmu commented 2 years ago

@Isaac-Flath @jph00

Regarding nbprocess not respecting quarto directives, this is actually the fault of nbprocess at the moment because we are removing all directives (anything with #|... ) upstream before they have a chance to get to quarto.

There could be different ways for us to handle this, but we should discuss this amongst ourselves. We should probably only remove directives that are valid?

jjallaire commented 2 years ago

The branch that landed yesterday w/ all of the title block improvements caused both of the issues reported here ("authors" and the extra metadata). These should be fixed later on today.

hamelsmu commented 2 years ago

The branch that landed yesterday w/ all of the title block improvements caused both of the issues reported here ("authors" and the extra metadata). These should be fixed later on today.

Wow this is amazing you are always so fast and helpful. Thanks so much for your help. I can't even pay for tools with this level of support ❤️ 😄

Isaac-Flath commented 2 years ago

Gotcha! I am also blown away with how fast you are responding and fixing things. Thank you so much for this feedback and help JJ.

tbl-colwidths : I saw that some tbl flags worked with pandas tables and made the assumption that meant the all the tbl flags would work with pandas tables. Thinking about it more after the details you provided - the ones that were working were all ones that would not have to modify the table itself, such as fig-link, fig-subcap, and fig-cap so it makes perfect sense now.

Isaac-Flath commented 2 years ago

@hamelsmu

Regarding nbprocess not respecting quarto directives, this is actually the fault of nbprocess at the moment because we are removing all directives (anything with #|... ) upstream before they have a chance to get to quarto.

There could be different ways for us to handle this, but we should discuss this amongst ourselves. We should probably only remove directives that are valid?

For building the docs it feels like we need to just strip out the nbprocess specific flags and leave the rest (ie #| export).

The other option I could think of is moving the nbdev flags into a quarto defined place (such as #| tags: export or #| classes: export) so then for docs none of the flags needs to be stripped and it can go more directly to Quarto for rendering.

Updated: On quick test it seems that giving Quarto an unrecognized flag to render causes issues with rendering in some scenarios with multiple flags

Isaac-Flath commented 2 years ago

I did some more testing on why sometimes Quarto is fine with bad flags and just strips them out, and why sometimes I would get an error.

This does not cause a rendering error

#| export:
#| code-collapse:true

This does cause a rendering error, and it seems to think that code-collapse is a continuation of export. (no : after export)

#| export
#| code-collapse:true

So the : seems to tell it that any value associated with it would be on that line. Based on this I think that another possible solution is to have nbprocess flags use the :,then I think they wouldn't need to be stripped out and could be passed directly to quarto for docs rendering.

While #| export: would be possible, all flags could also be key values such as #| nb: export or #| nb: exporti. I don't like that it's more typing for the user, but I do like how clear it is that these are nbprocess flags and not quarto flags and that it seems more consistent with quarto syntax of key:value.

Disclaimer: I don't know much about how Quarto renders to know if this is a viable/smart strategy based on how it works. Not sure if this is a coincidental thing in quarto, or something baked in that we can rely on so that'd be something to verify.

jjallaire commented 2 years ago

Yes, we expect that anything trailing #| is valid YAML so #| export would definitely cause a problem.

You could just strip lines that don't include a colon or strip lines that are known nb directives?

jph00 commented 2 years ago

Many apologies @jjallaire we had an internal miscommunication - I've actually already fixed that nbprocess flags problem @Isaac-Flath mentioned :)