executablebooks / sphinx-exercise

A Sphinx extension for producing exercise and solution directives.
https://ebp-sphinx-exercise.readthedocs.io
MIT License

Embed executable code inside of exercise solutions (Prototyping) #24

Open choldgraf opened 4 years ago

choldgraf commented 4 years ago

In #18 @jstac brought up a good point that it is quite common to want executable code in the solutions to your exercises. Right now this isn't possible with {code-cell} because this breaks the flat cell structure that Jupyter Notebooks enforce.

This is an issue to brainstorm how to get this working!

Some options

Use start/stop directives for solutions

As @chrisjsewell suggests, we could provide another directive here that would mark the start/stop of a solution. Something like:

````md
```{solution-start}
Intro to solution
```

```{code-cell}
print(2+2)
```

```{solution-stop}
```
````

Improve cell copy/pasting functionality to make this easier

If we improved the cell copy/pasting functionality we could do something like:

`````md
% This code cell would be executed
```{code-cell}
:tags: remove-cell
:name: solution1
print(2+2)
```

% And here it'd be placed inside the solution block

```{ipynb:cell} solution1
```
`````

Recommend jupyter-sphinx for solutions/exercises

As long as the code in the solution doesn't require the state of the kernel that came from running any previous code, then jupyter-sphinx should work just fine here. What won't work is if we define variables earlier in the page and then want to reference those variables in the solution.

What other ideas are out there?

jstac commented 4 years ago

Thanks for opening @choldgraf .

I would find option 1 easier to remember.

One related alternative would be to replace solution-start with solution-header and drop solution-stop.

Then the only thing that's being inserted is a heading that tells us what exercise is being solved, while the contents of the solution are just ordinary notebook content.

I suppose the exercise could be symmetric, with exercise-header inserting "Exercise x", typeset nicely as determined by the extension, and the content of the exercise again being ordinary notebook content underneath the header.

najuzilu commented 4 years ago

These are great suggestions @choldgraf @jstac.

Just to double check: would it not be possible to automate the gluing functionality, so that whenever a code-cell appears in the content of the directive it is replaced with a code block, and the output of the code-cell (which can be added at the bottom of the document with a remove-cell tag) is glued into the directive?

The only issue I have with @jstac's solution-header suggestion is that we wouldn't be able to duplicate/copy solutions to another location as suggested in issue #11.

mmcky commented 4 years ago

> One related alternative would be to replace solution-start with solution-header and drop solution-stop.

One downside here would be if you wanted to apply styling to the solution text (i.e. in similar fashion to an admonition). I think it would be useful to have an end demarcation so that the solution block is identified in the AST.

This is actually similar to the root issue behind https://github.com/executablebooks/MyST-NB/issues/274. The motivation behind this request is to be able to bring in code from a file (as used to be done with literalinclude) but jupytext (rightly) has no knowledge of anything outside of code-cell breaks.

I am in favour of an option 1 style approach to this -- mainly because the content stays ordered. If we introduce jupytext filters as a way to update content such as links to a website (instead of to other notebooks) to support download notebooks for websites, then having notebooks with content in the correct order will make this work easier.

mmcky commented 4 years ago

Could we have an argument start and end -- rather than two different directives:

````md
```{solution} start
```

```{solution} end
```
````

I guess we should also think about other cases where nested `code-cell` may not work as we will want to adopt a consistent approach to solving this issue. The only other that comes to mind is perhaps `tabbed` blocks -- but that syntax would get pretty messy if we were to add something like this to `tabbed`

jstac commented 4 years ago

Thanks for your thoughts @mmcky .

As a user, I would be happy enough with the original option 1, although I do see your point about other nested code cells and the need for a consistent approach. Option 1 doesn't feel consistent with the rest of myst.

jstac commented 4 years ago

Hi guys, tossing out an alternative option: How about we use tabs? (as in https://jupyterbook.org/content/content-blocks.html#tabbed-content)

One tab is "Exercise X" and the other tab, with content initially hidden, is "Solution".

mmcky commented 4 years ago

hey @jstac this is a nice format option -- I like it. I think we will still get issues with executing the nested code-blocks (cc: @chrisjsewell) as it breaks the linear document jupyter notebook rule for the code-blocks though.

mmcky commented 4 years ago

I wonder if we could have tabbed code blocks in jupyter notebooks which had an execution rule of left to right. This would then directly translate to the notebooks format but this would essentially require a jupyter extension I think to change the base jupyter notebook format.

choldgraf commented 4 years ago

Just a note as well that we can always use jupyter-sphinx to execute and embed outputs in the built docs, and we can embed jupyter-sphinx execution inside of admonitions, tabs, etc. So you could have a MyST-NB notebook that also has jupyter-sphinx directives in it, and the {code-cell} blocks would just be executed separately from the jupyter-sphinx execution. The only challenge is that jupyter-sphinx won't have access to the same kernel that jupyter-cache uses when it executes the notebook. But this might still work for certain kinds of exercises.

mmcky commented 3 years ago

thanks @choldgraf -- that is a thought I had as well and might be the best we can do (unless jupyter-cache implements something like pre-execution filters/transforms). My concern though is that the document will necessarily need more imports, and it "looks" like execution flows through the document linearly when in fact it does not. It also complicates the production of support download notebooks.

In sphinxcontrib-jupyter the code-block (nestedness) was ignored and a code-cell block would just be written following the exercise but we can't do that here as the action takes place in jupyter-cache first.

What do you think about a jupyter-cache transform that could look for any nested code-cells and promote them to be code-cells where they occur:

  1. If the {code-cell} occurs in a markdown block it would (a) split the markdown block, (b) copy the contents of code-cell to a code-cell in the notebook json, (c) remove the {code-cell} from the text.

Given there is no real translation, just a simple parse task -- this may have some merit. Note: The only issue I see with this is that if the embedded code-cell is not at the end of the exercise block then we need to track exercise content across code-cells in the notebook for formatting.
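A rough sketch of what such a promotion step could look like, assuming a notebook is just a list of cell dicts with `cell_type` and `source` keys. The name `promote_nested_code_cells` and the regular expression are illustrative only (they are not part of jupyter-cache or MyST-NB), and directive options such as `:tags:` are left in the cell body rather than parsed.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this listing readable

# Matches a three-backtick {code-cell} block; any argument after the directive
# name (for example a language) is ignored in this sketch.
CODE_CELL = re.compile(
    r"^" + FENCE + r"\{code-cell\}[^\n]*\n(?P<body>.*?)^" + FENCE + r"[ \t]*$\n?",
    re.DOTALL | re.MULTILINE,
)


def promote_nested_code_cells(cells):
    """Return a new cell list with nested {code-cell} blocks promoted."""
    promoted = []
    for cell in cells:
        if cell["cell_type"] != "markdown":
            promoted.append(cell)
            continue
        text = cell["source"]
        position = 0
        for match in CODE_CELL.finditer(text):
            before = text[position:match.start()]
            if before.strip():
                # (a) split the markdown block at the directive
                promoted.append({"cell_type": "markdown", "source": before.strip()})
            # (b) copy the directive body into a real code cell
            promoted.append({"cell_type": "code", "source": match.group("body").strip()})
            # (c) drop the directive text itself by skipping past it
            position = match.end()
        rest = text[position:]
        if rest.strip():
            promoted.append({"cell_type": "markdown", "source": rest.strip()})
    return promoted


# Hypothetical input: a solution directive (four backticks) wrapping a code-cell.
source = "\n".join([
    "`" * 4 + "{solution} exercise-1",
    "Intro to solution",
    "",
    FENCE + "{code-cell}",
    "print(2+2)",
    FENCE,
    "`" * 4,
])
cells = promote_nested_code_cells([{"cell_type": "markdown", "source": source}])
for cell in cells:
    print(cell["cell_type"], repr(cell["source"]))
```

In this example the outer solution fence ends up split across two markdown cells, which is exactly the tracking problem noted above.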

mmcky commented 3 years ago

@choldgraf just to clarify one point on your suggestion. Would all the jupyter-sphinx directives be executed by the same independent kernel (at the file level)?

Perhaps we can introduce an {exercise-code} node that uses jupyter-sphinx under the hood

Then we can add support for hidden etc. so that you could have preamble / import blocks for all exercises that don't show in outputs. We could also support future features like exercise/solution notebooks. From a user perspective I think it would be really confusing to be able to use jupyter-sphinx but not code-cell, so perhaps exercise-code would make that relationship clearer at the expense of being a lightweight wrapper.

One con however is that these nodes require sphinx to be interpreted / executed -- so if we are to support download notebooks for websites (or binder / runnable notebooks) it will be difficult to have these nested items supported in any way in the notebook format.

choldgraf commented 3 years ago

I think exercise-code could be a good way to make it explicit. Your point about not having the same structure when you export to ipynb is precisely the challenge here, I think. Jupyter cache wants to have the exact same notebook in the cached and “source” version, which is why I believe @chrisjsewell has pushed back on requests to transform the content of a source file (like by un-nesting) in order to be executable as a notebook.

Longer term, I feel like we’re really going to need to come up with some kind of story for the non-linear (eg nested) execution workflow. It’s common and popular in R, and many services nowadays are supporting this kind of thing. I still wonder if we can accomplish this with improving the UX around glue-like functionality 🤔🤔🤔

mmcky commented 3 years ago

@choldgraf @chrisjsewell I am going to dive into this issue. I think using this as a prototype to figure out the best way to set up nested code will be a useful case study for discussing the broader issue of nested code-cells in jupyterbook. Perhaps once a prototype is up and running we can all catch up in a technical workshop style meeting to see if we like it or can think of a better solution.

I will work with @chrisjsewell's suggestion of using directives to indicate a start and stop while keeping code-cell at the root level for compatibility with myst and jupytext etc.

Such as:


```{solution-start} <exercise-name>

Content
```

The current `solution` directive includes support for: `label`, `class` and `hidden` attributes

cc: @jstac

mmcky commented 3 years ago

@AakashGfude thanks for the brainstorming today. It looks like we could support styling only through the use of visit_solution_start and visit_solution_end nodes injecting updates to html and latex but it will be difficult to support features like hidden.

I think we will need to do some parsing of the doctree to support a full suite of features.

Perhaps we should use a transform to alter the tree (early on in the parsing) to build a parent-child relationship with nodes that are contained between solution-start and solution-end. But this transform would have to occur after md -> ipynb to be fully compliant with jupytext etc.

So we need to latch in at:

  1. md -> ipynb -> jupyter-cache
  2. read ipynb sources via myst_nb
  3. apply sphinx-exercise transform to make nodes between solution-start and solution-end children of a solution node for processing by sphinx -> html, latex
  4. parse as before extracting any code-cell outputs from jupyter-cache
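
Step 3 could be sketched roughly as below. This is a minimal illustration with a stand-in `Node` class rather than real docutils nodes, and `group_gated_solutions` is a hypothetical name, not the sphinx-exercise implementation. It walks a flat list of sibling nodes and makes everything between a `solution-start` and a `solution-end` marker a child of a single `solution` node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a doctree node; real code would use docutils nodes."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def group_gated_solutions(siblings):
    """Nest nodes found between solution-start and solution-end markers."""
    grouped = []
    current = None  # the solution node currently being filled, if any
    for node in siblings:
        if node.kind == "solution-start":
            if current is not None:
                raise ValueError("solution-start found inside an open solution")
            # Carry over the target exercise and any content of the start marker.
            current = Node("solution", text=node.text, children=list(node.children))
        elif node.kind == "solution-end":
            if current is None:
                raise ValueError("solution-end found without a matching start")
            grouped.append(current)
            current = None
        elif current is not None:
            current.children.append(node)
        else:
            grouped.append(node)
    if current is not None:
        raise ValueError("solution-start is missing its solution-end")
    return grouped


# Hypothetical flat document, as it might look after reading the ipynb source.
flat = [
    Node("paragraph", "Before"),
    Node("solution-start", "exercise-1"),
    Node("paragraph", "Intro to solution"),
    Node("code-cell", "print(2+2)"),
    Node("solution-end"),
    Node("paragraph", "After"),
]
tree = group_gated_solutions(flat)
print([node.kind for node in tree])
print([child.kind for child in tree[1].children])
```

With the relationship expressed as parent and children, features like hidden could then be handled on the single `solution` node rather than on the two markers.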

@chrisjsewell @AakashGfude I think a transform is needed here to provide sphinx the relationship between the nodes. What do you think?

mmcky commented 3 years ago

Update: 27th October 2021.

I see two syntax options when building out tests for this.

Playing around with the two syntax choices, I think I am actually in favour of option 1, as I think it will be harder to miss a start and an end marker. I was initially thinking option 2 would be nice as you then only need to know one directive name, but it probably needs a lot more error checking.

Option 1:

````md
```{solution-start} exercise-1
:label: solution-1
```

```{code-cell}
import numpy as np
from scipy.interpolate import splprep, splev

import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import PathPatch

N = 400
t = np.linspace(0, 2 * np.pi, N)
r = 0.5 + np.cos(t)
x, y = r * np.cos(t), r * np.sin(t)

fig, ax = plt.subplots()
ax.plot(x, y)
plt.show()
```

```{solution-end}
```
````

Option 2:

````md
```{solution} exercise-1
:label: solution-1
:start:
```

```{code-cell}
import numpy as np
from scipy.interpolate import splprep, splev

import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import PathPatch

N = 400
t = np.linspace(0, 2 * np.pi, N)
r = 0.5 + np.cos(t)
x, y = r * np.cos(t), r * np.sin(t)

fig, ax = plt.subplots()
ax.plot(x, y)
plt.show()
```

```{solution}
:end:
```
````

akhmerov commented 3 years ago

Just checking: I hope with this change in place, the original single directive will still stay, is this correct?

mmcky commented 3 years ago

> Just checking: I hope with this change in place, the original single directive will still stay, is this correct?

thanks @akhmerov -- for sure -- not thinking we would remove the single directive exercise but rather support code-cell through a gated directive approach (as above) which can then map 1-to-1 through jupytext and get executed etc.

mmcky commented 2 years ago

I have now set up a working prototype of this idea for sphinx-exercise

https://github.com/executablebooks/sphinx-exercise/pull/45

I quite like the gated directive approach as it tends to simplify the syntax (when compared with nested structures) at the cost of searching for -start and -end components.
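
To give a feel for that search cost, here is a small sketch of a source-level check that every start marker has an end marker. The helper name `check_gates` and its messages are made up for illustration and are not taken from the prototype in the pull request; it only looks at the opening lines of `solution-start` and `solution-end` directives.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this listing readable

# Matches the opening line of a gated directive, e.g. {solution-start} exercise-1
GATE = re.compile(r"^" + FENCE + r"\{solution-(start|end)\}", re.MULTILINE)


def check_gates(text):
    """Return a list of (line_number, message) problems found in text."""
    problems = []
    open_line = None  # line of the currently open solution-start, if any
    for match in GATE.finditer(text):
        line = text.count("\n", 0, match.start()) + 1
        if match.group(1) == "start":
            if open_line is not None:
                problems.append(
                    (line, f"solution-start while the one on line {open_line} is still open")
                )
            open_line = line
        else:
            if open_line is None:
                problems.append((line, "solution-end without a matching solution-start"))
            open_line = None
    if open_line is not None:
        problems.append((open_line, "solution-start is never closed"))
    return problems


# A well-formed gated solution and one that is missing its end marker.
ok_text = (
    FENCE + "{solution-start} exercise-1\n" + FENCE + "\n\n"
    + FENCE + "{code-cell}\nprint(2+2)\n" + FENCE + "\n\n"
    + FENCE + "{solution-end}\n" + FENCE + "\n"
)
broken_text = FENCE + "{solution-start} exercise-1\n" + FENCE + "\n\nprint(2+2)\n"

print(check_gates(ok_text))
print(check_gates(broken_text))
```

A check like this is cheap because the markers sit at the root level of the document, so there is no nesting depth to track.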