Closed ezander closed 7 years ago
The problem is that this isn't very robust; if you edit the notebook, it is quite likely you would have to change the cell numbers.
I'd much prefer to do this via jupyter/notebook#601
Maybe. That's why I pass both the cell and the counter to the predicate function. If you want to use the notebook tags (assuming tags are finally implemented in jupyter) you could also pass something like (cell, counter)->("IncludeMe" in cell[:tags]), too.
However, since tags are currently not implemented in the jupyter notebooks, you could now start with the counter, and later switch to something tag based.
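To make the idea concrete, here is a minimal sketch of a predicate that receives both the cell and its counter. Everything in it is an assumption for illustration: the cells are plain dictionaries with string keys and a flat "tags" entry, and select_cells is a hypothetical helper, not part of nbinclude.

```julia
# Hypothetical stand-in for parsed notebook cells; real .ipynb cells carry more fields.
cells = [
    Dict("source" => "x = 1",     "execution_count" => 1, "tags" => ["IncludeMe"]),
    Dict("source" => "plot(x)",   "execution_count" => 2, "tags" => String[]),
    Dict("source" => "y = x + 1", "execution_count" => 3, "tags" => ["IncludeMe"]),
]

# Hypothetical helper: keep the cells for which pred(cell, counter) is true.
function select_cells(pred, cells)
    return [c for c in cells if pred(c, c["execution_count"])]
end

# Counter-based selection, which works without any tag support.
by_counter = select_cells((cell, counter) -> counter in 1:2, cells)

# Tag-based selection, which could replace the counter version later.
by_tag = select_cells((cell, counter) -> "IncludeMe" in cell["tags"], cells)
```

Because the predicate always gets both arguments, switching from counters to tags only changes the anonymous function, not the calling code.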
And, by the way, I think this whole nbinclude thing is not really about robustness. If I want to have something robust, I would turn the code in the notebook into a module. In my view, this is rather for some quick hacks.
And one more thing: if you only decide via the tags what is included, then only the notebook author (who may not be identical to the one using nbinclude) has control over what gets included, while doing it the other way, the 'nbincluding' code has control, which is preferable IMHO.
For quick hacks, a predicate function seems overly complicated. How about nbinclude(filename; cells=[1,3,7], counters=[2,4,5]), where it includes the cell if cell in cells || counter in counters?
The counter in counters thing is ok, but how is cell in cells supposed to work? I mean, cell is usually a dictionary and I don't see how you would get around a predicate function here.
The possibilities I currently see and their pros and cons are:
- nbinclude(filename; counters=...), easy, but very inflexible
- nbinclude(filename; counters=..., cell_predicate), a bit easier than the proposed approach, but not much, and less flexible
- nbinclude(filename; counters=..., cell_tags=..., cell_regexp), maybe a bit easier to use?
- cells = nbparse(filename); do_some_processing!(cells); nbexecute(cells), most flexible approach, but not very user-friendly. Could be combined, however, with one of the preceding approaches.

By "cell", I was just thinking of the cell counter, as opposed to a sequential count of cells. But you actually want a predicate function on the cell contents?
I would be in favor of just using the counter, plus maybe a regex. For quick hacks, this is flexible enough, and as discussed above I only see this as something for quick hacks.
If you want to do arbitrary processing on the cell contents yourself, I think you should just read in the JSON file directly.
What do you think about that? I'd be fine with that, too (without any predicate). I would even remove cellnums, I guess. The counters and regex should be enough.
Ok. Good point. I changed the default to 1:typemax(Int) for the counters, r"" for the regexp and removed the cellnums altogether (since they're useless in my view, anyway). Makes the check a lot easier.
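As a rough illustration of why these defaults simplify the check: with counters = 1:typemax(Int) and regex = r"", both conditions are true for every cell, so a single && test covers all cases. The function include_cell below is a hypothetical sketch, not the actual code from the pull request.

```julia
# Hypothetical check: a cell is included only if its counter is in `counters`
# AND its source text matches `regex`.
function include_cell(counter::Integer, source::AbstractString;
                      counters = 1:typemax(Int), regex::Regex = r"")
    return counter in counters && occursin(regex, source)
end

all_defaults  = include_cell(5, "x = 1")                    # defaults accept any cell
wrong_counter = include_cell(5, "x = 1"; counters = 1:3)    # counter outside the range
wrong_text    = include_cell(2, "x = 1"; counters = 1:3, regex = r"#bar")
both_match    = include_cell(2, "x = 1 #bar"; counters = 1:3, regex = r"#bar")
```

The empty regex matches any string, and 1:typemax(Int) contains every positive counter, so no special "unset" case is needed.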
Are you sure limiting to a regex is right? It is very hard to compose regexes, but composing predicates is very easy to do and read.
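A small sketch of what composing predicates could look like. The combinators and_, or_ and not_ and the (source, counter) signature are assumptions made for this example, not an existing API.

```julia
# Hypothetical combinators over predicates of the form (source, counter) -> Bool.
and_(p, q) = (args...) -> p(args...) && q(args...)
or_(p, q)  = (args...) -> p(args...) || q(args...)
not_(p)    = (args...) -> !p(args...)

# Two simple building blocks.
early   = (source, counter) -> counter in 1:3
is_plot = (source, counter) -> occursin(r"plot", source)

# "Early cells that do not plot", built from the pieces above.
wanted = and_(early, not_(is_plot))

keeps_code    = wanted("x = 1", 2)
skips_plot    = wanted("plot(x)", 2)
skips_too_late = wanted("x = 1", 7)
```

Expressing "in 1:3 and not a plotting cell" as one regex over the source text is not possible at all, since the counter is not part of the text.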
@lilinjn, if you are composing regexes or need complicated composed predicates, I think you need to re-organize your notebook...
No, that's not quite true.
My motivation is pedagogical, and I want to build notebooks step by step that on the one hand are fully self-contained and functional, and on the other hand, allow various pieces to be reused, replaced or ignored by downstream components.
For instance, notebook1 might develop a dsl for modeling a physical system, and include a toy solver, and visualizations so students have something to hold onto. nb2 might include the dsl and solver from nb1, but ignore the visualization cells, and aim to teach creating a more performant solver. nb3 might include the solver from nb2, ignore benchmarking cells, but aim to create a better dsl (see footnote).
The point is that to build a complicated structure or machine, it's useful to find a flow that uses scaffolding and jigs to guide the process so that at any given point the global structure is visible even if not in full detail (as opposed to say a common flow in math texts that start out with a torrent of definitions, which also has its own merits)
footnote: more fundamental than predicate vs regex, is the actual mechanism to specify the user's preference -- nbinclude could allow easy customization on one hand, but also allow customizations at the toplevel nbinclude to override or augment the customizations of nested nbincludes, but the current solution does not allow for this. This gets complicated quickly, and might be more complicated than you would like to support, so no worries.
But anyway, that is ultimately what is driving my interest in this functionality. (note to self: learn to stop bristling at simple comments. One day, maybe) cheers, nehal
Even in your example, it's not clear that you need more than a single #noinclude tag that you regex for. nb1 would use #noinclude on the visualization and nb2 would tag the benchmarks.
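One way such a marker could be used with an "include if the regex matches" rule is a negative lookahead, so that cells containing #noinclude fail to match. This is only a sketch under that assumption; the cell sources below are made up, and the thread does not say which pattern was intended.

```julia
# Made-up cell sources; two of them carry the #noinclude marker.
sources = [
    "solve(model)",
    "#noinclude\nplot(result)",
    "x = 1  #noinclude",
]

# Matches only text that does NOT contain "#noinclude" anywhere.
# The `s` flag lets `.` cross newlines, so multi-line cells are covered too.
keep = r"^(?!.*#noinclude)"s

included = filter(src -> occursin(keep, src), sources)
```

Here only the first source survives the filter; the other two are dropped because the marker appears somewhere in their text.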
fair enough, this would be much more clear if I provided a concrete working example, but that leads to a bit of a chicken and egg dilemma. intuitively it feels clear to me that 1) organizing source text in chunks, and 2) modifying include to dynamically weave those cells together arbitrarily would lead to an immensely powerful approach to exploring (and teaching) difficult problems. A dynamic cweb if you will. Anyway at this point I have a clearer vision of what I am pursuing, so I very much appreciate your comments. cheers, nehal
Ok, I kept the &&, and the docstring now also says so. (BTW, I'm not quite happy with the formulation. Maybe you can say it in a better way?)
And if someone wants ||, it's still possible to do something like
nbinclude("foo.ipynb", counters=1:3)
nbinclude("foo.ipynb", counters=4:typemax(Int), regex=r"#bar")
Not really nice, but it works...
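The following sketch checks, on made-up data, that two && passes of this shape select the same cells as a single "counter in 1:3 || regex matches" rule. The selected function and the (counter, source) tuples are hypothetical stand-ins for what nbinclude does internally.

```julia
# Made-up cells as (counter, source) pairs.
cells = [(1, "a = 1"), (2, "b = 2 #bar"), (4, "c = 3 #bar"), (5, "d = 4")]

# Hypothetical selection with the && semantics described above.
function selected(cells; counters = 1:typemax(Int), regex = r"")
    return [c for (c, s) in cells if c in counters && occursin(regex, s)]
end

# Two passes: everything in 1:3, then regex matches among the remaining counters.
first_pass  = selected(cells; counters = 1:3)
second_pass = selected(cells; counters = 4:typemax(Int), regex = r"#bar")
combined    = vcat(first_pass, second_pass)

# The same selection written directly with ||.
direct = [c for (c, s) in cells if c in 1:3 || occursin(r"#bar", s)]
```

Splitting the counter range at 3 is what prevents a cell such as counter 2, which satisfies both conditions, from being included twice.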
Sometimes I just want to include snippets of code from my notebooks, but not all the code in there that generates plots, tables and so on. So, I added a boolean predicate function execute_cells to the interface, that can be used like this:

Of course, that could have been done differently (e.g. only specifying cell numbers, or splitting the function nbinclude into one for reading notebooks and one for executing notebook cells), but I thought this way it's quite flexible and requires the least amount of changes. What do you think?