Added boolean function argument for conditional execution

ezander commented 7 years ago

Sometimes I just want to include snippets of code from my notebooks, but not all the code in there that generates plots, tables and so on. So I added a boolean predicate function, execute_cell, to the interface, which can be used like this:

   nbinclude("AverageFitting.ipynb"; execute_cell=(cell,counter)->(counter∈[2,3,5]))

Of course, that could have been done differently (e.g. only specifying cell numbers, or splitting the function nbinclude into one for reading notebooks and one for executing notebook cells), but I thought this way it's quite flexible and requires the least amount of changes. What do you think?
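
A minimal sketch of how a predicate on (cell, counter) could select cells. It does not use NBInclude itself: select_sources and the sample cells are hypothetical, and treating the counter as a running count of code cells is an assumption of this sketch.

```julia
# Hypothetical stand-in for parsed notebook cells: each cell is a Dict,
# loosely mirroring the .ipynb JSON layout.
cells = [
    Dict("cell_type" => "code", "source" => "using Statistics"),
    Dict("cell_type" => "code", "source" => "fit(x) = mean(x)"),
    Dict("cell_type" => "markdown", "source" => "# Plots"),
    Dict("cell_type" => "code", "source" => "y = fit([1, 2, 3])"),
]

# select_sources is an illustrative helper, not part of NBInclude.
# Assumption: the counter is incremented once per code cell.
function select_sources(cells; execute_cell = (cell, counter) -> true)
    selected = String[]
    counter = 0
    for cell in cells
        cell["cell_type"] == "code" || continue   # skip markdown cells
        counter += 1
        execute_cell(cell, counter) && push!(selected, cell["source"])
    end
    return selected
end

# Keep only the second and third code cells.
picked = select_sources(cells; execute_cell = (cell, counter) -> counter ∈ [2, 3])
```

Because the predicate receives the whole cell as well as the counter, the same call site could later switch to a check on the cell's contents without changing the helper.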

stevengj commented 7 years ago

The problem is that this isn't very robust; if you edit the notebook, it is quite likely you would have to change the cell numbers.

I'd much prefer to do this via jupyter/notebook#601

ezander commented 7 years ago

Maybe. That's why I pass both the cell and the counter to the predicate function. If you want to use notebook tags (assuming tags are eventually implemented in Jupyter), you could pass something like (cell, counter)->("IncludeMe" in cell[:tags]) instead. However, since tags are currently not implemented in Jupyter notebooks, you could start with the counter now and switch to something tag-based later. And, by the way, I think this whole nbinclude thing is not really about robustness. If I wanted something robust, I would turn the code in the notebook into a module. In my view, this is rather for quick hacks. And one more thing: if tags alone decide what is included, then only the notebook author (who may not be the person using nbinclude) has control over what gets included, whereas the other way round, the 'nbincluding' code has control, which is preferable IMHO.

stevengj commented 7 years ago

For quick hacks, a predicate function seems overly complicated. How about nbinclude(filename; cells=[1,3,7], counters=[2,4,5]), where it includes the cell if cell in cells || counter in counters?

ezander commented 7 years ago

The counter in counters thing is ok, but how is cell in cells supposed to work? I mean, cell is usually a dictionary, and I don't see how you would get around a predicate function here.

The possibilities I currently see and their pros and cons are:

Only use the counter: nbinclude(filename; counters=...), easy, but very inflexible
Use the counter and a predicate on the cell contents: nbinclude(filename; counters=..., cell_predicate), a bit easier than the proposed approach, but not much, and less flexible
Use counters, tags and maybe a regexp on the cell source: nbinclude(filename; counters=..., cell_tags=..., cell_regexp), maybe a bit easier to use?
Use a predicate on counter and cell as proposed...
Split parsing notebook content and executing it: e.g. cells = nbparse(filename); do_some_processing!(cells); nbexecute(cells), most flexible approach, but not very user-friendly. Could be combined, however, with one of the preceding approaches.
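
A rough sketch of the last option (parse, process, execute as separate steps). Every name ending in _sketch is hypothetical, and parsing is faked with an in-memory vector so the example has no dependencies; a real version would read the notebook's JSON instead.

```julia
# Hypothetical "parse" step: returns the code-cell sources in order.
nbparse_sketch() = [
    "a = 1 + 1",
    "println(\"plot would go here\")",
    "b = a * 10",
]

# User-defined processing step: drop cells that only produce output.
drop_output_cells!(cells) = filter!(src -> !occursin("println", src), cells)

# Hypothetical "execute" step: evaluate the remaining sources, in order,
# inside a scratch module so nothing leaks into Main.
function nbexecute_sketch(cells)
    sandbox = Module(:NotebookSandbox)
    for src in cells
        include_string(sandbox, src)
    end
    return sandbox
end

cells = nbparse_sketch()
drop_output_cells!(cells)
sandbox = nbexecute_sketch(cells)
```

The flexibility comes from the middle step being ordinary user code; the cost is that the caller has to write three calls instead of one.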

stevengj commented 7 years ago

By "cell", I was just thinking of the cell counter, as opposed to a sequential count of cells. But you actually want a predicate function on the cell contents?

I would be in favor of just using the counter, plus maybe a regex. For quick hacks, this is flexible enough, and as discussed above I only see this as something for quick hacks.

stevengj commented 7 years ago

If you want to do arbitrary processing on the cell contents yourself, I think you should just read in the JSON file directly.

ezander commented 7 years ago

What do you think about that? I'd be fine with that, too (without any predicate). I would even remove cellnums, I guess. The counters and regex should be enough.

ezander commented 7 years ago

Ok. Good point. I changed the default to 1:typemax(Int) for the counters, r"" for the regexp and removed the cellnums altogether (since they're useless in my view, anyway). Makes the check a lot easier.
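
A sketch of what the simplified check could look like with these defaults. include_cell is a hypothetical helper, not the code in this PR, and it assumes the two conditions are combined with &&. It uses occursin, the Julia 1.x spelling; Julia versions from the time of this thread used ismatch.

```julia
# Hypothetical helper. Defaults follow the comment above: every counter is
# accepted, and the empty regex matches any source text.
function include_cell(counter::Integer, source::AbstractString;
                      counters = 1:typemax(Int), regex::Regex = r"")
    return counter in counters && occursin(regex, source)
end

all_defaults = include_cell(7, "plot(x)")                                  # nothing is filtered
by_counter   = include_cell(7, "plot(x)"; counters = 1:3)                  # counter out of range
by_regex     = include_cell(2, "plot(x)"; counters = 1:3, regex = r"#bar") # source lacks #bar
```

With these defaults, a call that passes neither keyword includes every cell, so no special case for "no filter given" is needed.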

habemus-papadum commented 7 years ago

Are you sure limiting to a regex is right? It is very hard to compose regexes, but composing predicates is very easy to do and read.
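
To illustrate the point about composition, here is a small sketch of predicates on a cell's source text being combined. All names and the #solver, #dsl and #viz markers are made up for illustration; none of them come from NBInclude.

```julia
# Hypothetical building blocks: each returns a predicate on the cell source.
is_tagged(tag) = src -> occursin(tag, src)
both(p, q)     = src -> p(src) && q(src)
either(p, q)   = src -> p(src) || q(src)
negate(p)      = src -> !p(src)

# "Solver or DSL cells, but never visualization cells."
wanted = both(either(is_tagged("#solver"), is_tagged("#dsl")),
              negate(is_tagged("#viz")))

keep_solver = wanted("#solver\nstep!(state)")
drop_viz    = wanted("#dsl #viz\nplot(model)")
drop_plain  = wanted("x = 1")
```

Expressing the same selection as a single regex is possible, but it needs lookaheads and is harder to read.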

stevengj commented 7 years ago

@lilinjn, if you are composing regexes or need complicated composed predicates, I think you need to re-organize your notebook...

habemus-papadum commented 7 years ago

No, that's not quite true.

My motivation is pedagogical, and I want to build notebooks step by step that on the one hand are fully self-contained and functional, and on the other hand, allow various pieces to be reused, replaced or ignored by downstream components.

For instance, notebook1 might develop a dsl for modeling a physical system, and include a toy solver, and visualizations so students have something to hold onto. nb2 might include the dsl and solver from nb1, but ignore the visualization cells, and aim to teach creating a more performant solver. nb3 might include the solver from nb2, ignore benchmarking cells, but aim to create a better dsl (see footnote).

The point is that to build a complicated structure or machine, it's useful to find a flow that uses scaffolding and jigs to guide the process so that at any given point the global structure is visible even if not in full detail (as opposed to, say, a common flow in math texts that start out with a torrent of definitions, which also has its own merits).

footnote: more fundamental than predicate vs regex, is the actual mechanism to specify the user's preference -- nbinclude could allow easy customization on one hand, but also allow customizations at the toplevel nbinclude to override or augment the customizations of nested nbincludes, but the current solution does not allow for this. This gets complicated quickly, and might be more complicated than you would like to support, so no worries.

But anyway, that is ultimately what is driving my interest in this functionality. (note to self: learn to stop bristling at simple comments. One day, maybe) cheers, nehal

stevengj commented 7 years ago

Even in your example, it's not clear that you need more than a single #noinclude tag that you regex for. nb1 would use #noinclude on the visualization and nb2 would tag the benchmarks.
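
A sketch of how a single #noinclude marker could be handled with a regex, assuming a cell is included when the regex matches its source. Under that assumption, excluding marked cells needs a negative lookahead; the variable name and sample sources are illustrative.

```julia
# Matches only sources that do NOT contain "#noinclude" anywhere.
# The s flag lets . match newlines, so the lookahead scans the whole cell.
skip_noinclude = r"^(?!.*#noinclude)"s

plain_cell  = occursin(skip_noinclude, "solve(model)")
marked_cell = occursin(skip_noinclude, "using Plots\n#noinclude\nplot(sol)")
```

If cells were instead selected by a positive marker such as #include, a plain r"#include" would do and no lookahead would be needed.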

habemus-papadum commented 7 years ago

Fair enough, this would be much clearer if I provided a concrete working example, but that leads to a bit of a chicken-and-egg dilemma. Intuitively it feels clear to me that 1) organizing source text in chunks, and 2) modifying include to dynamically weave those cells together arbitrarily would lead to an immensely powerful approach to exploring (and teaching) difficult problems. A dynamic cweb, if you will. Anyway, at this point I have a clearer vision of what I am pursuing, so I very much appreciate your comments. cheers, nehal

ezander commented 7 years ago

Ok, I kept the &&, and the docstring now also says so. (BTW, I'm not quite happy with the formulation. Maybe you can say it in a better way?) And if someone wants ||, it's still possible to do something like

nbinclude("foo.ipynb", counters=1:3)
nbinclude("foo.ipynb", counters=4:typemax(Int), regex=r"#bar")

Not really nice, but it works...
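
A small check of why the two calls behave like an ||, using made-up cell sources and a hypothetical keep helper that applies the && rule. It only models which counters get selected, not NBInclude's actual execution.

```julia
# Hypothetical cell sources, indexed by counter.
sources = ["a = 1", "b = 2", "c = 3", "d = 4  #bar", "e = 5"]

# Counters selected by one call under the && rule.
keep(counters, regex) =
    [i for (i, src) in enumerate(sources) if i in counters && occursin(regex, src)]

# Two && calls in sequence, as in the workaround above.
two_calls = vcat(keep(1:3, r""), keep(4:typemax(Int), r"#bar"))

# A single pass with || semantics.
one_or = [i for (i, src) in enumerate(sources) if i in 1:3 || occursin(r"#bar", src)]
```

The two results agree, and the cells still run in notebook order, because the counter ranges split the notebook at a single point. With interleaved ranges, two separate calls would change the execution order.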

JuliaInterop / NBInclude.jl

Added boolean function argument for conditional execution #2