jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
1.73k stars 564 forks source link

Offer api for more convenient in memory template definition &/or docs #673

Open mpacer opened 7 years ago

mpacer commented 7 years ago

After working through this process with @ian-r-rose today (& @choldgraf this summer), I've come to the conclusion that we should have a more convenient API for specifying in-memory templates.

In particular, I think the ideal API would involve the following steps:

  1. Defining a template string
  2. passing the template string to a TemplateExporter subclass.

And that's it!

That fully specifies the template the person wants and where they want it. So why should we be asking for more? Anything more complicated is offloading the burden of implementation on users. That excludes many of the people Jupyter is best poised to reach out to. (Aside: many thanks to @willingc & @yuvipanda for getting me to think along these lines!)

But before we can make that happen… we need to figure out where we stand.

Current situation: complicated, intricate, arbitrary steps

Right, now it takes multiple steps (at least 7 as laid out below), a good deal of understanding (knowledge of multiple traitlets, the traitlet types, an understanding that Jinja template loaders exist & how they work, &c.) and a number of arbitrary decisions about features in order to get an in-memory template working correctly in a custom exporter.

The full process will probably be worth documenting explicitly (I started this effort below). I'm not making a PR yet because I want more of an idea as to where this is going to go.

But in looking at this preliminary documentation, it becomes clear really quickly that this is an unnecessarily complicated set of choices for the simplest possible version of this task.

What's great though is I think we can reduce this complexity substantially. I'll give it a try tomorrow.

How to create and assign an in-memory template (as of 5.3.1)

  1. Create in-memory template text e.g.,

    my_template_text = "{%- extends 'rst.tpl' -%}"
  2. Decide on string for template name, a pseudo file-path (possibly with or without an extension)

    my_template_name_no_ext = "my_template_name"
    my_template_name =  my_template_name_no_ext + ".tpl"
  3. wrap template name and text in dictionary, the key is the name to the text

    my_template_dict  = {my_template_name: my_template_text}
  4. Explicitly load DictLoader from jinja2

    from jinja2 import DictLoader

    NB: Knowing to import this requires knowing:

    1. nbconvert uses jinja2 for its templates
    2. jinja2 uses template loaders to add templates to its environment
    3. nbconvert explicitly requires you to create the jinja2 environment by expressing its loaders.
    4. DictLoader exists.
    5. DictLoader is what you probably want for this case.
      1. DictLoader will create a pseudo filename in the environment so that it based on the keys in the dict passed.

    NB: Writing a template – especially if you're just taking existing templates and modifying them without taking the time to fully understanding them – does not require you to know any of those things. Given how complicated nbconvert is (even for me), I feel like the less we ask people to understand in order to use it, the more happy & capable everyone involved will feel and be.

  5. create a jinja loader instance that can load the template via jinja2.DictLoader

    my_template_loader = DictLoader(my_template_dict)
  6. Assign template name with or without extension to TemplateExporter.template_file This can be achieved in multiple ways.

    1. Assign as a class attribute directly (inspiration)
      class MyExporter(nbconvert.RSTExporter):
          template_file = my_template_name
          # or template_file = my_template_name_no_ext 

      I will note that I'm not going to include the option of my_template_name_no_ext anymore in these examples… but we should probably drive that home if it does come down to docs.

    2. as part of the __init__ (making it always part of MyExporter, where it doesn't need to be passed in) inspiration
      class MyExporter(nbconvert.RSTExporter):
             def __init__(self, *args, **kwargs):
                 super(MyExporter, self).__init__(template_file=my_template_name)
    3. as part of the constructor call inside of a script (i.e., without explicitly tying the exporter to your template file, it just happens to get it on this particular instance)
      #Either, directly
      MyExporter(template_file = my_template_name)
      # or as a config option
      MyExporter(config = {
             "TemplateExporter":{
                   "template_file": my_template_name}
             }
      })
  7. Add my_template_loader to the List TemplateExporter.extra_loaders This can be achieved in multiple ways (similar to 6 above…)

    1. Assign as a class attribute directly
      class MyExporter(nbconvert.RSTExporter):
          extra_loaders = [my_template_loader]
    2. as part of the __init__
      class MyExporter(nbconvert.RSTExporter):
             def __init__(self, *args, **kwargs):
                 super(Exporter, self).__init__(extra_loaders=[my_template_loader])
    3. as part of the constructor call inside of a script
      #Either, directly
      MyExporter(extra_loaders = [my_template_loader])
      # or as a config option
      MyExporter(config = {
             "TemplateExporter":{
                   "template_loader": [my_template_loader]}
             }
      })

cc @takluyver @Carreau

Carreau commented 7 years ago

Yes, the way to have an in-memory template is complicated, it was not a question of wanting to make that complicated, it was mostly fixing currently pressing issues the right way. The rest is incidental complexity of using multiple libraries.

And yes it would be nice to be able to make it simpler. It mostly a question of finding the time to write the APIs to do so.

takluyver commented 7 years ago

I agree that this should be simpler. I think the main challenge is designing an API that will interact nicely with all the many options template exporters have. I've got an idea about this - I'll think it through in a bit more detail and make a PR if I think it's going to work.

mpacer commented 7 years ago

I have an idea on how to implement it as well but go ahead and take a stab at it.

Could you say more about what the challenges are? Because i wasn't seeing too many challenges to the way I was thinking about… but I was limiting the case to "here pass in a raw string to this exporter attribute that specifies the whole template, now you are done" you can't usefully specify other details (template name, file extension, etc.) because for this use case they don't matter. What am I missing?

On Wed, Sep 13, 2017 at 08:20 Thomas Kluyver notifications@github.com wrote:

I agree that this should be simpler. I think the main challenge is designing an API that will interact nicely with all the many options template exporters have. I've got an idea about this - I'll think it through in a bit more detail and make a PR if I think it's going to work.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jupyter/nbconvert/issues/673#issuecomment-329202210, or mute the thread https://github.com/notifications/unsubscribe-auth/ACXg6CHjePnEZG1lvpy7p5nPlrt520o_ks5sh_KvgaJpZM4PVgzP .

takluyver commented 7 years ago

Stuff like:

I haven't got a specific example of a problematic case, but I know this code is already a bit tangled, and traitlets tend to make this kind of thing tricky, because they make it easy to casually mix up inputs and computed values.

mpacer commented 7 years ago

Ah I was going to make the assumption that that switching by default wouldn't exist and if someone wanted to implement it they could subclass. Otherwise specifying a raw_template would override any file based templates. I was also considering not making it a traitlet to avoid these issues

On Wed, Sep 13, 2017 at 10:36 Thomas Kluyver notifications@github.com wrote:

Stuff like:

  • If the user tries to switch an exporter instance from an in-memory template and a file template or vice versa? Can it get into some half-way state that won't work? This may not be good practice, but someone's going to want to do it, so we should either make it work or make sure it fails early with a clear error.
  • Does it work for both setting the template of an exporter instance, and defining a new subclass to use a template?

I haven't got a specific example of a problematic case, but I know this code is already a bit tangled, and traitlets tend to make this kind of thing tricky, because they make it easy to casually mix up inputs and computed values.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jupyter/nbconvert/issues/673#issuecomment-329241670, or mute the thread https://github.com/notifications/unsubscribe-auth/ACXg6CSCMyByI2ms60pp1i32smPVmKPDks5siBKBgaJpZM4PVgzP .