jupyter / notebook

Jupyter Interactive Notebook
https://jupyter-notebook.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Exporting a single jupyter cell output #3039

Open toqduj opened 6 years ago

toqduj commented 6 years ago

I checked the current issues and could not find a similar request:

We would very much like to be able to export the output from a highlighted cell to PDF or PNG or anything, really. Apparently this is possible in Mathematica, but not yet in Jupyter Notebook.

The background comes from this: We're able to style pandas DataFrames to show them just the way we need for our certification reports. However, after that, there is no "pyplot.savefig"-like function to store the styled output in a format that would allow us to include it in other documents.

Googling for solutions to such issues brings up a host of Stack Overflow semi-answers that offer only a very convoluted way to recreate / rebuild the table somewhere else (using plotly or matplotlib tables). That's a big waste of the effort put into the pandas styler.

Since Jupyter already allows you to export the entire notebook, is it possible to implement this on a per-cell basis as well, preferably only the cell output?

jcb91 commented 6 years ago

To clarify a little, what actually is the output you're seeing? I guess whatever the pyplot renderer provides? Could you provide an example notebook (e.g. as a gist) that produces the kind of output you'd like to be able to save?

dsblank commented 6 years ago

Not sure you could have a general "save-the-output-of-the-cell-to-a-file" magic, but specifically you can save any matplotlib plot to a file like this:

        plt.savefig("figure.png")

-Doug


takluyver commented 6 years ago

You should be able to do something like this:

data, metadata = get_ipython().display_formatter.format(obj)
with open('table.html', 'w') as f:
    f.write(data['text/html'])  # Assuming the object has an HTML representation

That could be wrapped up in a little utility function.

For a pandas DataFrame, it's going to wind up calling df.to_html() - you can call that directly to have more control over the options, but the snippet above should work for anything that defines an HTML representation.
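
A minimal sketch of such a utility function follows. To stay runnable outside a notebook, it calls the object's `_repr_html_()` method directly instead of going through `get_ipython().display_formatter`; that substitution, the name `save_html_repr`, and the `DemoTable` class are assumptions made for illustration, not part of IPython or pandas.

```python
from pathlib import Path


def save_html_repr(obj, path):
    """Write obj's HTML representation to path and return the HTML string.

    Assumption: obj follows the IPython rich-display convention of
    exposing a _repr_html_() method (hypothetical helper, not an IPython API).
    """
    repr_html = getattr(obj, "_repr_html_", None)
    html = repr_html() if callable(repr_html) else None
    if html is None:
        # Either the method is missing or it declined to produce HTML.
        raise TypeError(f"{type(obj).__name__} has no HTML representation")
    Path(path).write_text(html, encoding="utf-8")
    return html


class DemoTable:
    """Hypothetical stand-in for a DataFrame, used only for the demo."""

    def _repr_html_(self):
        return "<table><tr><td>3.14e+00</td></tr></table>"
```

In a notebook this would be called as `save_html_repr(df, "table.html")`; objects without an HTML representation raise a `TypeError` rather than writing an empty file.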

toqduj commented 6 years ago

Good point. Hang on while I fish out an example... Take, for example, this DataFrame with styled formatting:

import numpy as np
import pandas

# Raw strings keep the LaTeX backslashes (\sigma, \phi) from being read as escapes.
# Note: 2*[...] and 3*[...] repeat the list rather than scale it; only the first
# three items of each list are used below.
inputDict = {
    r"$D_n$" : [np.pi, np.sqrt(np.pi), 2 * np.sqrt(np.pi)],
    r"$\sigma_n$" : 2*[np.pi, np.sqrt(np.pi), 2 * np.sqrt(np.pi)],
    r"$\phi$" : 3*[np.pi, np.sqrt(np.pi), 2 * np.sqrt(np.pi)]
}

rows = []  # one dict per parameter, built from the entries of inputDict

for key, item in inputDict.items():
    rows.append({
        "parameter" : key,
        "mean" : item[0],
        "standard deviation" : item[1],
        "expanded standard deviation" : item[2],
        "relative expanded standard deviation" : item[2] / item[0]
    })

# DataFrame.append was removed in pandas 2.0, so build the frame in one step.
HStest = pandas.DataFrame(rows)
HStest.set_index("parameter", inplace = True)

hs = HStest.loc[:,[  # show the content in this order:
    "mean", 
    "standard deviation", 
    "expanded standard deviation", 
    "relative expanded standard deviation"]].style.format(
    {"mean": '{:.2e}', # use this style for these columns:
     "standard deviation": '{:.2e}', 
     "expanded standard deviation": '{:.2e}', 
     "relative expanded standard deviation": "{:.2%}"})

If I display hs, I see the following nicely formatted table:

screen shot 2017-11-14 at 15 11 19

However, when I export the HTML code that comes out of hs.render(), which is what looks pretty in jupyter notebook, and load that in a separate tab, I only see the following:

screen shot 2017-11-14 at 15 13 32

Gone are the nice style sheet, the rendering of the LaTeX labels, etc. I only see a bare, unstyled table with some cruft at the top. There seems to be no way of exporting the look of the table in the Jupyter Notebook together with the table.

takluyver commented 6 years ago

The notebook has some default CSS that applies to your output - the bits relating to tables are here: https://github.com/jupyter/notebook/blob/fb4af909a1b5c44e2dad129a5d4834b33f9e6a4b/notebook/static/notebook/less/renderedhtml.less#L77

The LaTeX labels are rendered by a JavaScript library called MathJax.

You could probably put the HTML output into nbconvert's HTML template, which loads the notebook's CSS and the MathJax library, to make something like the output in the notebook.
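
As a rough sketch of that idea without nbconvert, the fragment can be wrapped in a standalone page that loads MathJax and carries a few table rules. The CSS here is only an approximation written for this example (it is not the notebook's actual stylesheet), the CDN URL is one common way to load MathJax 3, and `wrap_fragment` is a hypothetical helper name.

```python
# Assumption: MathJax 3 from the jsDelivr CDN; the notebook ships its own copy.
MATHJAX_URL = "https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"

# Raw string so the JavaScript backslashes reach the page unchanged.
PAGE = r"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
/* Assumption: a rough approximation of notebook table styling. */
table { border-collapse: collapse; font-family: sans-serif; font-size: 12px; }
th, td { padding: 0.5em; text-align: right; }
tbody tr:nth-child(odd) { background: #f5f5f5; }
</style>
<script>
// MathJax 3 does not treat $...$ as inline math unless told to.
window.MathJax = {tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]}};
</script>
<script async src="@@MATHJAX@@"></script>
</head>
<body>
@@BODY@@
</body>
</html>
"""


def wrap_fragment(fragment: str) -> str:
    """Return a complete HTML page containing the given HTML fragment."""
    return PAGE.replace("@@MATHJAX@@", MATHJAX_URL).replace("@@BODY@@", fragment)
```

Writing `wrap_fragment(html)` to a file and opening it in a browser should give a styled table with rendered labels, though it will not match the notebook pixel for pixel.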

toqduj commented 6 years ago

@takluyver So what you're saying is: "yes, there is no easy way to export the output of a single cell" :).. However, what you mentioned might be implemented as a strategy to do just that if nbconvert is modified to allow the specification of output cells?

takluyver commented 6 years ago

Yup. I don't think it even needs much modification of nbconvert: you could probably feed it a notebook with one cell which has one output, and use some recently added options to tell it to hide the input part of the cell.
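
One way to prepare such a one-cell notebook is sketched below. It works on the notebook's JSON directly with the standard library; the function names are hypothetical, and the code assumes the nbformat 4 layout with a top-level `cells` list.

```python
import copy
import json


def single_output_notebook(nb: dict, cell_index: int) -> dict:
    """Return a copy of notebook dict nb holding only the chosen code cell."""
    cell = nb["cells"][cell_index]
    if cell.get("cell_type") != "code":
        raise ValueError("only code cells have outputs")
    # Keep metadata, nbformat and nbformat_minor; replace the cell list.
    new_nb = {key: copy.deepcopy(value) for key, value in nb.items() if key != "cells"}
    new_nb["cells"] = [copy.deepcopy(cell)]
    return new_nb


def write_single_output_notebook(src: str, dst: str, cell_index: int) -> None:
    """Read the notebook at src and write a one-cell notebook to dst."""
    with open(src, encoding="utf-8") as f:
        nb = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(single_output_notebook(nb, cell_index), f, indent=1)
```

The result could then be converted with something like `jupyter nbconvert --to html --no-input one_cell.ipynb`, assuming `--no-input` is among the options meant above.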

mpacer commented 6 years ago

In theory one could use tags as an inverse from the current tag removal rules and only include those cells that are tagged. This would mean a new preprocessor, but that's not too much.

@toqduj if you're interested, I'm going to be on vacation for a while, but you can look at https://github.com/jupyter/nbconvert/blob/master/nbconvert/preprocessors/tagremove.py to get an idea of what would need to be implemented.

Then it would be a matter of tagging the cells you want and exporting using an appropriate config file.
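
The inverse rule could look roughly like this. This is a plain function over the notebook JSON rather than an actual nbconvert preprocessor, and both the function name and the example tag are hypothetical; the only thing it relies on is that cell tags are stored under `metadata.tags`.

```python
import copy


def keep_tagged_cells(nb: dict, tag: str) -> dict:
    """Return a copy of notebook dict nb containing only cells tagged with tag."""
    new_nb = copy.deepcopy(nb)
    new_nb["cells"] = [
        cell
        for cell in new_nb["cells"]
        # Cells without metadata or without tags are dropped.
        if tag in cell.get("metadata", {}).get("tags", [])
    ]
    return new_nb
```

A real preprocessor would apply the same test inside its `preprocess` method and read the tag names from configuration.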

If you want multiple individual cells exported as separate files, you could also make it more complicated and instead of doing it as a single pass operation, create a separate document for each of your tags that you specify as being included. If you wanted only individual cells for this model that would require using a unique tag for each… but that's not impossible. This would probably be implemented as a separate exporter and not included in nbconvert core (though I could be persuaded otherwise). I would use the model we have for handling external output files in the notebook by writing a zip. That would mean that it would work with the eventual nbconvert service that I'm working on (which communicates RESTfully so it can only return a single file).

There's also another approach (that works but isn't general enough for nbconvert AFAICT) where you use one tag, and each cell with that tag is exported to a separate document. This would be implemented at the exporter level.
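
A sketch of that one-tag variant, combined with the zip idea from the previous paragraph, is shown here. It is not an nbconvert exporter; the function name and the `cell_NNN.ipynb` naming scheme are assumptions, and each archive member is still a notebook that would need converting afterwards.

```python
import copy
import io
import json
import zipfile


def export_tagged_cells_to_zip(nb: dict, tag: str) -> bytes:
    """Bundle one single-cell notebook per cell tagged with tag into a zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, cell in enumerate(nb["cells"]):
            if tag not in cell.get("metadata", {}).get("tags", []):
                continue
            # Copy everything except the cell list, then add just this cell.
            single = {k: copy.deepcopy(v) for k, v in nb.items() if k != "cells"}
            single["cells"] = [copy.deepcopy(cell)]
            archive.writestr(f"cell_{index:03d}.ipynb", json.dumps(single, indent=1))
    return buffer.getvalue()
```

Returning the archive as bytes keeps it a single file, which fits the constraint mentioned above that a service can only return one file.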

mortcanty commented 6 years ago

I'm not very familiar with nbconvert but I am writing a textbook in LaTeX which makes many references to individual cells in Jupyter notebooks. I'm resorting to screenshots converted to eps to integrate the cells into the text, not really acceptable. The ability to export a highlighted cell (both input and output) as png or eps would be fantastic. Or any trick with nbconvert which can accomplish something similar?

takluyver commented 6 years ago

At the moment, probably the most practical approach to that situation is to use nbconvert to convert the whole notebook to LaTeX - jupyter nbconvert --to latex notebook.ipynb - and then open the result in your editor, and copy the pieces you need into your document.

mortcanty commented 6 years ago

Thanks for the tip. Unfortunately I can't seem to get it to work with the publisher's document style.


toqduj commented 6 years ago

Screenshots and manual edits of jupyter notebooks are not really suitable for our purpose: the production of reference material certification documents with easily updated graphics for the statistical evaluation.

One solution that could make sense is that by @mpacer. While it's still not easy to do automatically, at least the graphics won't shift from one to the other, and the nice formatting of the styler wouldn't be lost...

matanox commented 6 years ago

I humbly think that programmatically exporting pieces of a notebook to HTML (and maybe other formats as well) would be very helpful for making data available to peers who do not care about the code or the intermediate results in some cells. Ideally, the export would go to a viewer with separate tabs, because people not used to Jupyter aren't comfortable with one long scroll. Keeping this feature suggestion from growing too large is more of an art, but it could help in regularly disseminating notebook-generated data without all the fuss.

Happy to hear what other people think, and/or about existing ways this can be accomplished.

psychemedia commented 5 years ago

I started looking at ways of grabbing formatted pandas HTML tables into a png here. It's not very convenient though — it requires selenium automation, for a start.

import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

#Via https://stackoverflow.com/a/52572919/454773
def setup_screenshot(driver,path):
    ''' Grab screenshot of browser rendered HTML.
        Ensure the browser is sized to display all the HTML content. '''
    original_size = driver.get_window_size()
    required_width = driver.execute_script('return document.body.parentNode.scrollWidth')
    required_height = driver.execute_script('return document.body.parentNode.scrollHeight')
    driver.set_window_size(required_width, required_height)
    # driver.save_screenshot(path)  # has scrollbar
    # find_element_by_tag_name was removed in Selenium 4; use find_element with By
    driver.find_element(By.TAG_NAME, 'body').screenshot(path)  # avoids scrollbar
    driver.set_window_size(original_size['width'], original_size['height'])

def getTableImage(url, fn='dummy_table', basepath='.', path='.', delay=5, height=420, width=800):
    ''' Render HTML file in browser and grab a screenshot. '''
    browser = webdriver.Chrome()

    browser.get(url)
    #Give the html some time to load
    time.sleep(delay)
    imgpath='{}/{}.png'.format(path,fn)
    imgfn = '{}/{}'.format(basepath, imgpath)
    imgfile = '{}/{}'.format(os.getcwd(),imgfn)

    setup_screenshot(browser,imgfile)
    browser.quit()
    os.remove(imgfile.replace('.png','.html'))
    #print(imgfn)
    return imgpath

def getTablePNG(tablehtml, basepath='.', path='testpng', fnstub='testhtml'):
    ''' Save HTML table as: {basepath}/{path}/{fnstub}.png '''
    # Create the same directory the HTML file is written to below
    os.makedirs('{}/{}'.format(basepath, path), exist_ok=True)
    fn='{cwd}/{basepath}/{path}/{fn}.html'.format(cwd=os.getcwd(), basepath=basepath, path=path,fn=fnstub)
    tmpurl='file://{fn}'.format(fn=fn)
    with open(fn, 'w') as out:
        out.write(tablehtml)
    return getTableImage(tmpurl, fnstub, basepath, path)

#call as: getTablePNG(s)
#where s is a string containing html, eg s = df.style.to_html() (df.style.render() in pandas older than 1.3)

However, I did start looking around for lighter JavaScript-only solutions (fragmentary notes), which might provide a better starting point.

wesinator commented 5 years ago

It would also be useful to have a menu item to export cell code to a separate Python file, so users don't have to manually copy and paste into a text editor.

Should this be filed as a new issue?

psychemedia commented 5 years ago

In passing, I note this repo that appeared recently, but I've not had a chance to see what it actually does yet...

als0052 commented 4 years ago

Just as a note for anyone who finds this after Jan. 2020, the pandas API added a feature in version 1.0.0 to export to Markdown tables (DataFrame.to_markdown). I haven't tried it, but it sounds like it could be a workaround for OP's application.

That being said, I'd also like the ability to select a single cell in a notebook and export it as .tex. Currently I'm combing through a full exported .tex notebook to pick out the cells I want. At the very least it'd be nice to export an in cell and out cell as an image.

TomNicholas commented 4 years ago

I also think that easily saving the output of a single cell (as png/html) would be a useful feature. I think some people here are suggesting a new cell magic, while others are talking about a notebook button or right-click option?

As you said @takluyver you can do the former with nbconvert, but is the best way really to run nbconvert on the entire current notebook from within the notebook? This requires getting the name of the current notebook (already a bit of a hack apparently), and then running nbconvert on that file, but it seems circuitous. Also would there be problems if the notebook hasn't been saved? And what if the notebook is password-protected?

Might there be a better way to access the cell output using the %%capture cell magic or something? Then wrap it up to make a new %%save_output cell magic maybe?

I can use %%capture to save the png output of my cell in a rudimentary way, but that also suppresses the output of the cell, which I don't want to do, and you can only access the output in a later cell:

[1]: # setup
...  %matplotlib inline
...  import xarray as xr
...  ds = xr.tutorial.open_dataset('air_temperature')
...  da = ds['air']
...  da = da.sel(time='2014-12-31T18:00:00')

[2]: %%capture out
...  da.plot()

[3]: with open('./output', 'wb') as f:
...      f.write(out.outputs[1].data['image/png'])

jpjpjp commented 3 years ago

I read this whole thread...and ended up taking a screenshot and using that as the PNG. Won't work when there is more than a screenful of output, but....

rrtucci commented 2 years ago

> I read this whole thread...and ended up taking a screenshot and using that as the PNG. Won't work when there is more than a screenful of output, but....

Even though you are half joking, IMHO there is much wisdom in your statement. It is possible to select a cell in jupyter notebooks. Why can't one extend the print screen software so that it takes a png snapshot of just the selected cell? This would include a picture of the widgets in their current state and of interactive plots

adrijanik commented 1 month ago

I just found this thread while searching for a copy-to-clipboard or save-to-file button for cell output. Although I can simply highlight all the text and copy it to the clipboard, a useful feature for me would be to save the output of a cell after it has finished running. I often have very long output, and sometimes I don't think I need everything saved, or I simply forget about saving because I'm just experimenting. Then the notebook runs for an hour, the outputs turn out to be interesting after all, and I want to save them to a file with a single click from within Jupyter Notebook. It could be just a small icon that appears upon hovering over the top-right corner of a cell's output or code and offers save-to-clipboard and save-to-file options, and/or an option in Current Outputs next to toggle and clear (see the screenshot below with my clunky drawings).

image
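
Until something like that button exists, one possible workaround for plain text output is to mirror everything printed in a cell to a file while still showing it. The sketch below uses only the standard library; `Tee` and `tee_stdout` are hypothetical names, and it only sees text written to `sys.stdout`, not rich outputs such as tables or images.

```python
import contextlib
import sys


class Tee:
    """Minimal file-like object that forwards writes to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextlib.contextmanager
def tee_stdout(path):
    """Show printed output as usual and also append it to the file at path."""
    with open(path, "a", encoding="utf-8") as log:
        with contextlib.redirect_stdout(Tee(sys.stdout, log)):
            yield
```

Wrapping a long-running cell body in `with tee_stdout("cell_output.txt"):` keeps a copy on disk even if nobody remembers to save the output afterwards.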