tskit-dev / tskit

Population-scale genomics
MIT License
151 stars 70 forks source link

Define _repr_pretty_ for ts+tables & _repr_svg_ for ts+trees #1268

Open hyanwong opened 3 years ago

hyanwong commented 3 years ago

It seems like Jupyter notebooks allow a whole set of _repr_XXX_ methods, which, when display() is called, are tried in order depending on the abilities of the notebook. E.g.

# Quick hack to show _repr_XXX_ methods in a notebook
tskit.TreeSequence._repr_svg_ = tskit.TreeSequence.draw_svg
tskit.TreeSequence._repr_pretty_ = lambda self, p, cycle : p.text(str(self))
display(ts)  # Normal html view
display(ts, exclude=["text/html"])  # alternative SVG view
display(ts, exclude=["image/svg+xml", "text/html"])  # Will show the str() view

It seems sensible to provide _repr_pretty_ for tree sequences, tables, and trees, and _repr_svg_ for tree sequences and trees.

Actually, there's a question here about whether the "pretty" textual representation of a tree sequence should be the draw_text() method (showing the trees) or the __str__() method (showing a tabular view).

jeromekelleher commented 3 years ago

Totally, we should implement these.

One thing to think about is that these should work well on massive tree sequences as well as small example ones, so drawing isn't something that should be done unless it's asked for I think.

hyanwong commented 3 years ago

One thing to think about is that these should work well on massive tree sequences as well as small example ones, so drawing isn't something that should be done unless it's asked for I think.

Yeah, that's a really good point. I was half wondering if display(ts) should actually plot the svg, while display(ts.tables) should show what we currently have as the html tabular view, but that argues strongly against it.

I don't know if that would also apply to a tree: what would a user expect if they did display(ts.first()) in a notebook? I think I would expect a graphical tree to be drawn. OTOH, that's inconsistent with display(ts) which shows the tabular form. So I don't know what's best here?

It would be trivial to define

class TreeSequence:
    def _repr_svg_(self, **kwargs):
        return self.draw_svg(path=None, **kwargs)

and similarly for Trees, but would that conflict with your point above about explicitly asking for the SVG form rather than magically letting IPython/Jupyter do the work.

benjeffery commented 3 years ago

Am I right in thinking _repr_svg_ is only relevant for the Qt console? Most users will be using notebooks in browser.

hyanwong commented 3 years ago

It works for me in the browser. Try the code below in a browser window:

import tskit
import msprime
tskit.TreeSequence._repr_svg_ = tskit.TreeSequence.draw_svg
ts = msprime.simulate(10)
display(ts, exclude=["text/html"])  # alternative SVG view
benjeffery commented 3 years ago

Yes, by relevant I meant that it will only be used by default on the qt console.

hyanwong commented 3 years ago

Oh, you mean the qt console doesn't render html? Possibly.

If we add _repr_svg_ to a tree, this would turn into the default representation used by display() even in a browser notebook, because (I think) there's no Tree._repr_html_() method.

jeromekelleher commented 3 years ago

If we add _reprsvg to a tree, this would turn into the default representation used by display() even in a browser notebook, because (I think) there's no Tree._reprhtml() method.

We wouldn't want this though - same argument as above. If I accidentally typed tree at the end of a code cell it shouldn't crash my browser.

A repr html for the tree would be nice - some summaries mabye?

hyanwong commented 3 years ago

Oh, you mean the qt console doesn't render html? Possibly.

So are we worried that providing _repr_svg_ might risk overloading a QT notebook if it was run and tried to display a large tree sequence? There's a reasonable argument to say that if we try to run draw_svg() on a large enough tree sequence, we should throw an "are you sure" type warning?

jeromekelleher commented 3 years ago

An I right in thinking that SVG(tree) would call _repr_svg_? Then it seems like a clear win for me. We should define repr_html too, so that this doesn't get called on display(tree)

hyanwong commented 3 years ago

An I right in thinking that SVG(tree) would call _repr_svg_? Then it seems like a clear win for me. We should define repr_html too, so that this doesn't get called on display(tree)

I think you need to import and use display_svg() not SVG(). But I think display_svg(my_tree) is still marginally easier to grok than display(SVG(my_tree)), which argues for providing the _repr_svg_ method.

jeromekelleher commented 3 years ago

Yes, display_svg is better all right. I don't see any harm in having _repr_svg so long as it's not the default.

hyanwong commented 3 years ago

I wonder if it's still worth by default raising an error if the user tries to draw a tree or ts with (say) > 1 million edges? Presumably this would need some way to override it if that's what was really intended.

Maybe not worth the hassle, but might be a polite thing to do?

jeromekelleher commented 3 years ago

I don't think it's worth the hassle - it's a mistake you make once.