Closed cbrnr closed 2 years ago
personally what I don't like is that it hides the name the key to get the values in info
No strong feeling from me. With the HTML we might be able to get away with more information even if we try to make it more consistent, for example we could probably do a 3-column table that has key / description / value (currently it's description / value), and keep the console more compact and closer to code with just key / value (what it is now).
I am just +0.5 on add a key column to html repr
I propose:
key
column as @larsoner suggests. Personally I'd but it in between the description and the value, possibly even as a parenthetical like "Highpass (highpass
)" or "Sampling frequency (sfreq
)"0 magnetometer
good channels
to be \n
separated instead of comma-separated (I think this was recently done by @larsoner for some other field in one of the other __html_repr__
methods? why not here too). If this is done, we should drop the words "and" and "channels" tooworks for me
Ideally, we should only have to maintain __repr__
and just put its contents in a nice HTML table. I don't like that currently they contain different fields. I'm not even sure we need a description column for the fields when we could just use the keys (they should be self-explanatory and if not they are documented).
Maybe just something like this:
def _repr_html_(self, caption=None):
"""Summarize info for HTML representation."""
html = ''
if isinstance(caption, str):
html += f'<h4>{caption}</h4>\n'
html += '<table class="table table-hover table-striped table-sm table-responsive small">\n' # noqa
for row in repr(self).split('\n')[1:-1]:
key, value = row.split(':')
html += '<tr>\n'
html += f'<th>{key.strip()}</th>\n'
html += f'<td>{value.strip()}</td>\n'
html += '</table>'
return html
Which looks like:
BTW, the same applies to Raw
and Epochs
. I think that the two reprs should contain the same pieces of information.
@cbrnr In my current PR #9828 on adding ICA to Report, I also reworked ICA.__repr__
and added ICA._repr_html_
. To reduce code duplication and improve consistency between the two, I took the following approach:
https://github.com/mne-tools/mne-python/blob/558c88abc326db69a9832dcb9dfcb7735d7f06b5/mne/preprocessing/ica.py#L446-L540
I'm -1 on adding a key
column as this will force us to do a 1-to-1 mapping between rows and keys or attributes, which I think is not practical:
_html_repr_
for many more objects than just Info
Fit
row, I actually combine 3 fields, so it's impossible to have a 1-to-1 mapping:Also I'm adding info that's just not present in the object itself (i.e., is calculated), namely Explained variance
I want to add that my concern re 1-to-1 mapping can be disregarded if we're only talking about Info
.
I find it very strange to show different pieces of information depending on which backend is used. What exactly is the reason why showing exactly the same information is not feasible? The HTML repr is also just text which should easily fit in a terminal. This should not just synced for Info
, but also for Raw
, Epochs
, and ICA
(I don't think we have more).
What exactly is the reason why showing exactly the same information is not feasible?
HTML repr allows us to present data in true tables (that we will make collapsible soon, too), while you'll typically want a very concise repr on the terminal. Data can also be styled beyond just a tabular alignment in HTML, while this stuff isn't easy without introducing new dependencies on the terminal (e.g., via rich)
Also I think that when adding some of the HTML reprs, we didn't want to / dare to change the existing repr()
output, but still wanted to take advantage of the features HTML offers.
That said, +1 on ensuring they're more synchronous, but please don't make the HTML repr just an ugly HTML-ified copy of the text repr. It can do much better than that.
I'm still -1 on using different reprs. I can already imagine the confusion when people notice that the output (repr) of an object in the interactive interpreter is widely different from inside a notebook. If you want to add nice HTML summaries I'd really prefer if that had its own dedicated method (e.g. raw.print_html()
or even raw.print()
). Changing the default repr is asking for trouble.
in that case I'd suggest to expand the text reprs such that they all produce tabular output like we do with the HTML repr (and kind-of with the Info text repr already)
cc @vagechirkov who added many of the HTML reprs and has been working on #9220
TBH I don't think we should get a ton of output in a repr of a specific object. Ideally, typing raw
in the interactive interpreter should return just one line with the most important information. Everything in addition should require to call a dedicated method.
If you really want to override a default method (and not add a new one) I'd strongly prefer __str__
over __repr__
.
If you really want to override a default method (and not add a new one) I'd strongly prefer
__str__
over__repr__
.
That's totally alright with me – honestly, I don't care. But I do care about nicely-formatted HTML output, I think it really improves the UX when working interactively. I was even thinking about adding figures to it if plotting can be fast enough (would be collapsed by default)
scikit-learn also has a rich HTML repr for its Pipeline
, which can look like this:
Imagine we had something similar for our ICA …
This is really nice and useful as long as it's not the default repr. And by default I mean either __repr__
or _html_repr_
, because for users the only thing that matters is what is displayed when they type an object in the interactive interpreter or Jupyter notebook. Edit: And by that I mean that both options should give the same result.
This is really nice and useful as long as it's not the default repr. And by default I mean either
__repr__
or_html_repr_
, because for users the only thing that matters is what is displayed when they type an object in the interactive interpreter or Jupyter notebook.
I don't understand the logic here. You say it's really nice and useful but that you don't want users to see it when they type the name of the object (which you say is "the only case that matters")?
I'm also a bit confused why you think we shouldn't "override" the repr of a class when the class is custom / specific to MNE-Python.
Is your position that (1) terminal and html should match and (2) terminal should be short, therefore (3) html cannot be fancy?
Is your position that (1) terminal and html should match and (2) terminal should be short, therefore (3) html cannot be fancy?
Yes, sorry for not being clear, that is my position. I do like nice HTML representations though, and I do want users to see it, only not when they type the name of the object. A repr should be short and self-contained representation of the object, and longer more verbatim information should be available when printing the object.
A repr should be short and self-contained representation of the object, and longer more verbatim information should be available when printing the object.
I believe that most users would prefer to see a repr that HELPS them understand and explore their data, and it's only a minority that want super brief reprs like you :) and you would still be able to get it – simply type repr(foo)
? Nobody loses anything, there's only to gain
We could also produce BOTH in the HTML repr: a short and an "extended" version. And make both of them collapsible
We could also produce BOTH in the HTML repr: a short and an "extended" version. And make both of them collapsible
I was going to suggest something similar: a short terminal repr, and an HTML repr that is a superset of the terminal one: the short bit always displayed and the extra stuff collapsed by default. WDYT @cbrnr?
I was going to suggest something similar: a short terminal repr, and an HTML repr that is a superset of the terminal one: the short bit always displayed and the extra stuff collapsed by default. WDYT @cbrnr?
Maybe I'm not the right person to ask this, but I really don't like this inconsistent behavior. A notebook should behave like a normal interactive interpreter. Sure it has lots of additional features, but these should not silently override default methods. I'd be OK with e.g. print(raw)
or raw._html_repr_()
or raw.html()
to show this additional HTML.
I'd be OK with e.g.
print(raw)
orraw._html_repr_()
orraw.html()
to show this additional HTML.
Only that then we can also just leave it be as nobody will find out about it.
You never print an object?
an HTML repr that is a superset of the terminal one
This is my thinking as well -- it would take care of @agramfort 's gripe about not having keys in _repr_html_
but keep the improved readability of the HTML one. To me HTML is more flexible, so having the information there be a superset of the non-terminal one is justified. So +1 for me on HTML as superset
We could also produce BOTH in the HTML repr: a short and an "extended" version. And make both of them collapsible
the short bit always displayed and the extra stuff collapsed by default.
I don't think this part needs to be codified necessarily, but rather made on a case-by-case basis. The descrption
column in the description/key/value
column triplet is really useful for info, so I'd prefer it be shown. For other classes (ICA?) on the other hand, this might be a nice way to hide more verbose stuff by default but allow showing it.
I really don't like this inconsistent behavior
Sounds like the vote is @cbrnr -1, @drammock +1 @hoechenberger +1 @larsoner +1 on the basic HTML-as-superset idea. @agramfort any opinion?
Only that then we can also just leave it be as nobody will find out about it.
You never print an object?
Sounds like the vote ...
If there is more to consider or say, feel free to continue discussing -- I don't mean to shut down any discussion by asking for other folks to vote if not all major points have been made! To me the pros and cons are pretty clearly discussed or implied by previous comments above in terms of the tradeoffs in consistency, discoverability, usability, and effectiveness of HTML-matches-console versus HTML-superset-of-console, though, and it sounds folks weigh these differently in this case -- hence a vote-and-decide at this point seems reasonable to me here.
Since @hoechenberger mentioned scikit-learn pipelines, they have a config option where users can choose which repr is shown: https://scikit-learn.org/dev/auto_examples/miscellaneous/plot_pipeline_display.html#sphx-glr-auto-examples-miscellaneous-plot-pipeline-display-py
I think this would be a very nice solution where all of us would be happy.
I think this would be a very nice solution where all of us would be happy.
But what if that's not my goal?? 😅🙈 No but seriously, yes, could be a nice middle ground… But I already see us fighting about the "correct" default value there? 🤔
I personally really like xarray HTML representations with collapsible options and find them very useful and transparent. From example:
But what if that's not my goal?? 😅🙈
🤣 I knew it!
No but seriously, yes, could be a nice middle ground… But I already see us fighting about the "correct" default value there? 🤔
Nope, I'm totally fine with having HTML as the default as long as it is configurable (of course I'm also totally fine with text as the default).
I'm also generally +1 for HTML repr being superset of the text one, but a configuration option would be nice. The only case I'm sometimes bothered by long reprs is when displaying lists of mne objects (list of info for example), but in such case the html repr is not used.
I would like to res this issue because I think the current dissociation between the two reprs are still very confusing (at least to me). Specifically, when writing some tutorial in a notebook-like environment, people should see the exact same output whether they are in a REPL or a notebook. In general, I prefer the REPL, but I write my tutorial with Quarto (which uses Jupyter notebook under the hood), so when I tell people to run e.g.
raw.info
I would like to see the REPL __repr__()
and not _repr_html_()
.
Could we add a config option to disable all _repr_html_()
? I think we should because AFAIK it is not possible to change this behavior on the IPython side (https://stackoverflow.com/questions/73540055/how-can-i-force-repr-instead-of-repr-html-in-jupyter-notebooks).
@cbrnr +1 for adding a configuration option if Jupyter doesn't support this. The default should IMHO still be to use HTML repr in a Jupyter environment.
This would be great! And yes, HTML can remain the default for notebook envs.
Does anyone have an idea how to implement this technically? I briefly looked at how it's done in Scikit-learn, but couldn't really figure it out so I guess this will be a larger change.
I will also ask the IPython devs if they think this could be added on their end. Doesn't make a lot of sense that every package has to implement that.
Does anyone have an idea how to implement this technically? I briefly looked at how it's done in Scikit-learn, but couldn't really figure it out so I guess this will be a larger change.
Could you not simply do something like this:
def _repr_html_(self):
if config.no_html_repr:
return repr(self)
?
Unfortunately, this doesn't work (nothing gets printed). It seems like it's not just the returned value that's needed by the notebook.
This works for me in the VS Code Python Interactive Jupyter thingy:
# %%
class Test:
def __repr__(self) -> str:
return 'foobar'
def _repr_html_(self) -> str:
return repr(self)
test = Test()
test
@cbrnr I assume you'll have to escape the <
etc symbols
@cbrnr I assume you'll have to escape the
<
etc symbols
Yep.
# %%
class Test:
def __repr__(self) -> str:
return '<foobar>'
def _repr_html_(self) -> str:
return repr(self)
test = Test()
test
prints nothing.
But this works:
# %%
import html
class Test:
def __repr__(self) -> str:
return '<foobar>'
def _repr_html_(self) -> str:
return html.escape(
repr(self)
)
test = Test()
test
Nice! So I guess we have a plan then! I will prepare a PR soon! Thanks a lot!
Line breaks \n
should not be escaped though – is this possible?
The font is also not a fixed-width font, which ideally would be nice to have as well...
This should do it:
import html
class Test:
def __repr__(self) -> str:
return '<foo>\n<bar>'
def _repr_html_(self) -> str:
r = "<pre>" + html.escape(repr(self)) + "</pre>"
return r.replace("\n", "<br/>")
test = Test()
test
TIL that
Info
has a dedicated HTML representation, which is 🔥. However, I don't like that it doesn't show the same information as the standard text representation, for example: Specifically, I really miss the channel names and I don't like that MEG types are always listed, but ideally there should be no differences at all except for a nicer rendering in HTML. WDYT?