Closed TomAugspurger closed 3 years ago
@TomAugspurger I had to slightly change style.rst
to get this to appear in the index. You may want to further modify.
A comment on the API of the highlighter:
def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: red'` for negative
    strings, black otherwise.
    """
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color
This basically only works for the HTML representation ("color: red" is CSS-speak) and assumes that the return value should go to the CSS.
In LaTeX (see e.g. this example), you prefix/surround the value with a command:
[...]
\usepackage[table]{xcolor}% http://ctan.org/pkg/xcolor
Some & \cellcolor{blue!25}coloured & contents \\
would render the second cell blue (by using a command from a special package).
So for LaTeX you would probably need templates à la \cellcolor{blue!25}%s or \textbf{%s} (which makes the value bold).
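The template idea above could be sketched roughly as follows. This is only an illustration, not part of pandas: the names color_negative_red_latex and render_latex_row are hypothetical, and \textcolor is assumed to come from the xcolor package mentioned above.

```python
def color_negative_red_latex(val):
    """Return a LaTeX template with a single %s slot for a scalar value."""
    # Hypothetical LaTeX counterpart of color_negative_red: instead of a CSS
    # property, the function returns a template that wraps the value.
    return r"\textcolor{red}{%s}" if val < 0 else "%s"


def render_latex_row(values):
    """Format one table row, applying the template chosen for each value."""
    cells = [color_negative_red_latex(v) % (v,) for v in values]
    return " & ".join(cells) + r" \\"


print(render_latex_row([1.5, -2.0, 3]))
```

The point of the sketch is that the style function returns a wrapper around the value rather than a property attached to the cell, which is what a LaTeX backend would need.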
@TomAugspurger I think you made a great start on this! A few ideas for making this approach potentially more flexible:
Allow extra attributes on the <table>.
Common use case: You want to export an html table with sortable columns, e.g. using sortable. For that you would need the following opening tag:
<table class="sortable-theme-bootstrap" data-sortable>
Support "external" styling via existing custom CSS style sheets.
For this to work, the current approach, with unique ids per cell that are then targeted by auto-generated CSS, doesn't really work. Instead it would be useful to be able to assign additional classes (or data attributes) to cells based on the Styler rules.
Example use case: assign class positive to all values > 0 and negative to all values < 0.
A potential advantage of this would be much smaller size of generated code (since we don't need a custom CSS block for each cell) and better performance when rendering in the browser (fewer rules to apply).
Allow setting arbitrary attributes on table cells based on rules.
This extends the previous suggestion and would allow using exported html tables with any kind of JavaScript library using specific attributes.
Sorry for being so late in making these suggestions - I didn't manage to read through all of #10250 before it was merged.
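A minimal sketch of the first two suggestions, written in plain Python rather than against the Styler: render_table and class_for are hypothetical names, and the rule that maps values to the classes positive and negative is taken from the example use case above.

```python
from html import escape


def class_for(value):
    """Hypothetical rule: map a cell value to a CSS class name, or None."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return None


def render_table(rows, table_attributes="", rule=class_for):
    """Render rows of numbers as an HTML table with rule-based cell classes."""
    # Extra attributes are placed verbatim in the opening <table> tag.
    opening = "<table %s>" % table_attributes if table_attributes else "<table>"
    lines = [opening]
    for row in rows:
        cells = []
        for value in row:
            css_class = rule(value)
            attr = ' class="%s"' % escape(css_class) if css_class else ""
            cells.append("<td%s>%s</td>" % (attr, escape(str(value))))
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html = render_table(
    [[1, -2], [0, 3]],
    table_attributes='class="sortable-theme-bootstrap" data-sortable',
)
print(html)
```

Because the cells carry only class names, an existing external stylesheet can style them and no per-cell CSS block has to be generated.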
@kynan fantastic, thanks for the feedback. I'll go through it in more detail later.
Your item 1 sounds pretty simple. Is attribute the correct term for the items in the opening tag? <table class="sortable-theme-bootstrap" data-sortable>
We could include a method like .set_table_attributes for that.
@TomAugspurger I believe attribute is the common term and also the one the W3C uses. set_table_attributes sounds sensible to me.
@TomAugspurger merged your PR; I'll leave you to close when you are ready.
Thanks, I want to get a better solution in place for including notebooks in the sphinx build, but that works for now.
@kynan, for your second item, assigning classes to cells: my current thinking is to have a method on Styler (would it be terrible to call it .classify?) that takes a function to be evaluated and a class to assign to the cells where that function evaluates to True. The limitation here is that the class that's assigned doesn't get to refer to the data. So you couldn't (easily) do something like our .background_gradient, which is why I discarded this approach originally. But it might make sense to have it in addition to the one-class-per-cell approach we have. This will probably need to wait for the next release though.
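The proposed .classify does not exist; the following is only a sketch of the idea on a plain list of rows, with a hypothetical free function classify standing in for the method.

```python
def classify(rows, predicate, css_class):
    """Return {(row, col): css_class} for every cell where predicate is true."""
    # The class name is fixed up front, so it cannot depend on the cell's
    # value; this is the limitation described in the comment above.
    return {
        (i, j): css_class
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if predicate(value)
    }


data = [[1.0, None], [None, 4.0]]
null_cells = classify(data, lambda v: v is None, "null")
print(null_cells)
```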
docs are up if anyone sees anything. It does still have the [In] and [Out] tags and ¶ markers I might be able to hide.
@TomAugspurger yep, looks great!
on the css side, I think it's possible if we tag with the SAME names, e.g.
df.style.highlight_null(css='null_class').background_gradient(css='gradients_class')
then as long as you tag THOSE cells with that class it would work. we could have default class names (based on the function name), and have this kw to override.
The df.style.highlight_null(css='null_class') I could see working like that, since that fits the binary "if this condition is true, apply this style".
This could be my limited understanding of CSS, but I don't see how you could accomplish df.style.background_gradient(css='gradients_class') in CSS, just knowing that these columns have this class. (I think) you'd need a class per color you want to assign.
I suppose we could add a data attribute to each cell with the value of that cell... You might be able to pull off some CSS wizardry to accomplish it in that case, but I don't see the average Python user being able to write or customize that.
@TomAugspurger I think in this case you would actually construct the classes WITH the level embedded (for some you wouldn't need to do this), maybe something like 'gradient_level_0_class' (e.g. say you have ten levels of gradient). But this is a refinement.
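One way to read the "level embedded in the class name" idea is to bucket values into a fixed number of levels and emit one class per level. The function gradient_classes and the prefix gradient_level_ are hypothetical names chosen for this sketch.

```python
def gradient_classes(values, n_levels=10, prefix="gradient_level_"):
    """Assign each value a class name whose suffix is its gradient level."""
    lo, hi = min(values), max(values)
    span = hi - lo
    classes = []
    for v in values:
        if span == 0:
            level = 0  # all values equal: everything lands in the first level
        else:
            # Scale into [0, n_levels) and clamp the maximum into the top level.
            level = min(int((v - lo) / span * n_levels), n_levels - 1)
        classes.append("%s%d" % (prefix, level))
    return classes


print(gradient_classes([0, 5, 10], n_levels=10))
```

The stylesheet then needs one rule per level (ten in this case) rather than one rule per cell.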
I've been playing around with the new styling features and have a few comments, overall this is a great new addition.
The highlight_min, highlight_max and highlight_null would be a lot better if instead of taking a color argument they would actually take the css format string or **kwargs that correspond to css style names - this would allow:
(background-color: black; color: white)
(font-weight: bold)
The documentation is also a little confusing in terms of debugging the styling functions.
Debugging Tip: If you're having trouble writing your style function, try just passing it into df.apply. Styler.apply uses that internally, so the result should be the same.
The full stop and space between apply and Styler is somewhat difficult to spot (I only noticed it when pasting the quote here) - I was looking for df.apply.Styler.apply, which obviously doesn't exist; a little rewording would fix this. (You need to look at the text at https://pandas-docs.github.io/pandas-docs-travis/style.html to spot it)
Debugging Tip: If you're having trouble writing your style function, try passing it into df.apply. Internally Styler.apply uses that, so the result should be the same.
On your first point, I agree that would be useful. If we end up going with a .classify that takes a function returning booleans (pd.isnull) and assigns classes where that's True, we should be able to handle that pretty easily. I think we should hold off adding more keywords to .highlight_null until we decide what to do there.
I just pushed a PR to clarify the documentation. That was confusing, thanks.
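The **kwargs suggestion discussed above could look roughly like this. Both css_from_kwargs and highlight_max_css are hypothetical helpers, not pandas functions, and mapping underscores to hyphens is an assumption made here because background-color is not a valid Python keyword name.

```python
def css_from_kwargs(**properties):
    """Build a CSS declaration string from keyword arguments."""
    # Underscores stand in for hyphens: background_color -> background-color.
    return "; ".join(
        "%s: %s" % (name.replace("_", "-"), value)
        for name, value in properties.items()
    )


def highlight_max_css(values, **properties):
    """Return one CSS string per value, styling only the maximum."""
    style = css_from_kwargs(**properties)
    top = max(values)
    return [style if v == top else "" for v in values]


print(highlight_max_css([1, 3, 2], background_color="black", color="white"))
```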
@TomAugspurger to repeat, awesome work!
A question on the 'provisional status'. First, as I said on gitter, I think it is a good idea to put the same provisional note from the notebook in the whatsnew note (experimental = can still change + feedback wanted). Secondly, we could also emit a warning about this on first usage to be even more explicit? (but maybe that's a bit too intrusive). But if it is only on the first import of the style module, and not each time you use it, maybe it is OK?
Question for the docs: the built notebook in html form is still in the source code. Is this on purpose? (as eg in the latest PR you only updated the notebook and not the html file)
Having the generated HTML is not intended, I thought I deleted that. I'll remove it in my PR adding the provisional note.
For the warning. I never was a fan of always getting the warnings when using IPy widgets. I can go either way though. At the very least I'm going to add a note to the docstring for Styler.
Another small note on the docs: maybe it would be good to include a link to the notebook on nbviewer? It actually still looks better than the one included in the docs (the table styling, i.e. the borders, is 'uglier' there).
I was trying to figure out how to include a link that points to the same version of the notebook, but adding the link changes the notebook :) I suppose we just link to https://github.com/pydata/pandas/blob/master/doc/source/html-styling.ipynb, understanding that the contents at that URL can change?
@TomAugspurger we can host a rendered version on pandas.pydata.org easily in the doc directory and just link to it (from the docs). IIRC this was your original suggestion :)
github doesn't render these properly AFAICT
I think putting a link 'See this notebook on nbviewer' that points to the one in master is OK (to be fully correct, it should point to the version in the version tag, but that is a bit difficult as that does not yet exist :-))
Just iterate on the links until they converge :)
Just pushed https://github.com/pydata/pandas/pull/11664 with an NBviewer link pointing to master until we get the version uploaded to pandas.pydata.org as part of the doc build.
hmm, so this is the argument then for including the nbconvert outputted .html, which we can then directly link as a file
A possible advantage of using the link to nbviewer, is that it is then easier to download the notebook to run things yourself
Yeah I think we should definitely link to nbviewer since they already have the stuff in place to download as a link. I'm not sure how including the rendered HTML in the doc build helps (or hurts) with this.
OK, merging the PR then!
The first SO questions appear! :-) http://stackoverflow.com/questions/33875937/apply-number-formatting-to-pandas-html-css-styling
@TomAugspurger let's try to link all relevant HTML issues at the top of the tracker (as most/maybe all can be accomplished via .style).
xref #11700
@TomAugspurger Sorry for the delay. classifier sounds good to me. Is there a reason it could only return a boolean and not a string with the class name?
This feature looks really promising. Thanks all who worked on it!
Wouldn't it be great if I could render DataFrames in html outside of ipython notebooks!? I'm not a fan of developing inside ipython notebook, and working with matplotlib entails a lot of overhead.
Two workflows come to mind. First, if you are working on a Mac, keep a Quick Look window open on a PDF file that you use to store current output. Then, define a my_print function that render()s an html string and prints it to your PDF file:
from weasyprint import HTML

def my_style(frame):
    return frame.style.highlight_null(null_color='red') # or whatever

def my_print_pdf(frame, styler, filename='/Users/username/temp/frame_viewer.pdf'):
    style = styler(frame)
    # weasyprint takes raw HTML markup through the `string` keyword
    html = HTML(string=style.render())
    html.write_pdf(filename)
    return None

The Quick Look window should update each time you my_print a DataFrame.
Second, use something like Browsersync to watch an HTML file. To watch a file with Browsersync, you'd type the following in your terminal:
cd ~/temp
browser-sync start --server --index "frame_viewer.html" --files "*.html"
With this approach, you'd write a my_print that dumps the output from render() to an html file. Because Browsersync expects body tags, you'll need to wrap the output from render in them:
def my_print_html(frame, styler, filename='/Users/username/temp/frame_viewer.html'):
    style = styler(frame)
    html = "<html><head></head><body>" + style.render() + "</body></html>"
    with open(filename, 'w') as f:
        print(html, file=f)
    return None

Each time you call my_print_html, Browsersync will refresh your browser automatically.
Notes: code not tested.
Edits 1: tested Browsersync example and updated code so it works on my machine. Edits 2: updated browsersync command.
Those are both possible now right? You'd just need to write the code / startup the server? Something like your code would fit well in the cookbook documentation.
Adding "builtin" support for rendering in the notebook had a very favorable cost-benefit ratio. Two lines of code for something used by so many. I don't anticipate adding other backends.
@TomAugspurger yes, possible right now. The Browsersync example should work now.
@TomAugspurger just came across this from xlsxwriter here; might be a nugget we could steal....
This is an interesting approach in R: https://github.com/renkun-ken/formattable -> see the last example
Hi,
This feature is great - thanks.
I wanted a way to style a column based on data in another column. I couldn't see a way to do this so made a change to the .bar() styler method. Suggestions on how else to perform such a thing would be appreciated.
Apologies if I have not done this correctly.
@joekane3 something like that should be possible through the .apply method (with axis=None). Your style function will get the entire DataFrame, so you can use the values in column A to apply styles in column B. Your function should return a DataFrame with background-color: <color> for column B and empty strings everywhere else.
Of course, you'll have to do all the conversion from values to colors on your own. Pandas just uses matplotlib internally, so that's probably your best bet.
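A minimal sketch of the approach described above, assuming a frame with columns A and B. The function name style_b_from_a and the rule "highlight B where A is negative" are made up for illustration.

```python
import pandas as pd


def style_b_from_a(df):
    """Return a DataFrame of CSS strings with the same shape as df."""
    # Start with empty strings everywhere, meaning no styling.
    styles = pd.DataFrame("", index=df.index, columns=df.columns)
    # Colour column B wherever the value in column A is negative
    # (an arbitrary rule chosen for this example).
    styles.loc[df["A"] < 0, "B"] = "background-color: red"
    return styles


df = pd.DataFrame({"A": [1, -1, 2], "B": [10, 20, 30]})
styles = style_b_from_a(df)
print(styles)

# With jinja2 installed, the function can be handed to the Styler:
# styled = df.style.apply(style_b_from_a, axis=None)
```

Passing axis=None is what makes Styler.apply hand the whole DataFrame to the function instead of one column or row at a time.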
I'm new to GitHub, sorry if this is the wrong place to post this. I don't think there is, but is there a way to hide the index column when outputting a styled DataFrame through the .render() function?
Also it seems like the Styler is going to (in the future) make .to_html() obsolete.
Correct, the index is unstyleable right now. I plan to fix that in the future, including an option to hide it.
We'll always have to_html, but the implementation might reuse the code here.
Correct, the index is unstyleable right now. I plan to fix that in the future, including an option to hide it.
@TomAugspurger : Was there ever any headway on this? I can't find mention of it in the docs. I've had to do some pretty hacky css to hide the index while using the styler.
actually a bit of work here: https://github.com/pydata/pandas/issues/11655
@mjmyers nothing for styling the index yet though. The big thing is finding an API that's nice to work with. Some possibilities:
a target={data,index,columns} keyword to relevant functions for what to style
apply_axis/apply_labels or apply_index/apply_columns
df.style.index/columns.<method>
But I haven't thought too much about it yet.
since this tracker is quite old and many items on it have been addressed or evolved I will close it in favour of more recent discussion. pls re open if you consider it useful.
Follows #10250
For 0.17.1
doc/source/html-styling.html and find a way to include doc/source/html-styling.ipynb in the doc build (should use --template=basic)
print_versions
requirements_all.txt to include jinja
For 0.18.0 / Future
img tags, urls, etc. flows into...
Styler.template into smaller blocks. Let people extend that. We could (maybe) allow users to choose which template to use to render each column/cell with solving the template modification problem
pd.options, allow setting of default reprs with styles
Styler into a BaseStyler, maybe add a LaTeX styler (maybe deprecate / replace the to_html and to_latex methods; Jinja templates are much more pleasant to work with), xref #11700
will add more as we go.