pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

ENH: colormap support for `DataFrame.to_latex()` #45020

Closed by Zac-HD 2 years ago

Zac-HD commented 2 years ago

I've recently written a paper where we converted some of our sns.heatmap() figures to true LaTeX tables, keeping the cell coloring:

[Image: LaTeX table with colormapped cells]

It would be lovely if pandas could do this; while my implementation is pretty ugly, @rsokl suggested that colormapped tables could improve many papers.

Concretely, I think this would involve adding two new arguments, cmap and norm, to DataFrame.to_latex(), interpreted in the usual way for plotting functions. That's a non-breaking API change, though it would need some consideration of what should happen (probably error messages) for non-numeric inputs. I also had a duration (timedelta) table colored by .total_seconds(), but it seems reasonable to push that back to custom user code.

import matplotlib.cm
import numpy as np
from matplotlib.colors import LogNorm

def value_to_table_cell(n):
    if np.isnan(n):
        # int(NaN) raises, so check first; unclear how NaN should be colored,
        # so default to an empty cell with no color
        return ""
    txt = f'{int(n):,}'  # whatever format the user supplied

    # I'm sure there's a more elegant way to do this, but we had a deadline...
    colors = matplotlib.cm.inferno_r.colors
    x = LogNorm(vmin=0.5, vmax=200)(n + 0.5)
    idx = int(x * len(colors))
    r, g, b = colors[max(0, min(idx, len(colors) - 1))]

    # If the selected color is darker than 50% grey, make the text white
    color = "" if r + g + b >= 1.5 else r"\color{white} "

    # Put it all together!  And don't forget to \usepackage{colortbl}
    return f"\\cellcolor[rgb]{{{r},{g},{b}}} {color}{txt}"
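
To make the normalization and clamping steps in the function above easier to follow, here is a dependency-free sketch of the same arithmetic. The helper names (`log_norm`, `color_index`, `needs_white_text`) are hypothetical and exist only for illustration; `n_colors=256` assumes a 256-entry color list such as inferno's. It is not a replacement for matplotlib's `LogNorm`, just the formula it applies between `vmin` and `vmax`.

```python
import math


def log_norm(value, vmin, vmax):
    """Map value onto a log scale so that vmin -> 0.0 and vmax -> 1.0."""
    return (math.log(value) - math.log(vmin)) / (math.log(vmax) - math.log(vmin))


def color_index(n, n_colors=256, vmin=0.5, vmax=200):
    """Pick an index into a list of n_colors colors, clamped to the valid range.

    The +0.5 offset keeps n == 0 inside the domain of the logarithm.
    """
    x = log_norm(n + 0.5, vmin, vmax)
    return max(0, min(int(x * n_colors), n_colors - 1))


def needs_white_text(r, g, b):
    """True when the background is darker than 50% grey (channels in 0..1)."""
    return r + g + b < 1.5


print(color_index(0))      # smallest count lands on the first color
print(color_index(199.5))  # x == 1.0 would index one past the end without the clamp
```

The clamp matters at the top of the range: a normalized value of exactly 1.0 multiplies out to `n_colors`, which is one past the last valid index.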

Also happy to close if this is just out of scope 😁

attack68 commented 2 years ago

Why not try Styler?

import numpy as np
from pandas import DataFrame

df = DataFrame(
    [[3, 0, 0, 8, 3, 78, 0, 4, 1, 1],
     [np.nan, 0, 0, 9, 1, 80, 0, 4, 1, 1]],
    columns=["cccatalog", "covid19_japan", "disease_sh", "jupyter_server",
             "jupyyerhub", "open_fec", "request_baskets", "restler_demo",
             "worklog", "age_of_empires"],
    index=["state-machine", "hand-written"],
    dtype=float)
# Color by -log(value + 1) so that larger counts map to darker inferno shades
styler = df.style\
  .background_gradient(axis=None, cmap="inferno", gmap=-np.log(df.values+1), vmin=-5.5)\
  .highlight_null(props="background-color:white; color:white;")\
  .format(precision=0, na_rep="")\
  .format_index(escape="latex", axis=1)\
  .applymap_index(lambda v: "rotatebox:{90}--rwrap--latex; transform: rotate(-90deg) translateX(-32px); height:100px; max-width:25px", axis=1)
styler

[Screenshot of the rendered Styler]
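
As a rough check on how the `gmap` and `vmin` arguments above place each value on the colormap, the sketch below reproduces the arithmetic by hand. It assumes that `background_gradient` scales the gradient map linearly between `vmin` and the largest `gmap` value (since `vmax` is not set); the function name `gradient_fraction` is hypothetical.

```python
import math


def gradient_fraction(value, vmin=-5.5, vmax=0.0):
    """Fraction along the colormap for a cell, using gmap = -log(value + 1).

    vmax=0.0 is the largest possible gmap value here, reached when value == 0.
    Assumes a plain linear rescale of gmap between vmin and vmax.
    """
    g = -math.log(value + 1)
    return (g - vmin) / (vmax - vmin)


for v in (0, 1, 3, 78):
    print(v, round(gradient_fraction(v), 3))
```

Under that assumption, a count of 0 sits at the light end of inferno and a count of 78 sits near the dark end, which matches the ordering of the cell colors in the LaTeX output that follows.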

>>> print(styler.to_latex(
    convert_css=True, hrules=True, position_float="centering",
    caption="Switching to Hypothesis' state-machines made.",
))

\begin{table}
\centering
\caption{Switching to Hypothesis' state-machines made.}
\begin{tabular}{lrrrrrrrrrr}
\toprule
 & \rotatebox{90}{cccatalog} & \rotatebox{90}{covid19\_japan} & \rotatebox{90}{disease\_sh} & \rotatebox{90}{jupyter\_server} & \rotatebox{90}{jupyyerhub} & \rotatebox{90}{open\_fec} & \rotatebox{90}{request\_baskets} & \rotatebox{90}{restler\_demo} & \rotatebox{90}{worklog} & \rotatebox{90}{age\_of\_empires} \\
\midrule
state-machine & {\cellcolor[HTML]{F98C0A}} \color[HTML]{F1F1F1} 3 & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{DD513A}} \color[HTML]{F1F1F1} 8 & {\cellcolor[HTML]{F98C0A}} \color[HTML]{F1F1F1} 3 & {\cellcolor[HTML]{440A68}} \color[HTML]{F1F1F1} 78 & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{F57B17}} \color[HTML]{F1F1F1} 4 & {\cellcolor[HTML]{F9C932}} \color[HTML]{000000} 1 & {\cellcolor[HTML]{F9C932}} \color[HTML]{000000} 1 \\
hand-written & {\cellcolor[HTML]{000000}} \color[HTML]{F1F1F1} {\cellcolor{white}} \color{white}  & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{D74B3F}} \color[HTML]{F1F1F1} 9 & {\cellcolor[HTML]{F9C932}} \color[HTML]{000000} 1 & {\cellcolor[HTML]{420A68}} \color[HTML]{F1F1F1} 80 & {\cellcolor[HTML]{FCFFA4}} \color[HTML]{000000} 0 & {\cellcolor[HTML]{F57B17}} \color[HTML]{F1F1F1} 4 & {\cellcolor[HTML]{F9C932}} \color[HTML]{000000} 1 & {\cellcolor[HTML]{F9C932}} \color[HTML]{000000} 1 \\
\bottomrule
\end{tabular}
\end{table}

[Screenshot of the compiled LaTeX table]
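
The table above uses `\toprule`, `\cellcolor`, `\color[HTML]`, and `\rotatebox`, so it only compiles if the corresponding LaTeX packages are loaded. A minimal sketch of a helper that wraps the exported string in a standalone document follows; `wrap_in_document` is a hypothetical name, and the `article` class is an arbitrary choice for illustration.

```python
def wrap_in_document(table_tex):
    """Wrap a LaTeX table string in a minimal document that loads the needed packages."""
    preamble = "\n".join([
        r"\documentclass{article}",
        r"\usepackage{booktabs}       % \toprule, \midrule, \bottomrule",
        r"\usepackage[table]{xcolor}  % \cellcolor and \color[HTML]",
        r"\usepackage{graphicx}       % \rotatebox",
    ])
    return f"{preamble}\n\\begin{{document}}\n{table_tex}\n\\end{{document}}\n"


# Usage: pass in the string returned by styler.to_latex(...)
doc = wrap_in_document(r"\begin{table}\end{table}")
print(doc)
```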

Zac-HD commented 2 years ago

Oh, nice! I didn't realize that Styler could export to LaTeX (just checked; new-in-1.3 explains that).

Let's consider this a docs issue - adding a short note to .to_latex() mentioning that Styler gets exported properly would have pointed me in the right direction - or just close as "already implemented". Thanks for the super-fast and helpful response! 😍

attack68 commented 2 years ago

1.4.0 already has docs changes, and it issues a warning that 2.0 will replace the DataFrameLatexFormatter with the Styler implementation, since Styler has much more flexibility.