pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
43.87k stars 18.02k forks

ENH: Adding `\cline`s to LaTeX table if index is not output #59877

Open edbennett opened 2 months ago

edbennett commented 2 months ago

Pandas version checks

Reproducible Example

import pandas as pd
df = pd.DataFrame(
    [
        {"a": 1, "b": 2, "c": 3},
        {"a": 1, "b": 3, "c": 5},
        {"a": 2, "b": 9, "c": 12},
        {"a": 3, "b": 2, "c": 123},
    ]
)
print(
    df.set_index("a").style.hide(axis=0).to_latex(clines="all;data", hrules=True)
)

Issue Description

I am trying to output a LaTeX table with groups separated by \clines, but without showing the grouping variable in the table. I had thought that putting the grouping variable in the index and then hiding the index in the output would allow this; however, specifying clines="all;data" has no effect when the index is hidden.

Expected Behavior

I would expect the example above to output

\begin{tabular}{rr}
\toprule
b & c \\
\midrule
2 & 3 \\
\cline{1-2}
3 & 5 \\
\cline{1-2}
9 & 12 \\
\cline{1-2}
2 & 123 \\
\bottomrule
\end{tabular}

Currently it outputs

\begin{tabular}{rr}
\toprule
b & c \\
\midrule
2 & 3 \\
3 & 5 \\
9 & 12 \\
2 & 123 \\
\bottomrule
\end{tabular}

(Ideally I would want it to output

\begin{tabular}{rr}
\toprule
b & c \\
\midrule
2 & 3 \\
3 & 5 \\
\cline{1-2}
9 & 12 \\
\cline{1-2}
2 & 123 \\
\bottomrule
\end{tabular}

but there is a problem with the way that I am creating the index, which is unrelated to the issue at hand.)
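
Until something like this is supported, one possible workaround is to post-process the LaTeX string. The sketch below is not a pandas API: `add_group_clines`, `keys`, and `n_cols` are hypothetical names introduced for illustration. It assumes the table was produced with hrules=True (so `\midrule` and `\bottomrule` each sit on their own line) and that every body row occupies exactly one line.

```python
def add_group_clines(latex: str, keys: list, n_cols: int) -> str:
    r"""Insert \cline{1-n_cols} after each body row whose group key
    differs from the key of the following row.

    Hypothetical helper, not part of pandas. Assumes \midrule and
    \bottomrule are present on their own lines and that each body row
    is a single line.
    """
    lines = latex.splitlines()
    start = lines.index("\\midrule") + 1   # first body row
    end = lines.index("\\bottomrule")      # line just past the last body row
    body = lines[start:end]
    if len(body) != len(keys):
        raise ValueError("expected one group key per body row")

    out = lines[:start]
    for i, row in enumerate(body):
        out.append(row)
        # Separate groups only; no rule after the final row.
        if i + 1 < len(body) and keys[i] != keys[i + 1]:
            out.append(f"\\cline{{1-{n_cols}}}")
    out.extend(lines[end:])
    return "\n".join(out)


# The table currently produced by the example above, as a literal string.
current = "\n".join(
    [
        "\\begin{tabular}{rr}",
        "\\toprule",
        "b & c \\\\",
        "\\midrule",
        "2 & 3 \\\\",
        "3 & 5 \\\\",
        "9 & 12 \\\\",
        "2 & 123 \\\\",
        "\\bottomrule",
        "\\end{tabular}",
    ]
)

# Group keys are the values of column "a" in the example: 1, 1, 2, 3.
print(add_group_clines(current, keys=[1, 1, 2, 3], n_cols=2))
```

With the DataFrame from the example, the keys could be taken from `df["a"].tolist()` and the string from the `to_latex(hrules=True)` call on the hidden-index Styler. Because the grouping is passed in explicitly, it does not depend on how the index is built.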

Installed Versions

INSTALLED VERSIONS

commit : 71b395f2cf513f7c4ef8b50c608072bf3950e596
python : 3.12.6
python-bits : 64
OS : Darwin
OS-release : 23.6.0
Version : Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:30 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 3.0.0.dev0+1497.g71b395f2cf
numpy : 2.1.1
dateutil : 2.9.0.post0
pip : 24.2
Cython : None
sphinx : None
IPython : 8.27.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.4
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pytz : 2024.2
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

rhshadrach commented 2 months ago

Thanks for the report. The argument is documented as:

Use to control adding \cline commands for the index labels separation.

As such, I do not believe it is a bug when you have no index labels. I'm reworking this issue as a feature request. At a glance, it seems to make sense to have a \cline between rows whether the index is present or not.

cc @attack68 for any thoughts.

attack68 commented 2 months ago

I would agree. If this isn't possible already, it probably should be when the index is hidden. From memory, though, this was quite complicated when accounting for (and testing) the different cases with hidden indexes and/or hidden columns.

rhshadrach commented 1 month ago

Thanks @attack68 - I think PRs to implement this are welcome!