posit-dev / great-tables

Make awesome display tables using Python.
https://posit-dev.github.io/great-tables/
MIT License

Support N>1 spanners #345

Closed timkpaine closed 1 month ago

timkpaine commented 1 month ago

Summary

Support more than 1 level of spanners:

from great_tables import GT, exibble
(
    GT(exibble)
    .tab_spanner("A", ["num", "char", "fctr"])
    .tab_spanner("B", ["fctr"])
    .tab_spanner("C", ["num", "char"])
    .tab_spanner("D", ["fctr", "date", "time"])
    .tab_spanner("E", spanners=["B", "C"])
)

NOTE:

Related GitHub Issues and PRs

Checklist

Fixes: https://github.com/posit-dev/great-tables/issues/273
Fixes: https://github.com/posit-dev/great-tables/issues/221

machow commented 1 month ago

Thanks so much, this is really great! Did you want to work on the changes in the note? We are more than happy to pick up the testing if you're focused on the Loc stuff. This is super helpful, and we're happy to do whatever is most useful!

timkpaine commented 1 month ago
# TODO
from great_tables import GT, exibble

(
    GT(exibble, rowname_col="row", groupname_col="group")
    .tab_spanner("A", ["num", "char", "fctr"])
    .tab_spanner("B", ["fctr"])
    .tab_spanner("C", ["num", "char"])
    .tab_spanner("D", ["fctr", "date", "time"])
    .tab_spanner("E", spanners=["B", "C"])
    # for testing
    .tab_stubhead(label="Group")
)
timkpaine commented 1 month ago

@machow I'm happy to write tests/docs, I just need some guidance. Would the two examples below be sufficient to assert that the table header structure is created correctly?

from great_tables import GT, exibble

# With stub head
(
    GT(exibble, rowname_col="row", groupname_col="group")
    .tab_spanner("A", ["num", "char", "fctr"])
    .tab_spanner("B", ["fctr"])
    .tab_spanner("C", ["num", "char"])
    .tab_spanner("D", ["fctr", "date", "time"])
    .tab_spanner("E", spanners=["B", "C"])
    # for testing
    .tab_stubhead(label="Group")
)

# Without stub head
(
    GT(exibble)
    .tab_spanner("A", ["num", "char", "fctr"])
    .tab_spanner("B", ["fctr"])
    .tab_spanner("C", ["num", "char"])
    .tab_spanner("D", ["fctr", "date", "time"])
    .tab_spanner("E", spanners=["B", "C"])
)
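
One way to pin down what "header structure" means for these examples is to write out the spanner level that each call is expected to land on. The sketch below is a self-contained model, not the code in this PR: `assign_levels` and the `SpannerSpec` tuple are hypothetical names, and the rule (place a spanner one level above the highest existing spanner it overlaps) is an assumption about the intended behavior.

```python
from typing import Dict, Sequence, Set, Tuple

# Hypothetical spec: (spanner id, directly named columns, ids of spanners to span over)
SpannerSpec = Tuple[str, Sequence[str], Sequence[str]]


def assign_levels(specs: Sequence[SpannerSpec]) -> Dict[str, int]:
    """Assumed rule: each spanner sits one level above the highest spanner it overlaps."""
    columns: Dict[str, Set[str]] = {}  # columns covered by each spanner so far
    levels: Dict[str, int] = {}
    for spanner_id, cols, children in specs:
        covered = set(cols)
        for child in children:
            covered |= columns[child]  # KeyError if the child spanner was never defined
        overlapping = [
            levels[other] for other, other_cols in columns.items() if covered & other_cols
        ]
        levels[spanner_id] = max(overlapping) + 1 if overlapping else 0
        columns[spanner_id] = covered
    return levels


# The five spanners from the examples above, in call order
specs = [
    ("A", ["num", "char", "fctr"], []),
    ("B", ["fctr"], []),
    ("C", ["num", "char"], []),
    ("D", ["fctr", "date", "time"], []),
    ("E", [], ["B", "C"]),
]
levels = assign_levels(specs)
print(levels)
```

If the real implementation follows a different placement rule, the expected levels in a test would need to change accordingly; the point is only that a test can assert on levels rather than on raw HTML.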
codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 92.00000%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.14%. Comparing base (7eae1f2) to head (c8a1889).

| Files | Patch % | Lines |
|---|---|---|
| great_tables/_spanners.py | 81.81% | 2 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #345      +/-   ##
==========================================
+ Coverage   83.51%   85.14%   +1.62%
==========================================
  Files          41       41
  Lines        4308     4316       +8
==========================================
+ Hits         3598     3675      +77
+ Misses        710      641      -69
```


machow commented 1 month ago

@rich-iannone do you mind taking a look?

edit: the examples seem like they hit the key pieces. I'm not too familiar with the exact spanner structure, though 😅

When I pasted the snapshot into CodePen (and added borders), it looked like this:

image
rich-iannone commented 1 month ago

I did some testing and I only found one issue with this (aside from the preexisting issue https://github.com/posit-dev/great-tables/issues/273, which could be fixed later). Here's what I used for one of the more comprehensive tests:

import polars as pl
from great_tables import GT, exibble

# Assumption: exibble_pl is a Polars copy of the exibble dataset
exibble_pl = pl.from_pandas(exibble)

spanners_tbl = (
    GT(exibble_pl, rowname_col="row", groupname_col="group")
    .fmt_number(columns="num")
    .sub_missing("num")
    .cols_label(num="Number")
    .tab_spanner(label="1st three", columns=["num", "char", "fctr"], id="one")
    .tab_spanner(label="2nd three", columns=["date", "time", "datetime"], id="two")
    .tab_spanner(label="Over both", columns=["num", "char", "fctr"], spanners="two", id="overbothspanners")
    .tab_spanner(label="3rd level", columns=["char", "num"], id="level3")
    .tab_spanner(label="4th level", columns=["char", "num"], id="level4")
    .tab_spanner(label="Over two and currency", columns="currency", spanners="two", id="mixed")
    .tab_spanner(label="Right above currency", columns="currency", level=0, id="over_curr")
    .tab_spanner(label="Two above currency", columns="currency", level=1)
    .tab_spanner(label="REPLACEMENT", columns="currency", level=1, replace=True)
)
spanners_tbl

This is phenomenal work: everything works as expected, with one exception. Using replace=False in the final .tab_spanner() call should raise an error, but it doesn't. I see in the implementation that this arg isn't being used, and I think we could definitely punt on that one since it crosses into advanced usage.
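
For whenever the replace arg does get wired up, here is a rough, self-contained sketch of the check described above. It is not the great_tables implementation: `claim_slots` and the `Slots` mapping are hypothetical, and the sketch assumes a spanner occupies one (level, column) slot per column it covers.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical registry: maps (level, column) to the id of the spanner in that slot
Slots = Dict[Tuple[int, str], str]


def claim_slots(
    slots: Slots,
    spanner_id: str,
    columns: Sequence[str],
    level: int,
    replace: bool = False,
) -> Slots:
    """Return a new registry with the spanner placed, or raise if a slot is taken."""
    taken = {col: slots[(level, col)] for col in columns if (level, col) in slots}
    if taken and not replace:
        raise ValueError(
            f"Level {level} is already occupied for columns {sorted(taken)}; "
            "pass replace=True to overwrite."
        )
    updated = dict(slots)  # leave the caller's registry untouched
    for col in columns:
        updated[(level, col)] = spanner_id
    return updated


# Mirrors the last two calls in the test above
slots = claim_slots({}, "two_above", ["currency"], level=1)
slots = claim_slots(slots, "replacement", ["currency"], level=1, replace=True)

try:
    claim_slots(slots, "clash", ["currency"], level=1)  # replace defaults to False
except ValueError as err:
    print(err)
```

Returning a new mapping rather than mutating in place keeps the sketch in line with the method-chaining style of the examples, where each call yields a new table object.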

rich-iannone commented 1 month ago

Also, the following example fails:

from great_tables import GT, md, html
from great_tables.data import gtcars

gtcars_mini = gtcars[["mfr", "model", "year", "hp", "trq", "msrp"]].tail(10)

(
    GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
    .tab_spanner(label=md("*Performance*"), columns=["hp", "trq"])
    .tab_header(
        title=html("Data listing from <strong>gtcars</strong>"),
        subtitle=html("A <span style='font-size:12px;'>small selection</span> of great cars."),
    )
    .cols_label(year="Year Produced", hp="HP", trq="Torque", msrp="Price (USD)")
    .fmt_integer(columns=["year", "hp", "trq"], use_seps=False)
    .fmt_currency(columns="msrp")
    .tab_source_note(source_note="Source: the gtcars dataset within the Great Tables package.")
)

I traced it to the validation check:

if id in crnt_spanner_ids:
    raise ValueError(f"Spanner id {id} already exists.")

That check is good, but when the id is derived from a label and the label was created with md(), the id ends up being a Text object rather than a string. I think we need to handle the creation of id values from Text. Maybe like this?

if id is None:
    if hasattr(label, "text"):  # <- We could most probably have a better conditional stmt here
        id = label.text
    else:
        id = label

Then the example runs (but we still get bitten by https://github.com/posit-dev/great-tables/issues/273) and produces a GT table.
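
To make the idea above concrete outside the library, here is a minimal, self-contained sketch. The `Text` dataclass is only a stand-in for whatever md() and html() return (the only assumption is a `.text` attribute), and `resolve_spanner_id` is a hypothetical helper name, not a function in great_tables. It uses `isinstance` as one possible alternative to the `hasattr` check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


@dataclass
class Text:
    """Stand-in for the object returned by md()/html(); only .text matters here."""

    text: str


def resolve_spanner_id(
    label: Union[str, Text],
    spanner_id: Optional[str],
    existing_ids: Sequence[str],
) -> str:
    """Derive a string id from the label when none is given, then check uniqueness."""
    if spanner_id is None:
        # Unwrap Text labels so the id is always a plain string
        spanner_id = label.text if isinstance(label, Text) else label
    if spanner_id in existing_ids:
        raise ValueError(f"Spanner id {spanner_id} already exists.")
    return spanner_id


performance_id = resolve_spanner_id(Text("*Performance*"), None, [])
print(performance_id)
```

One side effect worth noting: with this approach the derived id keeps the raw markdown (asterisks included), so two labels that differ only in markup would still get distinct ids.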