posit-dev / great-tables

Make awesome display tables using Python.
https://posit-dev.github.io/great-tables/
MIT License

Feature request: add argument to `data_color` that truncates values outside of the domain instead of setting them to `nan` #430

Open thomascamminady opened 2 months ago

thomascamminady commented 2 months ago

Currently, `data_color` treats values that fall outside the prescribed domain like nan/null values:


(ref: https://posit-dev.github.io/great-tables/reference/GT.html#great_tables.GT.data_color)

I think it would be useful to add a flag that enables a different behavior: instead of being treated as nan/null, out-of-domain values would be truncated (clamped) to the min/max of the domain. E.g.:

import great_tables as gt

gt.GT(gt.data.exibble).data_color(
    columns="currency",
    palette=["red", "green"],
    domain=[0, 50],
    na_color="lightgray",
    truncate=True,         # <<<<<<<<<<< This could be the new flag (proposed, does not exist yet).
)

Then the entries 65100.0 and 1325.81 would get the same color as a value of 50.

If this feature makes sense, `_rescale_numeric` could be a good place to add it: https://github.com/posit-dev/great-tables/blob/11660a66b9291e137db93a74b985941f4a63ad90/great_tables/_data_color/base.py#L572

The change could look like this:


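    if truncate:
        # Clamp scaled values into [0, 1]; missing values stay nan.
        scaled_vals = [min(1, max(0, x)) if not is_na(df, x) else np.nan for x in scaled_vals]
    else:
        # Current behavior: anything outside [0, 1] (or missing) becomes nan.
        scaled_vals = [x if not is_na(df, x) and 0 <= x <= 1 else np.nan for x in scaled_vals]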
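To make the two branches concrete, here is a minimal standalone sketch of the idea in plain Python. It is not the library's code: `rescale_with_truncate` is a hypothetical helper that only mirrors the logic proposed above, and it assumes missing values arrive as `None` or `nan`.

```python
import math


def rescale_with_truncate(vals, domain, truncate=False):
    """Map values onto [0, 1] relative to `domain`.

    Hypothetical helper for illustration only; it is not part of great_tables.
    """
    lo, hi = domain
    if hi == lo:
        raise ValueError("domain must span a non-zero range")

    out = []
    for x in vals:
        # Missing values always stay nan, regardless of `truncate`.
        if x is None or math.isnan(x):
            out.append(math.nan)
            continue

        scaled = (x - lo) / (hi - lo)
        if truncate:
            # Proposed behavior: clamp out-of-domain values to the domain edges.
            out.append(min(1.0, max(0.0, scaled)))
        elif 0.0 <= scaled <= 1.0:
            out.append(scaled)
        else:
            # Current behavior: out-of-domain values are treated as missing.
            out.append(math.nan)
    return out


values = [65100.0, 1325.81, 25.0, None, -5.0]
clamped = rescale_with_truncate(values, (0, 50), truncate=True)
default = rescale_with_truncate(values, (0, 50))
```

With `truncate=True`, the entries 65100.0 and 1325.81 both scale to 1.0, i.e. the same position on the palette as a value of 50, while the missing entry still falls through to `na_color`.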

AdrienDart commented 1 month ago

Hi, on top of that, a `center` argument would be nice for centering the colormap on a given value. It would also be nice to be able to define the domain with a Polars expression. Thank you for your help!
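One possible reading of the `center` idea, sketched under assumptions: build a domain that is symmetric around the chosen center, so the midpoint of the palette lands on that value, and pass the result to the existing `domain` argument. `centered_domain` is a hypothetical helper, not something great_tables provides, and the request may have meant something different.

```python
import math


def centered_domain(vals, center=0.0):
    """Return [center - d, center + d], where d is the largest distance
    of any non-missing value from `center`.

    Hypothetical helper for illustration only; not part of great_tables.
    """
    finite = [v for v in vals if v is not None and not math.isnan(v)]
    if not finite:
        raise ValueError("need at least one non-missing value")
    half_width = max(abs(v - center) for v in finite)
    return [center - half_width, center + half_width]


around_zero = centered_domain([-2.0, 1.0, 5.0])
around_one = centered_domain([-2.0, 1.0, 5.0], center=1.0)
```

Here `around_zero` is `[-5.0, 5.0]` and `around_one` is `[-3.0, 5.0]`; either list could be supplied as `domain=` so that the middle color of the palette corresponds to the center value.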