rstudio / gt

Easily generate information-rich, publication-quality tables from R
https://gt.rstudio.com

gt Table Formatted Incorrectly when I Knit to PDF #1257

Closed vishalbakshi closed 1 year ago

vishalbakshi commented 1 year ago

Description

Hi all. First of all thanks for creating and maintaining such a useful library. I am still new to gt and RMarkdown so if I should post this in the RStudio/Posit forum or open an issue in the knitr repo, please let me know. I also did not find an issue similar to this anywhere over the past few days but if there's an existing similar issue I can post this there instead.


In RStudio, when I Knit my Rmarkdown file to Word, my gt table exports correctly and fits on the document page nicely with the first column's text word wrapped.

image

When I Knit to PDF, the first column doesn't word wrap, and I lose the colormap I specified in data_color:

image

From what I understand, tab_style works only for HTML output but that restriction doesn't apply to data_color.

Reproducible example


I have created a GitHub repo containing a reproducible example, and am posting the code here as well:

Setup

# data analysis imports
library(tidyverse)
library(magrittr)

# formatting tables
library(gt)

# color palettes
library(scales)

# saving gt tables as images
library(webshot2)
library(rmarkdown)

Create Fake Data

data <- data.frame(
  outcome_statement = c(
    "This is a very long outcome statement which will require a wide column if it remains in one line",
    "This is a second very long outcome statement which will require a wide column if it remains in one line",
    "This is a third very long outcome statement which will require a wide column if it remains in one line"),
  value_1 = c(1000,2000,3000),
  value_2 = c(450000,300,5000),
  value_3 = c(8500, 750, 25)
)

Create formatted gt table

data %>%
    gt() %>% 
    tab_options(
        column_labels.font.weight = "bold",
        table.width = pct(100)
    ) %>%
    tab_style(
        style = cell_borders(
            sides = c("bottom", "right"),
            color = "#bfbfbf"
        ),
        locations = cells_body(
            columns = everything(),
            rows = everything()
        )
    ) %>%
    tab_header(
        title = "Three Values for Different Outcomes"
    ) %>%
    data_color(
        columns = c(
            value_1, 
            value_2,
            value_3),
        #colors = col_bin(colorRamp(c("#fff8eb", "#fdb734"), interpolate="spline"), domain = c(0,5005), bins = 6)
        colors = col_factor(colorRamp(c("#fff8eb", "#fdb734"), interpolate="spline"), domain = NULL)
    ) %>% 
    tab_footnote(
        footnote = "Description of Value 1",
        locations = cells_column_labels(
            columns = value_1
        )
    ) %>%
    tab_footnote(
        footnote = "Description of Value 2",
        locations = cells_column_labels(
            columns = value_2
        )
    ) %>%
    tab_footnote(
        footnote = "Description of Value 3",
        locations = cells_column_labels(
            columns = value_3
        )
    ) %>%
    cols_label(
        outcome_statement = "Outcome Statement",
        value_1 = "Value 1",
        value_2 = "Value 2",
        value_3 = "Value 3"
    )

Expected result

I expect the PDF to display the table as it is viewed (or close to) in my .Rmd:

image

Session info

R version 4.2.2 (2022-10-31) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Big Sur 11.6

Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] rmarkdown_2.20 webshot2_0.1.0 scales_1.2.1 gt_0.8.0 magrittr_2.0.3 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[9] dplyr_1.1.0 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0 tibble_3.2.0 ggplot2_3.4.1 tidyverse_2.0.0

loaded via a namespace (and not attached): [1] Rcpp_1.0.10 later_1.3.0 pillar_1.8.1 compiler_4.2.2 tools_4.2.2 digest_0.6.31 jsonlite_1.8.4
[8] timechange_0.2.0 evaluate_0.20 lifecycle_1.0.3 gtable_0.3.1 pkgconfig_2.0.3 rlang_1.1.0 cli_3.6.0
[15] rstudioapi_0.14 yaml_2.3.7 xfun_0.37 fastmap_1.1.1 withr_2.5.0 knitr_1.42 sass_0.4.5
[22] generics_0.1.3 vctrs_0.6.0 hms_1.1.2 websocket_1.4.1 grid_4.2.2 tidyselect_1.2.0 chromote_0.1.1
[29] glue_1.6.2 R6_2.5.1 processx_3.8.0 fansi_1.0.4 farver_2.1.1 tzdb_0.3.0 ps_1.7.2
[36] promises_1.2.0.1 htmltools_0.5.4 ellipsis_0.3.2 colorspace_2.1-0 utf8_1.2.3 stringi_1.7.12 munsell_0.5.0

srowelluf commented 1 year ago

Yes - the wrapping piece would be a huge help if it were fixed.

I made note of this in a discussion in November, with code using mtcars, but you did a better job of describing wrapping as the issue, whereas I noted that I found the issue in wider tables. I've been working around it just by being very careful with my column names if I'm going to PDF instead of HTML (or just knitting to Word and then saving as PDF), but fixing this would be a huge help.

https://github.com/rstudio/gt/discussions/1111

vishalbakshi commented 1 year ago

Thanks! I was so focused on looking for data_color related issues that I missed that one.

eteitelbaum commented 1 year ago

I cannot find a package that entirely solves this problem. flextable has the same problem with text not wrapping and huxtable, while the text wraps, has other formatting challenges in pdf. As an academic I have to produce pdf documents. Thus it would make such a huge difference to my workflow to have one package that could produce nice html and pdf tables inside of Quarto (as opposed to having to produce them in separate files and import them).

rich-iannone commented 1 year ago

There has been some recent work to make this better (you have to get the development version of gt with devtools, though). Setting explicit column widths now works in LaTeX/PDF. Background cell colors now also work. Here is the revised code and a screenshot of an RMarkdown (should be equivalent for Quarto) render:

---
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r}
library(gt)
library(scales)
data <- data.frame(
  outcome_statement = c(
    "This is a very long outcome statement which will require a wide column if it remains in one line",
    "This is a second very long outcome statement which will require a wide column if it remains in one line",
    "This is a third very long outcome statement which will require a wide column if it remains in one line"),
  value_1 = c(1000,2000,3000),
  value_2 = c(450000,300,5000),
  value_3 = c(8500, 750, 25)
)

data %>%
    gt() %>% 
    tab_options(
        column_labels.font.weight = "bold",
        table.width = pct(100)
    ) %>%
    tab_style(
        style = cell_borders(
            sides = c("bottom", "right"),
            color = "#bfbfbf"
        ),
        locations = cells_body(
            columns = everything(),
            rows = everything()
        )
    ) %>%
    tab_header(
        title = "Three Values for Different Outcomes"
    ) %>%
    data_color(
        columns = c(
            value_1, 
            value_2,
            value_3),
        #colors = col_bin(colorRamp(c("#fff8eb", "#fdb734"), interpolate="spline"), domain = c(0,5005), bins = 6)
        fn = col_factor(colorRamp(c("#fff8eb", "#fdb734"), interpolate = "spline"), domain = NULL)
    ) %>% 
    tab_footnote(
        footnote = "Description of Value 1",
        locations = cells_column_labels(
            columns = value_1
        )
    ) %>%
    tab_footnote(
        footnote = "Description of Value 2",
        locations = cells_column_labels(
            columns = value_2
        )
    ) %>%
    tab_footnote(
        footnote = "Description of Value 3",
        locations = cells_column_labels(
            columns = value_3
        )
    ) %>%
    cols_label(
        outcome_statement = "Outcome Statement",
        value_1 = "Value 1",
        value_2 = "Value 2",
        value_3 = "Value 3"
    ) %>%
  cols_width(
    outcome_statement ~ px(250),
    starts_with("value") ~ px(30)
  )
```


<img width="1210" alt="gt_cols_width_latex" src="https://github.com/rstudio/gt/assets/5612024/d1d42ad8-3529-4857-8b54-b4a5f07ecd6c">

I'm hoping to have something better with LaTeX soon (so it behaves a bit more like the HTML automatic column widths) but this is a step in the right direction.

eteitelbaum commented 1 year ago

Thanks! I was able to get this to run in a .Rmd file but not in Quarto. I kept getting this message:

compilation failed- error
Illegal unit of measure (pt inserted).
<to be read again> 
                   p
l.164 O
       utcome Statement & Value 1\textsuperscript{\textit{1}} & Value 2\text... 

see gt_test.log for more information.

Here are the last few lines of the log file:

LaTeX Font Info:    Trying to load font information for U+msb on input line 156.
(c:/Users/emman/AppData/Roaming/TinyTeX/texmf-dist/tex/latex/amsfonts/umsb.fd
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
) (c:/Users/emman/AppData/Roaming/TinyTeX/texmf-dist/tex/latex/microtype/mt-msb.cfg
File: mt-msb.cfg 2005/06/01 v1.0 microtype config. file: AMS symbols (b) (RS)
)
! Illegal unit of measure (pt inserted).
<to be read again> 
                   p
l.164 O
       utcome Statement & Value 1\textsuperscript{\textit{1}} & Value 2\text... 

rich-iannone commented 1 year ago

I think that LaTeX doesn’t handle px units. Try using pt units (like this: "30pt") instead in the cols_width() call.

eteitelbaum commented 1 year ago

Yes, this works:

cols_width(
    outcome_statement ~ "250pt",
    starts_with("value") ~ "30pt"
)

Thanks!
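
For anyone converting existing px() widths by hand, here is a minimal sketch of the arithmetic. It assumes the CSS convention of 96 px and 72 pt per inch (so 1 px = 0.75 pt). The helper name px_to_pt is hypothetical and not part of gt, and LaTeX's own pt is very slightly smaller than a CSS pt, so treat the results as approximate.

```r
# Hypothetical helper (not part of gt): convert pixel widths to "pt" strings.
# Assumes 96 px per inch and 72 pt per inch, i.e. 1 px = 0.75 pt.
px_to_pt <- function(px_width) {
  paste0(px_width * 0.75, "pt")
}

# Vectorised over its input, so several widths can be converted at once
widths <- px_to_pt(c(250, 40))
print(widths)
```

The resulting strings could then be written on the right-hand side of a cols_width() formula in place of px(), in the same way as the "250pt" and "30pt" strings above. I have not checked how fractional values such as "187.5pt" are handled, so rounding to whole points may be the safer choice.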

I know this is a different topic, but would it be for a similar reason that I cannot get table.font.size to work with a PDF? I have tried using point units and the pct() function as well.

rich-iannone commented 1 year ago

The table.font.size option doesn't work yet in LaTeX/PDF because the 'feature' still needs to be added. PDF is pretty fussy with font sizes in general so this addition (which is on the radar) will require some effort and testing time. I'll make a separate issue to track that but I'll close this one (preferring focused issues for LaTeX/PDF features).