atorus-research / Tplyr

https://atorus-research.github.io/Tplyr/

Inf, -Inf in tplyr table if Min, Max only contain NA values #152

Closed johanneswerner closed 8 months ago

johanneswerner commented 9 months ago

One of my treatment groups only has missing values. Why does the summary table show Inf, -Inf for it, and is there a way to display something more readable instead?

As an example, I used the CO2 dataset and added a new row with NA for conc and uptake.

library(tidyverse)
library(Tplyr)

data(CO2)
df <- CO2
df <- df %>%
  add_row(
    Plant = "Mc4",
    Type = "Colorado",
    Treatment = "nonchilled",
    conc = NA,
    uptake = NA
  )

tplyr_table(df, Plant) %>% 
  add_layer(
    group_desc(conc, by = "Treatment", where = Type == "Colorado") %>% 
      set_format_strings(
        "n"        = f_str("xx", n),
        "Mean (SD)"= f_str("xx.x (xx.xx)", mean, sd),
        "Median"   = f_str("xx.x", median),
        "Q1, Q3"   = f_str("xx, xx", q1, q3),
        "Min, Max" = f_str("xx, xx", min, max),
        "Missing"  = f_str("xx", missing)
      )
  ) %>% 
  build()

And here is the Tplyr output:

# A tibble: 6 × 18
  row_label1 row_label2 var1_Mc1 var1_Mc2 var1_Mc3 var1_Mc4    var1_Mn1 var1_Mn2 var1_Mn3 var1_Qc1 var1_Qc2 var1_Qc3
  <chr>      <chr>      <chr>    <chr>    <chr>    <chr>       <chr>    <chr>    <chr>    <chr>    <chr>    <chr>   
1 Treatment  n          ""       ""       ""       " 1"        ""       ""       ""       ""       ""       ""      
2 Treatment  Mean (SD)  ""       ""       ""       ""          ""       ""       ""       ""       ""       ""      
3 Treatment  Median     ""       ""       ""       ""          ""       ""       ""       ""       ""       ""      
4 Treatment  Q1, Q3     ""       ""       ""       ""          ""       ""       ""       ""       ""       ""      
5 Treatment  Min, Max   ""       ""       ""       "Inf, -Inf" ""       ""       ""       ""       ""       ""      
6 Treatment  Missing    ""       ""       ""       " 1"        ""       ""       ""       ""       ""       ""      
# ℹ 6 more variables: var1_Qn1 <chr>, var1_Qn2 <chr>, var1_Qn3 <chr>, ord_layer_index <int>, ord_layer_1 <int>,
#   ord_layer_2 <int>

mstackhouse commented 9 months ago

This is a duplicate of #21

This is a side effect of using na.rm=TRUE on the backend:

> min(c(NA), na.rm=TRUE)
[1] Inf
Warning message:
In min(c(NA), na.rm = TRUE) :
  no non-missing arguments to min; returning Inf
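
To make the mechanism concrete, here is a minimal base R sketch. With na.rm=TRUE an all-NA vector becomes empty, and R defines the minimum and maximum of an empty set as Inf and -Inf. The helpers safe_min() and safe_max() and the example vector formatted are hypothetical names used only for illustration; they are not part of Tplyr, and this is not how Tplyr handles the case internally.

```r
# An all-NA vector is empty once na.rm = TRUE drops the NAs, so
# min() and max() fall back to Inf and -Inf (and emit a warning).
x <- c(NA_real_, NA_real_)
suppressWarnings(min(x, na.rm = TRUE))
suppressWarnings(max(x, na.rm = TRUE))

# Hypothetical helpers (not part of Tplyr): return NA rather than
# +/-Inf when no non-missing values remain.
safe_min <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else min(x)
}
safe_max <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else max(x)
}

safe_min(c(NA, NA))
safe_max(c(3, NA, 7))

# String-level cleanup of already formatted cells: blank out any cell
# that contains "Inf". 'formatted' stands in for one var1_* column.
formatted <- c(" 1", "Inf, -Inf", "95, 1000")
cleaned <- ifelse(grepl("Inf", formatted, fixed = TRUE), "", formatted)
cleaned
```

The string cleanup could be applied to every var1_* column of the built table; whether the replacement should be an empty string or some other placeholder is a presentation choice.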

Would you prefer that this creates an empty string? Or how would you like it represented? Currently your best bet is to post-process the strings. But there are two places we could address it in Tplyr's process: