spgarbet / tangram

Table Grammar package for R
66 stars 3 forks source link

Transforms: can't overwrite summarize_chisq #70

Closed bogie closed 3 years ago

bogie commented 3 years ago

Tangram version: 0.7.1 R Version: 4.0.3

Error:

<error/tibble_error_incompatible_new_rows> New rows can't add columns. x Can't find column cell_style[["fraction"]](...) in .data. Backtrace:

  1. tangram::tangram(...)
  2. tangram:::tangram.formula(...)
  3. tangram:::cell_create_table(...)
  4. base::sapply(...)
  5. base::lapply(X = X, FUN = FUN, ...)
  6. tangram:::FUN(X[[i]], ...)
  7. base::sapply(...)
  8. base::lapply(X = X, FUN = FUN, ...)
  9. tangram:::FUN(X[[i]], ...)
    1. global::transform(...)
    2. tibble::add_row(...)

So I recently created a new project, loaded the libraries as usual, created my data frames(or rather tibbles) created my tansform and tried to create a table with the following command:

ayham.all %>% tangram(x=ARDSgroup ~ Symptom_Fever + Symptom_Cough + Symptom_Dyspnea + Symptom_Fatigue + Symptom_Tiredness +
                          Symptom_Hemoptysis + Symptom_Rhinorrhoea + Symptom_SoreThroat + Symptom_Pharyngalgia +
                          Symptom_AnginaPectoris + Symptom_Myalgia + Symptom_Headache + Symptom_Ageusia + Symptom_Anosmia + Symptom_Diarrhoea +
                          Symptom_Nausea + Symptom_Vomitting + Symptom_AbdominalPain,
                      data=., transform = tf, capture_units=F, collapse=T, test=T, digits = 2)

I previously created my transform by copy-pasting the entire lancet transform to my .R file and changing summarize_kruskal_horz and summarize_chisq. This still works in my other project(where I do the entire process each time I knit my markdown), but in the new project it started failing with the above error. Copying the dataframe over to the other project as a RDS file will not show an error.

So I went to your documentation again and modified creating my transform the following way: tf <- lancet

This works!

I then ran tangram::summarize_chisq to get the function code, copied it in my .R file and without modifying it further ran the following(it doesnt matter if I rename the summarize_chisq function to something else):

tf$Categorical$Categorical <- summarize_chisq

running the code again produces above error message. Overwriting summarize_kruskal_horz works without a problem.

tf$Cell$fraction gives the expected output:

> tf$Cell$fraction
function (numerator, denominator, format = NULL, ...) 
{
    percent <- 100 * numerator/denominator
    if (is.na(format) || is.null(format)) 
        format <- format_guess(percent)
    percent <- render_f(percent, format)
    num <- gsub("(?!^)(?=(?:\\d{3})+(?:\\.|$))", " ", 
        perl = TRUE, str_pad(numerator, nchar(as.character(denominator))))
    cell(paste0(num, " (", center_decimal(percent), "%)"), 
        ...)
}

I even tried saving my transform as a .RDS and loading it in the new project -> same error I actually only want to change from a chisq test without correction to either a fisher's exact or a chisq.test with monte carlo test(as I did in my old project)

spgarbet commented 3 years ago

Can you provide a simple reproducible example I could run? This shouldn't be happening and is surely a bug. A quick workaround if all you want is the test changed is to use the test argument to these transforms. If you have multiple different types you'll have to write your own dispatcher to deal with various permutations, but the data will be provided to the function and you can do what ever test you desire. say you have test=FUN as the argument, it will be called as follows: FUN(row, column, cell_style, ...)

bogie commented 3 years ago

I am utterly confused, after debugging this for two days, on different machines after creating an entirely new project and everything, the error just disappeared after restarting RStudio again.

I have no idea what the cause was, I just reran my code after renaming the new summarize_chisq to _chisq2 and it started working. I had already done the same before restarting RStudio, but now it works.

spgarbet commented 3 years ago

There is some issues I've been seeing recently in that dply has added some routines that collide with the names in tangram. I think the issue is the package load order. If dplyr is loaded after tangram then this occurs. Someone else reported this in an email and I cross referenced and it's the same error. I think I have a way to prevent it going forward.

I'm going to leave this ticket open as a reminder to find a way to prevent this issue.

bogie commented 3 years ago

That makes sense, looking at the code I did change the package loading! I wrote a small package loading function which installs packages, granted I use lapply so I am not entirely sure if there is some parallelisation and if the order of packages is the same, but another notable difference is loading tidyverse rather than dplyr!

Additionally I checked the transform again, "lancet" shows tangram as it's environment, while the overloaded "tf" transform naturally isnt run in the tangram environment. Is it possible to make tangram run it in its own env regardless if it is overloaded?

loadPackages <- function(package = NULL, packageList = NULL, silent = TRUE) {
    inst_pkgs <- installed.packages()
    if(!is.null(packageList)) {
        ret <- lapply(packageList, function(pkg) {
            if(!pkg %in% inst_pkgs){
                print(paste("Installing package: ",pkg))
                install.packages(pkg)
            }

            if(silent)
                suppressPackageStartupMessages(require(pkg,character.only = TRUE, quietly = T))
            else
                require(pkg, character.only=TRUE)
        })
    } else if(!is.null(package)) {
        if(!package %in% inst_pkgs) {
            print(paste("Installing package: ", package))
            install.packages(pkg)
        }
        if(silent)
            suppressPackageStartupMessages(require(package,character.only = TRUE, quietly = T))
        else
            require(package, character.only=TRUE)
    }
}

# This fails
loadPackages(packageList = c("shiny","dplyr","lubridate","shinythemes",
                              "stringr","tangram","Hmisc","kableExtra", "lubridate","plotly","tidyverse"))

# this works
library(lubridate)
library(ggplot2)
library(qwraps2)
library(zoo)
library(tidyverse)
library(survminer)
library(survival)
library(knitr)
library(kableExtra)
library(tangram)
library(Hmisc)
library(svglite)
library(rms)
library(forestplot)
library(forestmodel)
library(rpart)
library(rpart.plot)
library(plotly)
library(Gmisc)
library(grid)
library(DiagrammeR)
library(broom)
spgarbet commented 3 years ago

It's not easy to muck about with loaded environments without causing more side effects. I've added a warning with a suggestion to help the user identify this issue in the future.