immunogenomics / harmony

Fast, sensitive and accurate integration of single-cell data with Harmony
https://portals.broadinstitute.org/harmony/

Error in `dplyr::select()`: ! <text>:1:5: unexpected symbol #173

Open lucygarner opened 2 years ago

lucygarner commented 2 years ago

Hi,

I am getting the following error when I run RunHarmony on my Seurat object and I can't work out where this dplyr::select() is coming from.

seurat_object <- RunHarmony(seurat_object, group.by.vars = "donor")

Error in dplyr::select(): ! <text>:1:5: unexpected symbol
1: Use of
        ^
Run rlang::last_error() to see where the error occurred.

rlang::last_error()

<simpleError in dplyr::select(., -.data$row_id): <text>:1:5: unexpected symbol
1: Use of
        ^>

Best wishes, Lucy

sessionInfo()

R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /Filers/package/R-base/4.2.0/lib64/R/lib/libRblas.so
LAPACK: /Filers/package/R-base/4.2.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] harmony_0.1.0      Rcpp_1.0.9         patchwork_1.1.2   
 [4] forcats_0.5.2      stringr_1.4.1      dplyr_1.0.10      
 [7] purrr_0.3.5        readr_2.1.3        tidyr_1.2.1       
[10] tibble_3.1.8       ggplot2_3.3.6      tidyverse_1.3.2   
[13] sp_1.5-0           SeuratObject_4.1.2 Seurat_4.2.0      

loaded via a namespace (and not attached):
  [1] readxl_1.4.1          backports_1.4.1       plyr_1.8.7           
  [4] igraph_1.3.5          lazyeval_0.2.2        splines_4.2.0        
  [7] listenv_0.8.0         scattermore_0.8       digest_0.6.29        
 [10] htmltools_0.5.3       fansi_1.0.3           magrittr_2.0.3       
 [13] tensor_1.5            googlesheets4_1.0.1   cluster_2.1.4        
 [16] ROCR_1.0-11           tzdb_0.3.0            globals_0.16.1       
 [19] modelr_0.1.9          matrixStats_0.62.0    spatstat.sparse_2.1-1
 [22] colorspace_2.0-3      rvest_1.0.3           ggrepel_0.9.1        
 [25] haven_2.5.1           xfun_0.30             crayon_1.5.2         
 [28] jsonlite_1.8.2        progressr_0.11.0      spatstat.data_2.2-0  
 [31] survival_3.4-0        zoo_1.8-11            glue_1.6.2           
 [34] polyclip_1.10-0       gtable_0.3.1          gargle_1.2.1         
 [37] leiden_0.4.3          future.apply_1.9.1    abind_1.4-5          
 [40] scales_1.2.1          DBI_1.1.3             spatstat.random_2.2-0
 [43] miniUI_0.1.1.1        viridisLite_0.4.1     xtable_1.8-4         
 [46] reticulate_1.26       spatstat.core_2.4-4   htmlwidgets_1.5.4    
 [49] httr_1.4.4            RColorBrewer_1.1-3    ellipsis_0.3.2       
 [52] ica_1.0-3             pkgconfig_2.0.3       farver_2.1.1         
 [55] uwot_0.1.14           dbplyr_2.2.1          deldir_1.0-6         
 [58] utf8_1.2.2            tidyselect_1.2.0      labeling_0.4.2       
 [61] rlang_1.0.5           reshape2_1.4.4        later_1.3.0          
 [64] munsell_0.5.0         cellranger_1.1.0      tools_4.2.0          
 [67] cli_3.4.1             generics_0.1.3        broom_1.0.1          
 [70] ggridges_0.5.4        evaluate_0.17         fastmap_1.1.0        
 [73] yaml_2.3.5            goftest_1.2-3         knitr_1.40           
 [76] fs_1.5.2              fitdistrplus_1.1-8    RANN_2.6.1           
 [79] pbapply_1.5-0         future_1.28.0         nlme_3.1-160         
 [82] mime_0.12             ggrastr_1.0.1         xml2_1.3.3           
 [85] compiler_4.2.0        rstudioapi_0.14       beeswarm_0.4.0       
 [88] plotly_4.10.0         png_0.1-7             spatstat.utils_2.3-1 
 [91] reprex_2.0.2          stringi_1.7.8         rgeos_0.5-9          
 [94] lattice_0.20-45       Matrix_1.5-1          vctrs_0.4.2          
 [97] pillar_1.8.1          lifecycle_1.0.1       spatstat.geom_2.4-0  
[100] lmtest_0.9-40         RcppAnnoy_0.0.19      data.table_1.14.2    
[103] cowplot_1.1.1         irlba_2.3.5.1         httpuv_1.6.6         
[106] R6_2.5.1              promises_1.2.0.1      KernSmooth_2.23-20   
[109] gridExtra_2.3         vipor_0.4.5           parallelly_1.32.1    
[112] codetools_0.2-18      MASS_7.3-58.1         assertthat_0.2.1     
[115] withr_2.5.0           sctransform_0.3.5     mgcv_1.8-40          
[118] parallel_4.2.0        hms_1.1.2             grid_4.2.0           
[121] rpart_4.1.16          rmarkdown_2.17        googledrive_2.0.0    
[124] Rtsne_0.16            shiny_1.7.2           lubridate_1.8.0      
[127] ggbeeswarm_0.6.0     

lucygarner commented 2 years ago

This error also occurs if I extract the embeddings and metavars_df and run HarmonyMatrix.

I have traced the error to the following part of HarmonyMatrix.

 phi <- Reduce(rbind, lapply(vars_use, function(var_use) {
        t(onehot(meta_data[[var_use]]))
    }))

Within this, it is the onehot function that is causing the error.

harmony:::onehot

function (x) 
{
    data.frame(x) %>% tibble::rowid_to_column("row_id") %>% dplyr::mutate(dummy = 1) %>% 
        tidyr::spread(x, .data$dummy, fill = 0) %>% dplyr::select(-.data$row_id) %>% 
        as.matrix
}

Changing dplyr::select(-.data$row_id) to dplyr::select(-row_id) fixes the issue.
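The "Use of" fragment in the error appears consistent with the start of the tidyselect deprecation message about the `.data` pronoun in selections (tidyselect 1.2.0 is listed in the sessionInfo above), although that link is an inference rather than something confirmed in this thread. A minimal sketch of the same one-hot encoding with bare column names in both `spread()` and `select()` follows; the function name `onehot_no_pronoun` is hypothetical and not part of harmony.

```r
library(dplyr)  # provides %>% and select(); tibble and tidyr are called via ::

# Hypothetical stand-in for harmony:::onehot that avoids the .data pronoun
# in tidyselect contexts (the value argument of spread() and select()).
onehot_no_pronoun <- function(x) {
    data.frame(x) %>%
        tibble::rowid_to_column("row_id") %>%
        dplyr::mutate(dummy = 1) %>%
        tidyr::spread(x, dummy, fill = 0) %>%
        dplyr::select(-row_id) %>%
        as.matrix()
}

donor <- factor(c("1", "2", "1", "3", "4", "2", "4", "1"))
m <- onehot_no_pronoun(donor)
print(m)
```

Each row of the result should contain a single 1 in the column that matches the cell's donor level.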

I can't explain why it works on my MacBook but not on CentOS 7, since both machines use the same version of dplyr.
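Since `dplyr::select()` delegates column selection to tidyselect, it may be worth comparing more than the dplyr version across the two machines. The sketch below prints the installed versions of a few related packages so the output can be diffed between systems. The package list is an assumption about where a difference might lie, not a confirmed diagnosis.

```r
# Packages that take part in select(): the list is a guess, extend as needed.
pkgs <- c("dplyr", "tidyr", "tidyselect", "rlang", "lifecycle", "cli")

# Return the installed version as a string, or NA if the package is missing.
installed_version <- function(pkg) {
    tryCatch(
        as.character(utils::packageVersion(pkg)),
        error = function(e) NA_character_
    )
}

versions <- vapply(pkgs, installed_version, character(1))
print(data.frame(package = pkgs, version = versions, row.names = NULL))
```

Running this on both machines and comparing the two tables would show whether any of these dependencies differ even though dplyr matches.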

I am also confused by the fact that the onehot function I find in harmony/utils.R looks completely different:

onehot <- function(x) {
    res <- model.matrix(~0 + x)
    colnames(res) <- gsub('^x(.*)', '\\1', colnames(res))
    return(res)
}
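For comparison, here is a small sketch that runs the `model.matrix()` version on a toy factor. It uses only base R and the stats package, so it does not touch dplyr or tidyselect at all. The function is renamed `onehot_model_matrix` here purely for illustration.

```r
# Same body as the onehot() shown above from harmony/utils.R,
# renamed (hypothetically) to keep it apart from the installed version.
onehot_model_matrix <- function(x) {
    res <- model.matrix(~0 + x)                             # one indicator column per level
    colnames(res) <- gsub('^x(.*)', '\\1', colnames(res))   # strip the leading "x" prefix
    return(res)
}

donor <- factor(c("1", "2", "1", "3", "4", "2", "4", "1"))
mm <- onehot_model_matrix(donor)
print(dim(mm))
print(colnames(mm))
```

On this input the result has one row per observation and one column per donor level, which is the same shape the dplyr-based version is meant to produce.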

Irenexzwen commented 2 years ago

I got the same error today. Using the following code before running RunHarmony solved my problem:

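# Replacement for harmony:::onehot that drops the .data pronoun in select()
harmony.onehot.new <- function (x) 
{
  data.frame(x) %>% tibble::rowid_to_column("row_id") %>% dplyr::mutate(dummy = 1) %>% 
    tidyr::spread(x, .data$dummy, fill = 0) %>% dplyr::select(-row_id) %>% 
    as.matrix
}
# Evaluate the replacement inside the harmony namespace so its imports resolve
environment(harmony.onehot.new) <- asNamespace('harmony')
# Overwrite the internal onehot() that HarmonyMatrix calls
assignInNamespace("onehot", harmony.onehot.new, ns = "harmony")

lucygarner commented 2 years ago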

Thanks, I fixed it in a similar way.

lucygarner commented 2 years ago

Reprex:

library(tidyverse)

onehot <- function (x) {
    data.frame(x) %>% 
        tibble::rowid_to_column("row_id") %>% 
        dplyr::mutate(dummy = 1) %>% 
        tidyr::spread(x, .data$dummy, fill = 0) %>% 
        dplyr::select(-.data$row_id) %>% 
        as.matrix
}

donor <- factor(c("1", "2", "1", "3", "4", "2", "4", "1"))
onehot(donor)
#> Error in dplyr::select(., -.data$row_id): <text>:1:5: unexpected symbol
#> 1: Use of
#>         ^

sessionInfo()
#> R version 4.2.0 (2022-04-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: CentOS Linux 7 (Core)
#> 
#> Matrix products: default
#> BLAS:   /Filers/package/R-base/4.2.0/lib64/R/lib/libRblas.so
#> LAPACK: /Filers/package/R-base/4.2.0/lib64/R/lib/libRlapack.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] forcats_0.5.2   stringr_1.4.1   dplyr_1.0.10    purrr_0.3.5    
#> [5] readr_2.1.3     tidyr_1.2.1     tibble_3.1.8    ggplot2_3.3.6  
#> [9] tidyverse_1.3.2
#> 
#> loaded via a namespace (and not attached):
#>  [1] lubridate_1.8.0     assertthat_0.2.1    digest_0.6.29      
#>  [4] utf8_1.2.2          R6_2.5.1            cellranger_1.1.0   
#>  [7] backports_1.4.1     reprex_2.0.2        evaluate_0.17      
#> [10] httr_1.4.4          highr_0.9           pillar_1.8.1       
#> [13] rlang_1.0.5         googlesheets4_1.0.1 readxl_1.4.1       
#> [16] rstudioapi_0.14     R.utils_2.12.0      R.oo_1.25.0        
#> [19] rmarkdown_2.17      styler_1.7.0        googledrive_2.0.0  
#> [22] munsell_0.5.0       broom_1.0.1         compiler_4.2.0     
#> [25] modelr_0.1.9        xfun_0.30           pkgconfig_2.0.3    
#> [28] htmltools_0.5.3     tidyselect_1.2.0    fansi_1.0.3        
#> [31] crayon_1.5.2        tzdb_0.3.0          dbplyr_2.2.1       
#> [34] withr_2.5.0         R.methodsS3_1.8.2   grid_4.2.0         
#> [37] jsonlite_1.8.2      gtable_0.3.1        lifecycle_1.0.1    
#> [40] DBI_1.1.3           magrittr_2.0.3      scales_1.2.1       
#> [43] cli_3.4.1           stringi_1.7.8       fs_1.5.2           
#> [46] xml2_1.3.3          ellipsis_0.3.2      generics_0.1.3     
#> [49] vctrs_0.4.2         tools_4.2.0         R.cache_0.16.0     
#> [52] glue_1.6.2          hms_1.1.2           fastmap_1.1.0      
#> [55] yaml_2.3.5          colorspace_2.0-3    gargle_1.2.1       
#> [58] rvest_1.0.3         knitr_1.40          haven_2.5.1