satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0
206 stars 33 forks source link

Error in while ((it <- it + 1) < limit && abs(del) > eps) { : missing value where TRUE/FALSE needed #65

Closed morganee261 closed 3 years ago

morganee261 commented 4 years ago

Hello, I am trying to run SCTransform on a subset of my seurat object and here is the error I get : seuratobj <- SCTransform(seuratobj, vars.to.regress = "percent.mt", verbose = TRUE)

Calculating cell attributes for input UMI matrix Variance stabilizing transformation of count matrix of size 22846 by 73390 Model formula is y ~ log_umi Get Negative Binomial regression parameters per gene Using 2000 genes, 73390 cells | | 0%Error in while ((it <- it + 1) < limit && abs(del) > eps) { : missing value where TRUE/FALSE needed In addition: There were 50 or more warnings (use warnings() to see the first 50)

could you please let me know what I am doing wrong ? thanks for your help and time

morgane

ChristophH commented 4 years ago

What is the output of traceback() right after the error is triggered? Please provide a minimal working example that produces the error, so I can have a look at it.

TJCooperIL commented 3 years ago

I'm also experiencing this error (on a Seurat object). Subsetted Seurat object which triggers the error: https://u.pcloud.link/publink/show?code=XZS8XFXZuAJLR5wdLBSW3Ok7JEnYQJ2s1Ck7 Command: sample <- SCTransform(sample, verbose=TRUE) Error message:

Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 19582 by 500
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 500 cells
  |                                                                                                                                                                           |   0%Error in while ((it <- it + 1) < limit && abs(del) > eps) { : 
  missing value where TRUE/FALSE needed
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Note: I've not encountered this bug before attempting to use it with denoised data (where the expression matrix no longer contains integer values).

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_IL.UTF-8       LC_NUMERIC=C               LC_TIME=en_IL.UTF-8        LC_COLLATE=en_IL.UTF-8     LC_MONETARY=en_IL.UTF-8    LC_MESSAGES=en_IL.UTF-8   
 [7] LC_PAPER=en_IL.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_IL.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] splines   parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] velocyto.R_0.6              Matrix_1.2-18               SeuratWrappers_0.3.0        wesanderson_0.3.6           tradeSeq_1.2.01             tidyr_1.1.2                
 [7] stringr_1.4.0               slingshot_1.6.1             princurve_2.1.5             reshape2_1.4.4              pheatmap_1.0.12             patchwork_1.0.1            
[13] org.Mm.eg.db_3.11.4         AnnotationDbi_1.50.3        mixtools_1.2.0              harmony_1.0                 Rcpp_1.0.5                  ggsci_2.9                  
[19] ggpubr_0.4.0                ggplotify_0.0.5             gam_1.20                    foreach_1.5.0               dplyr_1.0.2                 destiny_3.2.0              
[25] clustree_0.4.3              ggraph_2.0.3                ggplot2_3.3.2               clusterProfiler_3.16.1      clusterExperiment_2.8.0     UpSetR_1.4.0               
[31] SingleR_1.2.4               SingleCellExperiment_1.10.1 SummarizedExperiment_1.18.2 DelayedArray_0.14.1         matrixStats_0.57.0          Biobase_2.48.0             
[37] GenomicRanges_1.40.0        GenomeInfoDb_1.24.2         IRanges_2.22.2              S4Vectors_0.26.1            BiocGenerics_0.34.0         Seurat_3.2.2               
[43] ReactomePA_1.32.0           RColorBrewer_1.1-2          GSVA_1.36.2                

loaded via a namespace (and not attached):
  [1] rsvd_1.0.3                    vcd_1.4-8                     ica_1.0-2                     zinbwave_1.10.1               class_7.3-17                 
  [6] lmtest_0.9-38                 crayon_1.3.4                  laeken_0.5.1                  MASS_7.3-53                   nlme_3.1-149                 
 [11] backports_1.1.10              qlcMatrix_0.9.7               GOSemSim_2.14.2               rlang_0.4.7                   XVector_0.28.0               
 [16] ROCR_1.0-11                   readxl_1.3.1                  irlba_2.3.3                   limma_3.44.3                  phylobase_0.8.10             
 [21] smoother_1.1                  BiocParallel_1.22.0           bit64_4.0.5                   glue_1.4.2                    rngtools_1.5                 
 [26] sctransform_0.3               VGAM_1.1-3                    DOSE_3.14.0                   haven_2.3.1                   tidyselect_1.1.0             
 [31] rio_0.5.16                    fitdistrplus_1.1-1            XML_3.99-0.5                  zoo_1.8-8                     xtable_1.8-4                 
 [36] RcppHNSW_0.3.0                magrittr_1.5                  bibtex_0.4.2.3                zlibbioc_1.34.0               rstudioapi_0.11              
 [41] miniUI_0.1.1.1                sp_1.4-2                      rpart_4.1-15                  fastmatch_1.1-0               locfdr_1.1-8                 
 [46] RcppEigen_0.3.3.7.0           tinytex_0.26                  shiny_1.5.0                   BiocSingular_1.4.0            xfun_0.18                    
 [51] cluster_2.1.0                 tidygraph_1.2.0               pcaMethods_1.80.0             tibble_3.0.3                  interactiveDisplayBase_1.26.3
 [56] ggrepel_0.8.2                 ape_5.4-1                     listenv_0.8.0                 png_0.1-7                     future_1.19.1                
 [61] withr_2.3.0                   slam_0.1-47                   bitops_1.0-6                  ggforce_0.3.2                 ranger_0.12.1                
 [66] plyr_1.8.6                    cellranger_1.1.0              GSEABase_1.50.1               sparsesvd_0.2                 e1071_1.7-3                  
 [71] pillar_1.4.6                  TTR_0.24.2                    scatterplot3d_0.3-41          kernlab_0.9-29                graphite_1.34.0              
 [76] europepmc_0.4                 xts_0.12.1                    DelayedMatrixStats_1.10.1     vctrs_0.3.4                   ellipsis_0.3.1               
 [81] generics_0.0.2                urltools_1.7.3                NMF_0.23.0                    tools_4.0.2                   foreign_0.8-79               
 [86] rncl_0.8.4                    munsell_0.5.0                 tweenr_1.0.1                  proxy_0.4-24                  fgsea_1.14.0                 
 [91] HSMMSingleCell_1.8.0          fastmap_1.0.1                 compiler_4.0.2                abind_1.4-5                   httpuv_1.5.4                 
 [96] segmented_1.2-0               ExperimentHub_1.14.2          pkgmaker_0.31.2               plotly_4.9.2.1                GenomeInfoDbData_1.2.3       
[101] gridExtra_2.3                 edgeR_3.30.3                  lattice_0.20-41               deldir_0.1-29                 later_1.1.0.1                
[106] BiocFileCache_1.12.1          jsonlite_1.7.1                ggplot.multistats_1.0.0       scales_1.1.1                  docopt_0.7.1                 
[111] graph_1.66.0                  pbapply_1.4-3                 carData_3.0-4                 genefilter_1.70.0             lazyeval_0.2.2               
[116] promises_1.1.1                spatstat_1.64-1               car_3.0-10                    doParallel_1.0.15             goftest_1.2-2                
[121] spatstat.utils_1.17-0         reticulate_1.16               checkmate_2.0.0               openxlsx_4.2.2                cowplot_1.1.0                
[126] Rtsne_0.15                    forcats_0.5.0                 downloader_0.4                softImpute_1.4                uwot_0.1.8                   
[131] igraph_1.2.5                  HDF5Array_1.16.1              survival_3.2-7                yaml_2.2.1                    DDRTree_0.1.5                
[136] htmltools_0.5.0               memoise_1.1.0                 locfit_1.5-9.4                graphlayouts_0.7.0            viridisLite_0.3.0            
[141] digest_0.6.25                 assertthat_0.2.1              mime_0.9                      rappdirs_0.3.1                densityClust_0.3             
[146] registry_0.5-1                RSQLite_2.2.1                 future.apply_1.6.0            remotes_2.2.0                 data.table_1.13.0            
[151] blob_1.2.1                    RNeXML_2.4.5                  labeling_0.3                  fastICA_1.2-2                 shinythemes_1.1.2            
[156] Rhdf5lib_1.10.1               AnnotationHub_2.20.2          RCurl_1.98-1.2                monocle_2.16.0                broom_0.7.1                  
[161] hms_0.5.3                     rhdf5_2.32.3                  colorspace_1.4-1              BiocManager_1.30.10           nnet_7.3-14                  
[166] RANN_2.6.1                    enrichplot_1.8.1              VIM_6.0.0                     R6_2.4.1                      grid_4.0.2                   
[171] ggridges_0.5.2                lifecycle_0.2.0               zip_2.1.1                     ggsignif_0.6.0                curl_4.3                     
[176] leiden_0.3.3                  robustbase_0.93-6             DO.db_2.9                     howmany_0.3-1                 qvalue_2.20.0                
[181] RcppAnnoy_0.0.16              iterators_1.0.12              htmlwidgets_1.5.2             polyclip_1.10-0               triebeard_0.3.0              
[186] purrr_0.3.4                   gridGraphics_0.5-0            reactome.db_1.70.0            mgcv_1.8-33                   globals_0.13.0               
[191] codetools_0.2-16              FNN_1.1.3                     GO.db_3.11.4                  prettyunits_1.1.1             dbplyr_1.4.4                 
[196] gridBase_0.4-7                RSpectra_0.16-0               gtable_0.3.0                  DBI_1.1.0                     tensor_1.5                   
[201] httr_1.4.2                    KernSmooth_2.23-17            stringi_1.5.3                 progress_1.2.2                farver_2.0.3                 
[206] uuid_0.1-4                    ggthemes_4.2.0                annotate_1.66.0               viridis_0.5.1                 hexbin_1.28.1                
[211] combinat_0.0-8                xml2_1.3.2                    rvcheck_0.1.8                 boot_1.3-25                   BiocNeighbors_1.6.0          
[216] ade4_1.7-15                   BiocVersion_3.11.1            DEoptimR_1.0-8                bit_4.0.4                     scatterpie_0.1.5             
[221] spatstat.data_1.4-3           pkgconfig_2.0.3               rstatix_0.6.0
ChristophH commented 3 years ago

The problem seems to be that some lowly detected genes cause errors during the model fitting step. sctransform::vst() assumes integer values and removes genes that are not detected in at least 5 cells (default value of min_cells parameter). Any count larger than 0 is a detection event. If I apply a more stringent filtering that also removes genes that do not add up to at least 5, vst() works.

> library('Matrix')
> library('Seurat')
> library('sctransform')
> 
> s <- readRDS('~/Downloads/SCTransform_Error.rds')
> 
> cm <- s$RNA@counts
> dim(cm)
[1] 19582   500
> cm <- cm[rowSums(cm>0) >= 5 & rowSums(cm) >= 5, ]
> dim(cm)
[1] 10618   500
> vst_out <- vst(umi=cm)
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 10618 by 500
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 500 cells
  |==============================================================================| 100%
There are 7 estimated thetas smaller than 1e-07 - will be set to 1e-07
Found 141 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 10618 genes
  |==============================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 8.079988 secs
There were 50 or more warnings (use warnings() to see the first 50)

plot_model_pars(vst_out) image

But why would you re-normalize de-noised data? I doubt that is appropriate.

If you want to smooth the data, you can do so after fitting the regularized models as shown in this vignette

ktrns commented 3 years ago

Hi.

I am getting this error, which seems related:

Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 7033 by 150
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 150 cells
  |                                                                                                                                                     |   0%Error in while ((it <- it + 1) < limit && abs(del) > eps) { : 
  missing value where TRUE/FALSE needed

I have successfully been running the code for this very dataset before, and it worked, and comparing the sctransform packages, I went from 0.20 to 0.31. Any idea?

This is a SmartSeq-2 dataset, and I know that sctransform is rather for umi-based data. Still wondering, as it seemed to technically run before.

Best and many thanks in advance Katrin

ChristophH commented 3 years ago

Hi Katrin, this is likely related to a bug that is triggered by some SmartSeq2 data. Could you try the develop branch? There should be a fix for the problem.

remotes::install_github("ChristophH/sctransform@develop")

ktrns commented 3 years ago

Am I doing something wrong?

> remotes::install_github("ChristophH/sctransform@vdevelop")
Downloading GitHub repo ChristophH/sctransform@vdevelop
Error in utils::download.file(url, path, method = method, quiet = quiet,  : 
  cannot open URL 'https://api.github.com/repos/ChristophH/sctransform/tarball/vdevelop'

Thanks for your quick response Katrin

ChristophH commented 3 years ago

Sorry, my initial reply had a typo. The command is remotes::install_github("ChristophH/sctransform@develop")

morganee261 commented 3 years ago

I reinstalled the package and it is working fine now. thanks

nasjr08 commented 3 years ago

Hi Christoph,

I have recently started analysing my sc data and I got this error. I have been following a non-Seurat workflow where instead I use the SingleCellExperiment package to create an sce object (SingleCellExperiment() function) and did some filtering including low library size ,cells, low feature count and high mito percentage. ... I tried using the vst() function for variance stabilizing transformation on the data and got the above error.

The traceback output was:

11. stop(condition) 10. signalConditions(obj, exclude = getOption("future.relay.immediate", "immediateCondition"), resignal = resignal, ...) 9. signalConditionsASAP(obj, resignal = FALSE, pos = ii) 8. resolve.list(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE) 7. resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE) 6. value.list(fs) 5. value(fs) 4. future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...), get_chunk = [, expr = expr, envir = envir, future.globals = future.globals, future.packages = future.packages, future.scheduling = future.scheduling, future.chunk.size = future.chunk.size, future.stdout = future.stdout, ... 3. future_lapply(X = index_lst, FUN = function(indices) { umi_bin_worker <- umi_bin[indices, , drop = FALSE] if (method == "poisson") { return(fit_poisson(umi = umi_bin_worker, model_str = model_str, ... 2. get_model_pars(genes_step1, bin_size, umi, model_str, cells_step1, method, data_step1, theta_given, theta_estimation_fun, verbosity) 1. sctransform::vst(counts, latent_var = c("log_umi"), return_gene_attr = TRUE, return_cell_attr = TRUE, verbosity = 0)

This was performed on the raw counts prior to any other normalization.

ktrns commented 3 years ago

Sorry, my initial reply had a typo. The command is remotes::install_github("ChristophH/sctransform@develop")

I finally got to it again. I installed the development version, and it seems working again. Thanks for your help.