MarioniLab / DropletUtils

Clone of the Bioconductor repository for the DropletUtils package.
https://bioconductor.org/packages/devel/bioc/html/DropletUtils.html

read10xCounts error (help troubleshooting) #97

Closed france-hub closed 1 year ago

france-hub commented 1 year ago

Hello, thanks for your package. Until now I have analyzed my data using DropletUtils v1.14.2. I recently changed laptops and installed the newest release. Starting from the same raw data as before, I now get an error after running read10xCounts.

My script:

# Load raw counts x126
dirs <- list.dirs(".", recursive = FALSE, full.names = TRUE)
dirs
names(dirs) <- basename(dirs)
dirs.x126 <- dirs[grepl("x126", dirs)]

#read10xCounts 
sce.x126 <- read10xCounts(dirs.x126)
any(is.na(counts(sce.x126)))
TRUE

I went back to the old laptop, ran the same script, and got:

any(is.na(counts(sce.x126)))
FALSE

Then, with the new version, when I filter I get:

sce.x126 <- sce.x126[rowSums(counts(sce.x126) > 0) > 0, ]
Error: logical subscript contains NAs

With the newer version, I also get these warnings after read10xCounts:

Warning messages:
1: In scan(file, nmax = nz, quiet = TRUE, what = list(i = integer(),  :
  number of items read is not a multiple of the number of columns
2: In scan(file, nmax = nz, quiet = TRUE, what = list(i = integer(),  :
  embedded nul(s) found in input
3: readMM(): expected 21068824 entries but found only 15654217 

Would you mind helping me understand where my mistake is? Thanks.

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin22.1.0 (64-bit)
Running under: macOS Ventura 13.0

Matrix products: default
LAPACK: /opt/homebrew/Cellar/r/4.2.2/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocManager_1.30.19         SeuratDisk_0.0.0.9020       purrr_1.0.0                 scry_1.10.0                
 [5] tradeSeq_1.12.0             slingshot_2.6.0             TrajectoryUtils_1.6.0       princurve_2.1.6            
 [9] patchwork_1.1.2             stringr_1.5.0               magrittr_2.0.3              ggpubr_0.5.0               
[13] RColorBrewer_1.1-3          ComplexHeatmap_2.14.0       dplyr_1.0.10                scds_1.14.0                
[17] scater_1.26.1               scuttle_1.8.3               readxl_1.4.1                Matrix_1.5-3               
[21] LSD_4.1-0                   ggplot2_3.4.0               DropletUtils_1.19.2         cowplot_1.1.1              
[25] rstudioapi_0.14             SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0 Biobase_2.58.0             
[29] GenomicRanges_1.50.2        GenomeInfoDb_1.34.6         IRanges_2.32.0              S4Vectors_0.36.1           
[33] BiocGenerics_0.44.0         MatrixGenerics_1.10.0       matrixStats_0.63.0         

loaded via a namespace (and not attached):
  [1] scattermore_0.8           R.methodsS3_1.8.2         SeuratObject_4.1.3        tidyr_1.2.1              
  [5] bit64_4.0.5               irlba_2.3.5.1             DelayedArray_0.24.0       R.utils_2.12.2           
  [9] data.table_1.14.6         RCurl_1.98-1.9            doParallel_1.0.17         generics_0.1.3           
 [13] ScaledMatrix_1.6.0        callr_3.7.3               usethis_2.1.6             RANN_2.6.1               
 [17] future_1.30.0             bit_4.0.5                 spatstat.data_3.0-0       httpuv_1.6.7             
 [21] assertthat_0.2.1          viridis_0.6.2             promises_1.2.0.1          fansi_1.0.3              
 [25] igraph_1.3.5              DBI_1.1.3                 htmlwidgets_1.6.0         spatstat.geom_3.0-3      
 [29] ellipsis_0.3.2            backports_1.4.1           deldir_1.0-6              sparseMatrixStats_1.10.0 
 [33] vctrs_0.5.1               remotes_2.4.2             ROCR_1.0-11               abind_1.4-5              
 [37] cachem_1.0.6              withr_2.5.0               progressr_0.12.0          sctransform_0.3.5        
 [41] prettyunits_1.1.1         goftest_1.2-3             cluster_2.1.4             lazyeval_0.2.2           
 [45] crayon_1.5.2              arrow_10.0.1              hdf5r_1.3.7               spatstat.explore_3.0-5   
 [49] edgeR_3.40.1              pkgconfig_2.0.3           nlme_3.1-161              vipor_0.4.5              
 [53] pkgload_1.3.2             devtools_2.4.5            rlang_1.0.6               globals_0.16.2           
 [57] lifecycle_1.0.3           miniUI_0.1.1.1            rsvd_1.0.5                cellranger_1.1.0         
 [61] polyclip_1.10-4           lmtest_0.9-40             carData_3.0-5             Rhdf5lib_1.20.0          
 [65] zoo_1.8-11                beeswarm_0.4.0            ggridges_0.5.4            GlobalOptions_0.1.2      
 [69] processx_3.8.0            png_0.1-8                 viridisLite_0.4.1         rjson_0.2.21             
 [73] bitops_1.0-7              R.oo_1.25.0               KernSmooth_2.23-20        rhdf5filters_1.10.0      
 [77] pROC_1.18.0               DelayedMatrixStats_1.20.0 shape_1.4.6               parallelly_1.33.0        
 [81] spatstat.random_3.0-1     rstatix_0.7.1             ggsignif_0.6.4            beachmat_2.14.0          
 [85] scales_1.2.1              memoise_2.0.1             plyr_1.8.8                ica_1.0-3                
 [89] zlibbioc_1.44.0           compiler_4.2.2            dqrng_0.3.0               clue_0.3-63              
 [93] fitdistrplus_1.1-8        cli_3.5.0                 XVector_0.38.0            urlchecker_1.0.1         
 [97] listenv_0.9.0             pbapply_1.6-0             ps_1.7.2                  MASS_7.3-58.1            
[101] mgcv_1.8-41               tidyselect_1.2.0          stringi_1.7.8             BiocSingular_1.14.0      
[105] locfit_1.5-9.7            ggrepel_0.9.2             tools_4.2.2               future.apply_1.10.0      
[109] parallel_4.2.2            circlize_0.4.15           foreach_1.5.2             gridExtra_2.3            
[113] Rtsne_0.16                digest_0.6.31             shiny_1.7.4               Rcpp_1.0.9               
[117] car_3.1-1                 broom_1.0.2               later_1.3.0               RcppAnnoy_0.0.20         
[121] httr_1.4.4                colorspace_2.0-3          fs_1.5.2                  tensor_1.5               
[125] reticulate_1.26           splines_4.2.2             uwot_0.1.14               spatstat.utils_3.0-1     
[129] sp_1.5-1                  xgboost_1.6.0.1           plotly_4.10.1             sessioninfo_1.2.2        
[133] xtable_1.8-4              jsonlite_1.8.4            R6_2.5.1                  profvis_0.3.7            
[137] pillar_1.8.1              htmltools_0.5.4           mime_0.12                 glue_1.6.2               
[141] fastmap_1.1.0             BiocParallel_1.32.5       BiocNeighbors_1.16.0      codetools_0.2-18         
[145] pkgbuild_1.4.0            utf8_1.2.2                lattice_0.20-45           spatstat.sparse_3.0-0    
[149] tibble_3.1.8              curl_4.3.3                ggbeeswarm_0.7.1          leiden_0.4.3             
[153] survival_3.4-0            limma_3.54.0              munsell_0.5.0             GetoptLong_1.0.5         
[157] rhdf5_2.42.0              GenomeInfoDbData_1.2.9    iterators_1.0.14          HDF5Array_1.26.0         
[161] reshape2_1.4.4            gtable_0.3.1              Seurat_4.3.0   

LTLA commented 1 year ago

I would guess that the files on your new laptop are corrupted. The warning messages from Matrix::readMM are pretty suggestive: the MatrixMarket file records the expected number of entries in its own header, so the mismatch (21068824 expected, only 15654217 found) shows that the file is not self-consistent, i.e. it is incomplete or otherwise damaged.
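The header check described above can be reproduced by hand. The following is a minimal sketch, not part of DropletUtils or Matrix: `check_mtx_entries` is a hypothetical helper that reads a MatrixMarket coordinate file, takes the declared entry count from the size line, and counts the data lines that are actually present. It assumes one entry per line and that the file can still be read as text; a badly damaged gzip archive may fail before the count is reached.

```r
# Hypothetical helper (an illustration, not a DropletUtils/Matrix function):
# compare the entry count declared in a MatrixMarket header with the number
# of entry lines actually present in the file.
check_mtx_entries <- function(path) {
    # gzfile() reads gzip-compressed files and also plain uncompressed ones.
    con <- gzfile(path, "r")
    on.exit(close(con))
    lines <- readLines(con, warn = FALSE)

    # Drop '%' comment lines (including the '%%MatrixMarket' banner) and blanks.
    body <- lines[!startsWith(lines, "%") & nzchar(trimws(lines))]

    # The first remaining line is the size line: rows, columns, entries.
    size <- as.numeric(strsplit(trimws(body[1]), "[[:space:]]+")[[1]])

    c(declared = size[3], found = length(body) - 1)
}

# Small demonstration on a deliberately truncated file: the header declares
# three entries but only two follow.
demo_path <- tempfile(fileext = ".mtx")
writeLines(c(
    "%%MatrixMarket matrix coordinate integer general",
    "3 2 3",
    "1 1 5",
    "2 2 1"
), demo_path)
check_mtx_entries(demo_path)
```

For the data in this thread, the call would look something like `check_mtx_entries(file.path(dirs.x126[1], "matrix.mtx.gz"))`, assuming the sample directories use the compressed Cell Ranger layout; older outputs store an uncompressed `matrix.mtx` instead. If `declared` and `found` differ, the file should be copied or downloaded again.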

france-hub commented 1 year ago

Oh, all right. Thank you for the clear explanation.
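Since the same raw data read cleanly on the old laptop, one way to confirm that the copy on the new machine is the damaged one is to compare checksums between the two. This is only a sketch: `checksum_dir` is a hypothetical helper built on `tools::md5sum`, and the directory and file names in the usage note are placeholders.

```r
# Hypothetical helper: MD5 checksums for every file under a directory,
# named by path relative to that directory so results from two machines
# can be compared directly.
checksum_dir <- function(dir) {
    rel <- list.files(dir, recursive = TRUE)
    sums <- tools::md5sum(file.path(dir, rel))
    stats::setNames(unname(sums), rel)
}

# Report the relative paths whose checksums differ or that exist on one
# side only.
changed_files <- function(old, new) {
    all_names <- union(names(old), names(new))
    same <- old[all_names] == new[all_names]
    all_names[is.na(same) | !same]
}
```

On the old laptop, one could run `saveRDS(checksum_dir("sample_dir"), "checksums.rds")`, copy the small `.rds` file across, and then call `changed_files(readRDS("checksums.rds"), checksum_dir("sample_dir"))` on the new laptop. Any file listed there did not survive the transfer intact.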