Closed MwrulesC closed 6 years ago
This sounds more like an issue with scater's runTSNE
rather than mnnCorrect
, especially if you can successfully generate a PCA off the MNN-corrected values. Indeed, in the code above, you're not even running the t-SNE on the MNN-corrected values, but instead the "logcounts"
assay.
I'm a bit bemused about how it is possible to have no dimensions in the svd
function - that would suggest that you have no cells or no genes within the Rtsne
call (itself within runTSNE
). Off the top of my head, that would involve all-zero rows, or possibly a dataset full of duplicate cells.
I would suggest running debug(runTSNE)
, stepping through the function, and investigating the value of vals
that is being passed into the Rtsne
call. If this has zero dimensions, then it's a fault in runTSNE
; if it has non-zero dimensions, then it's a fault in Rtsne
.
In addition, here are some more general comments:
SingleCellExperiment
with the read10xCounts
function, no need for a manual conversion.@
to get or set data values in a S4 class. This is widely considered to be Bad Practice; use the appropriate getters/setters instead. For example, assay(blah)
and assay(blah) <- x
.sessionInfo()
. When using Bioconductor packages, make sure you're using the latest version (BioC 3.7).I was actually running on the mnnCorrect, but I think since it was giving me an error I didn't add that command to my script that I was copying from.
Thanks for the information, I'll look more into what I get when I try running debug and seeing what my output looks like.
Also thanks for the general comments, I do have a tendency to just try to hack together something that works. I'll also look more into DropletUtils, I had found it earlier but it wasn't installed on the server I was using and so I hadn't tried incorporating it yet.
In case it is still useful, here is what I have for my sessionInfo output: R version 3.4.3 (2017-11-30) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6
Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] scater_1.6.3 scran_1.6.9 SingleCellExperiment_1.0.0
[4] SummarizedExperiment_1.8.1 DelayedArray_0.4.1 matrixStats_0.53.1
[7] Biobase_2.38.0 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0
[10] IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0
[13] BiocParallel_1.12.0 Seurat_2.2.1 Matrix_1.2-12
[16] cowplot_0.9.2 ggplot2_2.2.1
loaded via a namespace (and not attached):
[1] shinydashboard_0.6.1 R.utils_2.6.0 tidyselect_0.2.4 RSQLite_2.0
[5] AnnotationDbi_1.40.0 htmlwidgets_1.0 grid_3.4.3 trimcluster_0.1-2
[9] ranger_0.9.0 Rtsne_0.13 devtools_1.13.5 munsell_0.4.3
[13] codetools_0.2-15 ica_1.0-1 DT_0.4 statmod_1.4.30
[17] withr_2.1.2 colorspace_1.3-2 knitr_1.20 rstudioapi_0.7
[21] ROCR_1.0-7 robustbase_0.92-8 dtw_1.18-1 dimRed_0.1.0
[25] lars_1.2 tximport_1.7.14 GenomeInfoDbData_1.0.0 mnormt_1.5-5
[29] bit64_0.9-7 rhdf5_2.22.0 ipred_0.9-6 diptest_0.75-7
[33] R6_2.2.2 ggbeeswarm_0.6.0 VGAM_1.0-5 locfit_1.5-9.1
[37] flexmix_2.3-14 DRR_0.0.3 bitops_1.0-6 assertthat_0.2.0
[41] SDMTools_1.1-221 scales_0.5.0 nnet_7.3-12 beeswarm_0.2.3
[45] gtable_0.2.0 ddalpha_1.3.1.1 timeDate_3043.102 rlang_0.2.0
[49] CVST_0.2-1 scatterplot3d_0.3-41 RcppRoll_0.2.2 splines_3.4.3
[53] lazyeval_0.2.1 ModelMetrics_1.1.0 acepack_1.4.1 broom_0.4.3
[57] checkmate_1.8.5 yaml_2.1.18 reshape2_1.4.3 backports_1.1.2
[61] httpuv_1.3.6.2 Hmisc_4.1-1 caret_6.0-78 tools_3.4.3
[65] lava_1.6 psych_1.7.8 gplots_3.0.1 RColorBrewer_1.1-2
[69] proxy_0.4-21 dynamicTreeCut_1.63-1 ggridges_0.4.1 Rcpp_0.12.16
[73] plyr_1.8.4 base64enc_0.1-3 progress_1.1.2 zlibbioc_1.24.0
[77] purrr_0.2.4 RCurl_1.95-4.10 prettyunits_1.0.2 rpart_4.1-13
[81] pbapply_1.3-4 viridis_0.5.0 zoo_1.8-1 sfsmisc_1.1-2
[85] cluster_2.0.6 magrittr_1.5 data.table_1.10.4-3 mvtnorm_1.0-7
[89] mime_0.5 xtable_1.8-2 XML_3.98-1.10 mclust_5.4
[93] gridExtra_2.3 compiler_3.4.3 biomaRt_2.34.2 tibble_1.4.2
[97] KernSmooth_2.23-15 R.oo_1.21.0 htmltools_0.3.6 segmented_0.5-3.0
[101] Formula_1.2-2 tidyr_0.8.0 tclust_1.3-1 lubridate_1.7.3
[105] DBI_0.8 diffusionMap_1.1-0 MASS_7.3-49 fpc_2.1-11
[109] R.methodsS3_1.7.1 gdata_2.18.0 metap_0.8 bindr_0.1.1
[113] gower_0.1.2 igraph_1.2.1 pkgconfig_2.0.1 sn_1.5-1
[117] numDeriv_2016.8-1 foreign_0.8-69 recipes_0.1.2 foreach_1.4.4
[121] vipor_0.4.5 XVector_0.18.0 prodlim_1.6.1 stringr_1.3.0
[125] digest_0.6.15 tsne_0.1-3 htmlTable_1.11.2 edgeR_3.20.9
[129] kernlab_0.9-25 shiny_1.0.5 gtools_3.5.0 modeltools_0.2-21
[133] rjson_0.2.15 nlme_3.1-131.1 bindrcpp_0.2 viridisLite_0.3.0
[137] limma_3.34.9 pillar_1.2.1 lattice_0.20-35 httr_1.3.1
[141] DEoptimR_1.0-8 survival_2.41-3 glue_1.2.0 FNN_1.1
[145] prabclus_2.2-6 iterators_1.0.9 bit_1.1-12 class_7.3-14
[149] stringi_1.1.7 mixtools_1.1.0 blob_1.1.0 latticeExtra_0.6-28
[153] caTools_1.17.1 memoise_1.1.0 dplyr_0.7.4 irlba_2.3.2
[157] ape_5.0
The Bioconductor packages you're using are from BioC 3.6. The latest version of Bioconductor is 3.7, running on R 3.5.0. I suggest you try upgrading R and the associated packages, to see whether the behaviour persists. Indeed, even if there is a bug in one of my packages, I will only be able to implement a bug fix to the latest release version, so you would have to upgrade to take advantage of it anyway.
After installing R 3.5.0 and the associated packages, it seems like it has fixed the problem. Thanks for all your help.
An unrelated question, but I was just wondering. Let's say I have the situation where I have 3 samples. Sample 1 is composed of a cells from a normal cell line, sample 2 is composed of cells from a tumor cell line, and sample 3 is composed of cells from a mixture of the two. Is there a way for me to apply mnnCorrect if I observe batch effects?
Thanks!
Yes, that's the whole raison d'etre of mnnCorrect
. As long as there are shared populations across batches, then the correction can be performed. However, the order matters as it determines how mutual nearest neighbours are identified across batches. Samples 1 and 2 do not share any cell types, so there should not be any (sensible) MNNs between them. This suggests that you should correct 1 and 3 first, before correcting 2; or vice versa, i.e., 2 and 3 and then 1.
Thanks for the information. Just to double check, would this be the order in which I list the variables when running mnnCorrect, or should I run mnnCorrect to times do this type of correction?
The former; the order in which you pass batches to the ...
argument.
Thanks for the quick response. I'll give it a try, and hopefully I'll get it to work properly.
Thanks for all your help resolving my previous problems.
Hi, I don't have much experience with scran, but I've recently been testing the use of mnnCorrect on some of the samples I have. Everything seems to work and I can generate figures up until I try to run TSNE calculations on the generated output. At that point I get the error message: Error in svd(x, nu = 0, nv = k) : a dimension is zero
After some digging around, I haven't quite figured out how to solve this error message in this particular instance. I was just wondering if you've had experience with something like this before, and if I should be doing some extra filtering or normalization pre/post analysis to avoid this.
The general overview of the code I use is the following:
rawData <- Read10X(filepath1) fixed <- SingleCellExperiment(assays = list(counts=as.matrix(rawData)), colData = list(Sample = rep('Fixed',dim(rawData)[2]))) fixed <- computeSumFactors(fixed) fixed <- normalize(fixed)
rawData <- Read10X(filepath2) mixed <- SingleCellExperiment(assays = list(counts=as.matrix(rawData)), colData = list(Sample = rep('Mixed',dim(rawData)[2]))) mixed <- computeSumFactors(mixed) mixed <- normalize(mixed)
out <- mnnCorrect(fixed@assays$data$logcounts, mixed@assays$data$logcounts)
mixed@assays$data$mnnCounts <- out$corrected[[2]] fixed@assays$data$mnnCounts <- out$corrected[[1]] combined <- cbind(fixed, mixed) combined <- runPCA(combined, exprs_values = "mnnCounts") plotReducedDim(combined, use_dimred = "PCA", colour_by = "Sample", shape_by = "Sample")
combined <- runTSNE(combined, exprs_values = "logcounts") plotTSNE(combined, colour_by = "Sample", shape_by = "Sample")
Thanks