hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0

Problems with calculate_distance() and sc3_calc_dists() functions #50

Closed sozzznanie closed 6 years ago

sozzznanie commented 6 years ago

Hello,

I give a SCESet to sc3_calc_dists() and receive this error:

sce_full <- sc3_calc_dists(sce_full)
Calculating distances between the cells...
Error in if (object@sc3$n_cores > length(distances)) { :
  argument is of length zero

When I try to run the sc3() function, I get this:

sce_full <- calculateQCMetrics(sce_full)
sce_full <- sc3(sce_full, ks = 8:10, biology = TRUE)
Setting SC3 parameters...
Setting a range of k...
Calculating distances between the cells...
starting worker pid=31908 on localhost:11626 at 15:25:45.606
starting worker pid=31924 on localhost:11626 at 15:25:46.109
starting worker pid=31940 on localhost:11626 at 15:25:46.629
Loading required package: SC3
Loading required package: SC3
Loading required package: SC3
Error: package or namespace load failed for ‘SC3’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 there is no package called ‘rhdf5’
Loading required package: rngtools
Loading required package: rngtools
Loading required package: foreach
Loading required package: pkgmaker
Loading required package: pkgmaker
Loading required package: rngtools
Loading required package: registry
Loading required package: registry
Loading required package: pkgmaker
Loading required package: registry
Attaching package: ‘pkgmaker’
Attaching package: ‘pkgmaker’
The following object is masked from ‘package:base’:
    isNamespaceLoaded
The following object is masked from ‘package:base’:
    isNamespaceLoaded
Attaching package: ‘pkgmaker’
The following object is masked from ‘package:base’:
    isNamespaceLoaded
Error in calculate_distance(dataset, i) :
  could not find function "calculate_distance"
Error in calculate_distance(dataset, i) :
  could not find function "calculate_distance"
Error in calculate_distance(dataset, i) :
  could not find function "calculate_distance"
Error in checkForRemoteErrors(val) :
  3 nodes produced errors; first error: Error in calculate_distance(dataset, i) :
  could not find function "calculate_distance"

Actually, I do have the rhdf5 package loaded, so it is strange that the workers cannot find it. I also tried to load the calculate_distance() function manually in R, but it did not help. Does anyone have an idea how to make calculate_distance() available?

Session Info

rhdf5_2.20.0
SC3_1.4.2 BiocParallel_1.10.1

wikiselev commented 6 years ago

Hi, it looks like you need to update your Bioconductor to version 3.6, which may also require upgrading R. The current version of SC3 is 1.6, which is available under Bioconductor 3.6. Could you please update everything and try again?
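Before updating, it can help to record which versions the current session actually sees. The following is a minimal base-R sketch; the helper name installed_version() is made up for this example, and the package list is just the packages mentioned in this thread.

```r
# Packages mentioned in this thread; adjust as needed.
pkgs <- c("SC3", "scater", "rhdf5")

# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package cannot be found in the current library paths.
installed_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- vapply(pkgs, installed_version, character(1))

print(R.version.string)  # version of R itself
print(versions)          # named vector: package -> version (or NA)
print(.libPaths())       # library locations searched by this session
```

An NA for rhdf5 here would mean the package is not in any library path that this session searches. Comparing .libPaths() between an interactive session and the parallel workers may be worth a look when a package loads in one but not the other.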

sozzznanie commented 6 years ago

Dear Vladimir,

Thank you for the answer. Unfortunately, I cannot easily upgrade R on the server at the Institute because it does not belong to me, but I tried it with Bioconductor 3.6 on my personal laptop. There I ran into problems even earlier, when creating a new SCESet:

sce_full = scater::newSCESet(countData=as.matrix(counts_cells), phenoData=pheno, featureData=feature)
Error in as.data.frame.default(from) :
  cannot coerce class "structure("AnnotatedDataFrame", package = "Biobase")" to a data.frame
In addition: Warning message:
'newSCESet' is deprecated.
Use 'SingleCellExperiment' instead.
See help("Deprecated")

Is it true that the SCESet object is going to be removed in the next versions?

Okay, I then loaded the SCESet object from the server and got this:

test = sce_full
sce = scater::calculateQCMetrics(test)
Error in scater::calculateQCMetrics(test) :
  object must be a SingleCellExperiment

So I tried to use a SingleCellExperiment object instead:

sce_full <- SingleCellExperiment(assays = list(counts = as.matrix(counts_cells)), colData = tmp, rowData = rowdata)
sce = scater::calculateQCMetrics(sce_full)
(treutlein_sceset = sc3(sce, ks = 8:11))
Setting SC3 parameters...
Error in assay(object, i = exprs_values) :
  'assay(, i="character", ...)' invalid subscript 'i'
'i' not in names(assays())

I do not know what to do about this last error. It looks like a problem with names, but the names seem to be fine:

all(colnames(counts_cells) == rownames(tmp))
[1] TRUE
all(rownames(counts_cells) == rownames(rowdata))
[1] TRUE

Do you happen to know the reason for that error?

Session Info

[1] scater_1.6.1 rhdf5_2.22.0 BiocInstaller_1.28.0
[4] BiocParallel_1.12.0 ggplot2_2.2.1 SC3_1.6.0
[7] SingleCellExperiment_1.0.0 SummarizedExperiment_1.8.0 DelayedArray_0.4.1
[10] matrixStats_0.52.2 Biobase_2.38.0 GenomicRanges_1.30.0
[13] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0
[16] BiocGenerics_0.24.0

wikiselev commented 6 years ago

Hi Polina,

Regarding your first error: SCESet was deprecated in the new release of Bioconductor (1st November 2017) following the creation of the SingleCellExperiment class, which is more functional and inherits some important features of other standard Bioconductor classes.

Regarding the second error: by default SC3 operates on the logcounts slot of the SingleCellExperiment object (which is expected to contain log-transformed and normalised expression values). In your case there was only the counts slot, hence the error. If you follow the SC3 vignette and create a logcounts slot, it should work. I will add a note about the logcounts slot to the vignette. I agree that at the moment it's not really obvious.
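As a minimal sketch of what creating a logcounts slot can look like, the following builds a SingleCellExperiment with both counts and logcounts assays. The toy matrix is made-up data standing in for counts_cells, and the plain log2(counts + 1) transform is only an assumption for illustration; it does not normalise for library size.

```r
library(SingleCellExperiment)

set.seed(1)
# Made-up count matrix (20 genes x 5 cells) standing in for counts_cells
counts_cells <- matrix(
  rpois(100, lambda = 5),
  nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:5))
)

# Store raw counts and a log-transformed copy side by side
sce <- SingleCellExperiment(
  assays = list(
    counts = counts_cells,
    logcounts = log2(counts_cells + 1)  # illustrative transform, not normalised
  )
)

# Both assays should now be listed
print(assayNames(sce))
```

With a logcounts assay present, a lookup by that name no longer fails with the "invalid subscript" error shown earlier in the thread.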

Hope this helps, Vlad

wikiselev commented 6 years ago

I have just updated SC3 (version 1.7.2); it now uses the counts slot to filter the genes and the logcounts slot to perform clustering. The changes should appear on Bioconductor in a couple of days. Once they are there, please have a proper look at the vignette, where this will be described in more detail. Please feel free to close this issue if I've answered your questions.

sozzznanie commented 6 years ago

Dear Vladimir,

Thank you for letting me know about the updates. I ran SC3 a year ago and I think I needed just a count matrix. That is why I got errors with SingleCellExperiment last week: I was following my old script. This morning, I created a SingleCellExperiment with normcounts, logcounts, and counts, and the sc3 function worked (I guess it was version 1.6.0) :)

Sorry for an additional question, I guess this is the last one. When I used SCESet, I normalised the data like this:

sce_full = scater::newSCESet(countData=as.matrix(counts_cells), phenoData=pheno, featureData=feature) *
cl = scran::quickCluster(sce_full)
sce_full_lg = scran::computeSumFactors(sce_full, clusters = cl)
sce_full_lg = scater::normalize(sce_full_lg)
scran_norm = exprs(sce_full_lg)

*counts_cells are UMI counts

Can I use the scran_norm table as logcounts in version 1.7.2 of SC3, or do I need to use it as normcounts and then compute logcounts as log2(as.matrix(normcounts) + 1)?

P.S. Very sad that SCESet is deprecated now. I liked it a lot, it was stable and easy to use...

Many thanks.

Best regards, Polina

wikiselev commented 6 years ago

Hi Polina,

I think scran used to automatically log-transform the data and write it to the expression slot, so you can write scran_norm directly to logcounts. But maybe @LTLA can confirm, please?

Cheers, Vlad

LTLA commented 6 years ago

Yes, log-transformation is automatic with scater::normalize(), and the output will be stored in logcounts. However, if you set return_log=FALSE, the function will not log-transform and the output will be stored in normcounts instead. Check out ?normalize for more details.
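To illustrate how the two kinds of output relate, here is a base-R sketch on made-up data. It assumes simple library-size factors in place of the scran size factors, and a pseudocount of 1 as in the log2(normcounts + 1) formula from the question; it does not call scater or scran.

```r
set.seed(1)
# Made-up UMI-like counts: 8 genes x 5 cells
counts <- matrix(
  rpois(40, lambda = 10),
  nrow = 8,
  dimnames = list(paste0("gene", 1:8), paste0("cell", 1:5))
)

# Assumed size factors: library sizes scaled to have mean 1
# (a stand-in for the factors scran::computeSumFactors would estimate)
lib_sizes <- colSums(counts)
size_factors <- lib_sizes / mean(lib_sizes)

# Normalised counts: divide each cell (column) by its size factor
normcounts <- sweep(counts, 2, size_factors, "/")

# Log-transformed values with a pseudocount of 1
logcounts <- log2(normcounts + 1)

print(round(logcounts[1:3, ], 2))
```

If a table is already log-transformed, as exprs() output is described above, applying log2(x + 1) to it again would transform the values twice, so it matters which of the two forms a given table holds.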

wikiselev commented 6 years ago

Great, many thanks @LTLA!

sozzznanie commented 6 years ago

Thank you both @LTLA and @wikiselev for your comments, they were really helpful!