Closed MarcElosua closed 3 years ago
Hi Marc, sorry for the late response, and thanks for using SCDC. I appreciate your feedback! Yes, I would like to help you debug if you could send some of your test data to me via email: meichen@live.unc.edu.
Hi @MarcElosua , I've updated the corresponding functions, please let me know if you still encounter errors when you try.
Hi @meichendong,
Apologies, I didn't end up sending the test data. I tried to reinstall the package but am still getting the same error... I'm sending it over to you now!
Thanks a lot, Marc
Hi,
Any follow-up on this issue?
I am experiencing the same problem. This happens in SCDC_basis at line 99 if the matrix var.adj has Inf or NaN values, e.g. when, for a given sample and for all cell types, some variables/genes have no counts or zero variance, and in particular if the resulting median is zero (line 78).
This is completely reproducible, but depends on the single cell data that is used.
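A minimal sketch of the failure mode described above, using made-up toy values rather than real SCDC output. The vector `y` stands in for the per-gene maxima that `my.max` computes before dividing by the median; the gene names are illustrative only.

```r
# Toy per-gene maximum variances for one sample (illustrative values).
# More than half of the entries are zero, so the median is zero.
y <- c(geneA = 0, geneB = 0, geneC = 0, geneD = 2.5, geneE = 4)

med <- median(y, na.rm = TRUE)

# Dividing by a zero median: 0 / 0 gives NaN, positive / 0 gives Inf.
scaled <- y / med

print(med)
print(scaled)
```

Any NaN or Inf produced this way is carried into var.adj, which matches the condition reported at line 99.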
I managed to run SCDC by redefining the basis matrix, essentially commenting out line 78 in SCDC_basis:
my.max <- function(x,...){
y <- apply(x,1,max, na.rm = TRUE)
# y / median(y, na.rm = T) <- HERE
}
but I am unsure as to the consequences this has on the final results.
I am running:
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)
# SCDC package installed using
devtools::install_github("renozao/xbioc")
devtools::install_github("meichendong/SCDC")
Thanks for your feedback.
Hi @eboileau , thanks for reporting the issue. Have you checked whether there is only one subject/individual in the single cell dataset? If that's the case, please use the SCDC_qc_ONE() and SCDC_prop_ONE() functions. Please let me know if this doesn't solve the problem! Thanks!
The above solved my problem back then :)
Thanks for your quick reply. No, I'm using 2 scRNA-seq datasets, with 14 and 20 samples respectively. In one dataset, for selected combinations of sample + cell type (think e.g. of markers that are highly expressed in some cell types and not in others, with significant variation between individuals), some genes have zero counts, or a median of zero, which causes the issue at line 78 in SCDC_basis. So the scaling (by the median) seems to be problematic...
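One possible way to guard the scaling step, as a sketch only. `my.max.safe` is a hypothetical name and this is not the fix that went into SCDC; it assumes that skipping the median scaling is acceptable whenever the median is zero, NA or non-finite. Note that mixing scaled and unscaled samples may affect the weights, so the effect on the final proportions would need checking.

```r
# Hypothetical guarded variant of my.max (NOT the SCDC implementation).
my.max.safe <- function(x, ...) {
  # Row-wise maximum; rows that are entirely NA yield NA instead of -Inf.
  y <- apply(x, 1, function(r) {
    if (all(is.na(r))) NA_real_ else max(r, na.rm = TRUE)
  })
  med <- median(y, na.rm = TRUE)
  # Scale by the median only when it is a finite, positive number.
  if (is.finite(med) && med > 0) y / med else y
}

# Toy input: the median of the row maxima is zero, so no scaling is applied.
x1 <- rbind(g1 = c(0, 0), g2 = c(0, NA), g3 = c(3, 1), g4 = c(NA, NA))
res1 <- my.max.safe(x1)

# Toy input with a positive median (4): the usual scaling is applied.
x2 <- rbind(g1 = c(2, 1), g2 = c(4, 0), g3 = c(6, 3))
res2 <- my.max.safe(x2)

print(res1)
print(res2)
```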
Hi, any update on the maximal variance weight (MVW) calculation (scaling by the median)?
I quickly compared with and without scaling, globally across the different cell types using the bulk RNA-seq data from your paper (fadista77, with the seger and baron data), but couldn't identify major differences. However, this may be particular to these datasets (cross-cell variation across genes, cell types and samples).
Do you want me to send you some data to reproduce the issue?
Hi @eboileau , example data would be perfect! Sorry I was planning to check this later over the weekend. Please feel free to send me the example data: meichen@live.unc.edu and I will try to figure this out over the weekend! Thanks for your patience!
Update: The main reason the error occurred is that some subjects do not provide cells from some cell types, which becomes a problem when we try to divide or calculate the variance. The functions have been updated.
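To check whether a dataset is affected, one option is to tabulate cells per subject and cell type before running SCDC. This is a sketch on toy labels; `sample.id` and `ct.id` are assumed here to be plain character vectors with one entry per cell, and in practice they would be pulled from the single cell object's metadata.

```r
# Toy labels, one entry per cell (illustrative only).
sample.id <- c("s1", "s1", "s1", "s2", "s2", "s2")
ct.id     <- c("alpha", "alpha", "beta", "alpha", "alpha", "alpha")

# Number of cells for every subject / cell type combination,
# including combinations that do not occur at all (Freq = 0).
counts <- as.data.frame(table(sample = sample.id, celltype = ct.id))

# Combinations with fewer than 2 cells cannot yield a variance.
sparse <- counts[counts$Freq < 2, ]
print(sparse)
```

Here subject s1 has a single beta cell and subject s2 has none, so both combinations are flagged.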
Thank you very much for developing SCDC. Unfortunately, my data fails at the step below. In fact, I do not understand what this step is trying to do: is it finding the maximum value of the variance matrix?
var.adj <- sapply(unique(sample.id), function(sid) {
my.max(sapply(unique(ct.id), function(id) {
y = countmat[, ct.id %in% id & sample.id %in% sid,
drop = FALSE]
apply(y, 1, var, na.rm = TRUE);
}), na.rm = TRUE)
})
Hi @peachone , thanks for digging into the problem.
This step is trying to calculate the subject-celltype specific expression variance, and extract the max value.
According to your description, I guess that for some subject/celltype combinations there might be fewer than 2 single cells, which is too few for the function to calculate the variance. If so, an easier step would be to not calculate the MVW by setting SCDC_prop(..., weight.basis = F, ...) and see if the error still occurs. Please feel free to contact me via email: meichen@live.unc.edu
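The variance problem can be reproduced on a toy matrix. This sketch mirrors the inner part of the snippet quoted above for a single subject; the matrix, gene names and labels are invented for illustration.

```r
# Toy count matrix: 2 genes x 3 cells (filled column by column).
countmat <- matrix(c(1, 4, 3, 0, 5, 2), nrow = 2,
                   dimnames = list(c("g1", "g2"), paste0("cell", 1:3)))
ct.id     <- c("alpha", "alpha", "beta")  # "beta" has a single cell
sample.id <- c("s1", "s1", "s1")

# Per-gene variance within each cell type of subject s1.
v <- sapply(unique(ct.id), function(id) {
  y <- countmat[, ct.id %in% id & sample.id %in% "s1", drop = FALSE]
  apply(y, 1, var, na.rm = TRUE)
})
print(v)  # the "beta" column is NA: var() of a single value is NA

# Row-wise maximum across cell types, skipping the NA column.
mx <- apply(v, 1, max, na.rm = TRUE)
print(mx)
```

If every cell type of a subject had fewer than 2 cells, each row would be all NA and `max(..., na.rm = TRUE)` would return -Inf with a warning, which is another way non-finite values can enter var.adj.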
I solved my problem by setting weight.basis = F, thank you very much!
Hi, First of all I would like to thank you for developing and maintaining this tool!
I am trying to deconvolute some mixtures of samples with SCDC and I'm coming across the error in the title while running SCDC_prop. I'm attaching the code I'm using below
If you need test data please let me know which is the best way of getting it to you!
Thanks a lot for your time