Closed SinhaI closed 3 years ago
Hi Sinhal,
Thanks for trying out my package!
It seems from your output object My SR.o$SR
. The SR values are not all properly calculated, e.g the 999th/1000th are NULL
.
Moreover, if the calculation of SR was correct, My SR.o$SR
would be vector
, rather than a list
, which is now your case.
But I do not know the detail of your data. So one suggestion is, since 8k cells are not a big number, you can probably extract the expr matrix with command like assay(SingleCellExperiment object, "count")
, and then give it to LandSCENT, which will make things simpler, and you may find out why the error happened.
Hope it be helpful!
Best, Weiyan
Dear Weiyan,
Thanks for your reply. As you suggested I have tried to extract the data as expr matrix. But still I am getting that error. I think I am doing something terrible wrong here. Below I am giving my code. It will be great if you can suggest something more. Let me know if you need some more information. Thanks for your help.
Best regards, Indranil.
library(LandSCENT) data(net13Jun12.m) data(phenoExample.v) library(Seurat) require(scater)
sub2.pbmc.integrated <- readRDS(file ="sub2_pbmc_SCT_14102020.rds") sub2.sce <- as.SingleCellExperiment(sub2.pbmc.integrated) example.m.sub2 <- assay(sub2.sce, "counts") example.sce <- SingleCellExperiment(assay = list(counts = example.m.sub2)) counts(example.sce) <- as(counts(example.sce), "dgCMatrix") sizeFactors(example.sce) <- librarySizeFactors(example.sce) example.sce <- logNormCounts(example.sce) example.m <- as.matrix(assay(example.sce, i = "logcounts")) min(example.m) example.m[example.m < 0.001] <- 0.1 require(AnnotationDbi) require(org.Hs.eg.db) anno.v <- mapIds(org.Hs.eg.db, keys = rownames(example.m), keytype = "SYMBOL", column = "ENTREZID", multiVals = "first") unique_anno.v <- unique(anno.v) example_New.m <- matrix(0, nrow = length(unique_anno.v), ncol = dim(example.m)[2]) for (i in seq_len(length(unique_anno.v))) { tmp <- example.m[which(anno.v == unique_anno.v[i]) ,] if (!is.null(dim(tmp))) { tmp <- colSums(tmp) / dim(tmp)[1] } example_New.m[i ,] <- example_New.m[i ,] + tmp } rownames(example_New.m) <- unique_anno.v colnames(example_New.m) <- colnames(example.m) example_New.m <- example_New.m[-which(rownames(example_New.m) %in% NA) ,] min(example_New.m) Integration.l <- DoIntegPPI(exp.m = example_New.m, ppiA.m = net13Jun12.m, log_trans = TRUE) str(Integration.l)
SR.o <- CompSRana(Integration.l, local = TRUE, mc.cores = 40)
my.data <- FetchData(sub2.pbmc.integrated,c("type", "patient", "sample", "seurat_cluster", "orig.ident")) boxplot(SR.o$SR ~ my.data$type, main = "SR values against cell types", xlab = "Cell Types", ylab = "SR values")
On Sat, Nov 14, 2020 at 10:15 AM Weiyan Chen notifications@github.com wrote:
Hi Sinhal,
Thanks for trying out my package!
It seems from your output object My SR.o$SR. The SR values are not all properly calculated, e.g the 999th/1000th are NULL.
Moreover, if the calculation of SR was correct, My SR.o$SR would be vector, rather than a list, which is now your case.
But I do not know the detail of your data. So one suggestion is, since 8k cells are not a big number, you can probably extract the expr matrix with command like assay(SingleCellExperiment object, "count"), and then give it to LandSCENT, which will make things simpler, and you may find out why the error happened.
Hope it be helpful!
Best, Weiyan
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/ChenWeiyan/LandSCENT/issues/13#issuecomment-727172730, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJWOZ47Z2FPAPMPKB4KHLE3SPZDD5ANCNFSM4TVHL6HQ .
Hi Indranil,
Since you are setting log_trans = TRUE
in DoIntegPPI
, you can actually leave the normalization aside and let the function do it by itself.
And I can tell that you are trying to convert gene IDs. So I attached a script I wrote(and additional resource file) for gene ID converting. Maybe you can grab something useful.
So basically, you only need to extract count matrix from Seurat object ---> check if the row/col names are properly assigned ---> convert the gene IDs ---> then give it to LandSCENT ---> wait to see what will happen.
Best, Weiyan
Hi Weiyan,
Thank you for this tool! I am currently also trying to use the LandSCENT package for a Seurat object and it seems that I am running into the same problem as Indranil. I am trying to get corresponding gene IDs by using the code from the tutorial (step 3.3).
My code:
> #reading in count data from Seurat object
> expression_matrix_ <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))
>
> #get gene annotation fitting to the PPI
> require(AnnotationDbi)
> require(org.Hs.eg.db)
> anno.v <- mapIds(org.Hs.eg.db, keys = rownames(expression_matrix_), keytype = "SYMBOL",
> column = "ENTREZID", multiVals = "first")
> unique_anno.v <- unique(anno.v)
> example_New.m <- matrix(0, nrow = length(unique_anno.v), ncol = dim(expression_matrix_)[2])
>
>
> for (i in seq_len(length(unique_anno.v))) {
> tmp <- expression_matrix_[which(anno.v == unique_anno.v[i]) ,]
> if (!is.null(dim(tmp))) {
> tmp <- colSums(tmp) / dim(tmp)[1]
> }
> example_New.m[i ,] <- example_New.m[i ,] + tmp
> }
>
> rownames(example_New.m) <- unique_anno.v
> colnames(example_New.m) <- colnames(expression_matrix_)
> example_New.m <- example_New.m[-which(rownames(example_New.m) %in% NA) ,]
> matrix_sce <- example_New.m
> str(matrix_sce)
My expressionmatrix looks like this:
num [1:24113, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:24113] "RP11-34P13.7" "FO538757.2" "AP006222.2" "RP4-669L17.10" ...
..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
And this is what I get out of str(matrix_sce):
num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:17215] "26155" "339451" "84069" "57801" ...
..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
So I think that there is a problem with the step of gene ID conversion, but I can't find it.
I also tried your script for gene conversion Weiyan, but I seem to do something wrong, as if I do this I get the error "scRNA-seq data should have the same gene identifier with the network!" when I do the integration step.
I would appreciate any help!
Best, Katharina
Hi Katharina,
Thanks for your feedback.
So for the gene ID conversion, the following is a quick explanation for my script:
HomoG <- read.delim("homologene.data", header = FALSE)
HomoG = NULL
if no need for homologous genes)
ID_conversion <- GeneMap_opt(HomoG = HomoG, geneID, input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")
ID_conversion$GeneMapG
will give you the transferred IDs and you can use ID_conversion$Gene.idx
to subset the original matrix, i.e New.data <- Ori.data[ID_conversion$Gene.idx ,]
. And the row names of New.data
should be assigned with ID_conversion$GeneMapG
.Hope this can help you.
Best, Weiyan
Hi Weiyan,
Thank you for this explanation! I tried this but again it did not work. Here my code:
`expressionmatrix <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))
ID_conversion <- GeneMap_opt(HomoG = NULL, rownames(expressionmatrix), input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")
New.data <- expressionmatrix[ID_conversion$Gene.idx ,] rownames(New.data) <- ID_conversion$GeneMapG
Integration.l <- DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE) `
The error I get here is
Fejl i DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE) : Non identical!!!
My New.data object looks like this:
num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(, "dimnames")=List of 2 ..$ : Named chr [1:17215] "26155" "339451" "84069" "57801" ... .. ..- attr(, "names")= chr [1:17215] "NOC2L" "KLHL17" "PLEKHN1" "HES4" ... ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
Can you see what is going wrong here?
Best, Katharina
Hi Katharina,
I forgot to mention that the output of GeneMap_opt
under GeneMapG
is a named vector.
You need to set
names(rownames(New.data)) <- NULL
right before
DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE)
Best, Weiyan
Hi Weiyan,
Thank you so much for that! This part works now but I am still having the same problems with CompSRana as before. I am sorry for bothering again but I am very stuck with this. When I try to run CompSRana, I get the warning:
Warning message: In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, : scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 did not deliver results, all values of the jobs will be affected
If I ignore this and continue I would get the same problem as Indranil. The SR values are NULL
and SR.o$SR
is a list
instead of a vector.
My Integration.l object look like this (which in my eyes seems fine):
str(Integration.l) List of 3 $ expMC: num [1:9949, 1:20286] 0.138 0.138 2.712 0.138 0.138 ... ..- attr(, "dimnames")=List of 2 .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ... .. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ... $ adjMC: num [1:9949, 1:9949] 0 1 1 1 1 1 1 1 1 1 ... ..- attr(, "dimnames")=List of 2 .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ... .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ... $ data : num [1:17215, 1:20286] 0.138 0.138 0.138 0.138 0.138 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:17215] "26155" "339451" "84069" "57801" ... .. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
In case you have a clue what I could be doing wrong I would be very happy about any tip!
Best, Katharina
Hi Katharina,
The integration of data and PPI network looks fine now.
The warning itself indicates you will not obtain SR correctly for all cells. So I am afraid it cannot be ignored for good.
Warning message:
In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, :
scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40 did not deliver results, all values of the jobs will be affected
In fact, I have encountered such situation before.
In my case, I was using Linux Server with CentOS 7. And it threw such warning because of running out of memory since the parallel task consumed too much memory. It went away after I changed the mc.cores
into a smaller number.
In your case, I am not sure if it exactly the same. But I suspect you are running on a server too, since you have at least 40 cores to run :) So lowering down the core number may help. And in your case, your cell number is reasonably large, i.e ~20k. So you probably need to check the memory use of your R process first, and then estimate a reasonable core number based on such memory, In case your overall memory would overflow.
Best, Weiyan
I closed this issue for now since inactivation. Please reopen if needed.
Hi Weiyan,
Thank you for a great package. I'm trying to analyze about 8k cells. Basically, I'm using the Seurat object and converting to the SingleCellExperiment object. I am following the tutorial and able to calculate "SR.o" . When I am trying to plot the box plot then I am getting following error.
My SR.o$SR looks like
and my.data$type looks like:
I tried to check SR using following command
I also tried to calculate InferPotency and getting following error:
Any help would be highly appreciated. Thank you.