CostaLab / scMEGA

scMEGA: Single-cell Multiomic Enhancer-based Gene regulAtory network inference
https://costalab.github.io/scMEGA
Other
36 stars 2 forks source link

PairCells error #6

Closed chrismahony closed 2 years ago

chrismahony commented 2 years ago

I am trying your vingette with a Seurat object which is scRNA + scATAC combined. But I get the following error message when I run PairCells(). My scMEGA part of my code is below:

coembed.sub <- RunDiffusionMap(coembed_harmon2, reduction = "harmony")

cols <- ArchR::paletteDiscrete(coembed.sub@meta.data[, "clusters_merge"])

p1 <- DimPlot(coembed.sub, group.by = "clusters_merge", label = TRUE, reduction = "dm", shuffle = TRUE, cols = cols) + xlab("DC 1") + ylab("DC 2")

p1

p2 <- DimPlot(coembed.sub, group.by = "cm_clusters", label = TRUE, reduction = "dm", shuffle = TRUE, cols = cols) + xlab("DC 1") + ylab("DC 2")

p2

DimPlot(coembed.sub, reduction = "dm", group.by = "clusters_merge", split.by = "assay", cols = cols)

DimPlot(coembed.sub, reduction = "dm", group.by = "cm_clusters", split.by = "assay", cols = cols)

df.pair <- PairCells(object = coembed.sub, reduction = "harmony", pair.by = "assay", ident1 = "ATAC", ident2 = "RNA")

Getting dimensional reduction data for pairing cells... Pairing cells using geodesic mode... Constructing KNN graph for computing geodesic distance .. Error in diag<-(*tmp*, value = 0) : only matrix diagonals can be replaced

If you could help I would really appreciate it. Thanks Chris

lzj1769 commented 2 years ago

Hi @chrismahony ,

Can you provide more information about your object coembed.sub? Is the data from our vignettes?

Best, Zhijian

chrismahony commented 2 years ago

Thanks for the reply.

No its not the data from your vingette. Its my own data. I integrated the RNA/ATAC as described in a Seurat Vingette: https://satijalab.org/seurat/articles/atacseq_integration_vignette.html

Here is the code I used to create my object, thanks for the help: rna<-FindVariableFeatures(rna) DefaultAssay(atac_cicero)<-'ATAC' DefaultAssay(rna)<-'RNA' gene.activities <- GeneActivity(atac_cicero, features = VariableFeatures(rna)) atac_cicero[["ACTIVITY"]] <- CreateAssayObject(counts = gene.activities) DefaultAssay(atac_cicero) <- "ACTIVITY" atac_cicero <- NormalizeData(atac_cicero) atac_cicero <- ScaleData(atac_cicero, features = rownames(atac_cicero))

transfer.anchors <- FindTransferAnchors(reference = rna, query = atac_cicero, features = VariableFeatures(object = rna), reference.assay = "RNA", query.assay = "ACTIVITY", reduction = "cca") genes.use <- VariableFeatures(rna)

refdata <- GetAssayData(rna, assay = "RNA", slot = "data")[genes.use, ] imputation <- TransferData(anchorset = transfer.anchors, refdata = refdata, weight.reduction = atac_cicero[["lsi"]], dims = 2:30) atac_cicero[["RNA"]] <- imputation DefaultAssay(atac_cicero) <- "RNA" DefaultAssay(rna) <- "RNA"

coembed <- merge(x = rna, y = atac_cicero) coembed <- ScaleData(coembed, features = genes.use, do.scale = FALSE) coembed <- RunPCA(coembed, features = genes.use, verbose = FALSE) coembed <- RunUMAP(coembed, dims = 1:30) DimPlot(coembed, group.by = c("orig.ident"))

coembed_harmon <- RunHarmony(coembed, group.by.vars = "assay", assay.use = 'ATAC', project.dim = FALSE)

coembed_harmon <- FindNeighbors(coembed_harmon, dims = 2:50, reduction = "harmony", verbose = FALSE)

coembed_harmon <- RunUMAP(coembed_harmon, dims = 2:50, reduction = "harmony") DimPlot(coembed_harmon, group.by = "clusters_merge", reduction="umap")

DimPlot(coembed_harmon, group.by = "assay", reduction="umap")

lzj1769 commented 2 years ago

Hi,

You can do the ATAC/RNA integration using the function CoembedData in scMEGA. It basically performs the same analysis as Seurat does.

See here for an example: https://costalab.github.io/scMEGA/articles/myofibroblast-preprocessing.html

Maybe you can try this and see if the problem is solved.

chrismahony commented 2 years ago

Thanks.

I built the object using your function and then proceeded through your vignette: https://costalab.github.io/scMEGA/articles/myofibroblast-GRN.html

But when I try to RunChromVar() it stalls. I started running yesterday and it was still running this morning. I have have used RunChromVar() for several ATAC objects before, it normally takes some time but no longer than a couple of hours max. If you could help that would be great.

My full code is below:

obj.coembed <- CoembedData( rna, atac_cicero, gene.activities, weight.reduction = "harmony", verbose = FALSE )

p1 <- DimPlot(obj.coembed, group.by = "assay", shuffle = TRUE, label = TRUE) p2 <- DimPlot(obj.coembed, group.by = "orig.ident", shuffle = TRUE, label = TRUE) p1+p2

obj.coembed_harmon <- RunHarmony( obj.coembed, group.by.vars = c("orig.ident", "assay"), reduction = "pca", max.iter.harmony = 30, dims.use = 1:30, project.dim = FALSE, plot_convergence = FALSE )

obj.coembed_harmon <- FindNeighbors(obj.coembed_harmon, dims = 2:50, reduction = "harmony", verbose = FALSE)

obj.coembed_harmon <- RunUMAP(obj.coembed_harmon, dims = 2:50, reduction = "harmony")

p1 <- DimPlot(obj.coembed_harmon, group.by = "assay", shuffle = TRUE, label = TRUE) p2 <- DimPlot(obj.coembed_harmon, group.by = "orig.ident", shuffle = TRUE, label = TRUE) p1 p2

coembed.sub <- RunDiffusionMap(obj.coembed_harmon, reduction = "harmony")

cols <- ArchR::paletteDiscrete(coembed.sub@meta.data[, "clusters_merge"])

p <- DimPlot(coembed.sub, group.by = "clusters_merge", label = TRUE, reduction = "dm", shuffle = TRUE, cols = cols) + xlab("DC 1") + ylab("DC 2")

p

df.pair <- PairCells(object = coembed.sub, reduction = "harmony", pair.by = "assay", ident1 = "ATAC", ident2 = "RNA")

sel_cells <- c(df.pair$ATAC, df.pair$RNA) coembed.sub2 <- coembed.sub[, sel_cells]

options(repr.plot.height = 5, repr.plot.width = 10) p3<-DimPlot(coembed.sub2, reduction = "dm", group.by = "clusters_merge", split.by = "assay", cols = cols) p3

obj.pair <- CreatePairedObject(df.pair = df.pair, object = coembed.sub2, use.assay1 = "RNA", use.assay2 = "ATAC")

obj.pair <- AddTrajectory(object = obj.pair, trajectory = c(2, 1), group.by = "clusters_merge", reduction = "dm", dims = 1:3, use.all = FALSE)

obj <- obj.pair[, !is.na(obj.pair$Trajectory)]

p <- TrajectoryPlot(object = obj, reduction = "dm", continuousSet = "blueYellow", size = 1, addArrow = FALSE)

p

pfm <- getMatrixSet( x = JASPAR2020, opts = list(collection = "CORE", tax_group = 'vertebrates', all_versions = FALSE) )

obj <- AddMotifs( object = obj, genome = BSgenome.Mmusculus.UCSC.mm10, pfm = pfm, assay = "ATAC" )

obj <- RunChromVAR( object = obj, genome = BSgenome.Mmusculus.UCSC.mm10, assay = "ATAC" )

My output from RunChromVar(): Computing GC bias per region Selecting background regions Computing deviations from background

And the Traceback:

Error in get(as.character(FUN), mode = "function", envir = envir) : object 'as.SimpleList' of mode 'function' was not found 16. selectChildren(pids[!fin], -1) 15. parallel::mccollect(wait = TRUE) 14. bpstart(BPPARAM, length(X)) 13. bpstart(BPPARAM, length(X)) 12. bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM) 11. bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM) 10. bplapply(peak_indices, compute_deviations_single, counts_mat = counts_mat, background_peaks = background_peaks, expectation = expectation) 9. bplapply(peak_indices, compute_deviations_single, counts_mat = counts_mat, background_peaks = background_peaks, expectation = expectation) 8. compute_deviations_core(counts(object), peak_indices, background_peaks, expectation, colData = colData(object)) 7. .local(object, annotations, ...) 6. chromVAR::computeDeviations(object = chromvar.obj, annotations = motif.matrix, background_peaks = bg) 5. chromVAR::computeDeviations(object = chromvar.obj, annotations = motif.matrix, background_peaks = bg) 4. RunChromVAR.ChromatinAssay(object = object[[assay]], genome = genome, motif.matrix = motif.matrix, ...) 3. RunChromVAR(object = object[[assay]], genome = genome, motif.matrix = motif.matrix, ...) 2. RunChromVAR.Seurat(object = obj, genome = BSgenome.Mmusculus.UCSC.mm10, assay = "ATAC") 1. RunChromVAR(object = obj, genome = BSgenome.Mmusculus.UCSC.mm10, assay = "ATAC")

chrismahony commented 2 years ago

I just found this: https://github.com/timoast/signac/issues/1103

When I run :

library(BiocParallel) register(SerialParam())

And then repeat RunChromVar() it completes no problem.

Ill continue with the Vingette and let you know if there are any problems.

Thanks Chris

chrismahony commented 2 years ago

Hi

I have a few more issue, if you could help that would be really great.

res_persis_split <- SelectGenes(object = obj_persis_split, labelTop1 = 0, labelTop2 = 0)

Error in order(apply(mat, 1, which.max)) : unimplemented type 'list' in 'orderVector1' 5. order(apply(mat, 1, which.max)) 4. TrajectoryHeatmap(trajATAC, varCutOff = 0, maxFeatures = nrow(trajATAC), pal = paletteContinuous(set = "solarExtra"), limits = c(-2, 2), name = "Chromatin accessibility", returnMatrix = TRUE) 3. withCallingHandlers(expr, message = function(c) if (inherits(c, classes)) tryInvokeRestart("muffleMessage")) 2. suppressMessages(TrajectoryHeatmap(trajATAC, varCutOff = 0, maxFeatures = nrow(trajATAC), pal = paletteContinuous(set = "solarExtra"), limits = c(-2, 2), name = "Chromatin accessibility", returnMatrix = TRUE)) 1. SelectGenes(object = obj_persis_split, labelTop1 = 0, labelTop2 = 0)

Although this doesn't work, I can continue and when I get to:

motif.matching_persis <- obj_persis_split@assays$ATAC@motifs@data colnames(motif.matching_persis) <- obj_persis_split@assays$ATAC@motifs@motif.names motif.matching_persis <- motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)]

I get: Error in motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)] : invalid or not-yet-implemented 'Matrix' subsetting 3. stop("invalid or not-yet-implemented 'Matrix' subsetting") 2. motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)] 1. motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)]

I dont seem to have the df.p2g object, should this have been calculated at an earlier step?

Thanks, Chris

lzj1769 commented 2 years ago

Hi @chrismahony ,

It seems that there is something wrong with gene selection.

df.p2g is an object containing the predicted peak-to-gene links that we will use to associate TFs to genes, since you didn't successfully generate the peak-to-gene links, it's not possible to run the following analysis.

I am wondering if it's possible for you to share the two objects with me so that I can test the package with your data.

Best, Zhijian

chrismahony commented 2 years ago

Thanks for this. Unfortunately, my supervisor is not comfortable with sharing the data as its not yet published.

What I could do is subset one sample form the RNA and ATAC data set and send this for you to look at?

Or perhaps we could set a zoom call to look at the data?

Thanks Chris

lzj1769 commented 2 years ago

@chrismahony

Unfortunately, my supervisor is not comfortable with sharing the data as its not yet published.

This is totally understandable.

What I could do is subset one sample form the RNA and ATAC data set and send this for you to look at? Or perhaps we could set a zoom call to look at the data?

This sounds nice. You can send me the link for data downloading via email: zhijian.li@rwth-aachen.de Then I will check it out and try to run scmega on it. Once I have some results, we can have a zoom call to discuss it.

Best, Zhijian

chrismahony commented 2 years ago

Thanks for your understanding and help.

Ill prepare these and send them to you soon.

Thanks Chris

lzj1769 commented 2 years ago

Closed with no further comments.