Open liyang24 opened 1 year ago
Hello @liyang24 I am not a specialist but here is what I did to implement the rat genome (rn6)
library(org.Rn.eg.db) library(TxDb.Rnorvegicus.UCSC.rn6.refGene)
geneAnnotationRN6 <- createGeneAnnotation(TxDb = TxDb.Rnorvegicus.UCSC.rn6.refGene, OrgDb = org.Rn.eg.db)
I had a "chr" prefix problem (chromsome are named "chr1" in the annotation but only "1" in my data) to remove the "chr" and obtain a file with adequate formate, I did, quite laboriously :
geneAnno <- geneAnnotationRN6$gene exonAnno <- geneAnnotationRN6$exon TSSAnno <- geneAnnotationRN6$TSS
geneAnno <- keepStandardChromosomes(geneAnno, pruning.mode="coarse") exonAnno <- keepStandardChromosomes(exonAnno, pruning.mode="coarse") TSSAnno <- keepStandardChromosomes(TSSAnno, pruning.mode="coarse")
geneAnno <- renameSeqlevels(geneAnno, c(chr1="1", chr2="2", chr3="3", chr4="4", chr5="5", chr6="6", chr7="7", chr8="8", chr9="9", chr10="10", chr11="11", chr12="12", chr13="13", chr14="14", chr15="15", chr16="16", chr17="17", chr18="18", chr19="19", chr20="20", chrX="X", chrY="Y")) exonAnno <- renameSeqlevels(exonAnno, c(chr1="1", chr2="2", chr3="3", chr4="4", chr5="5", chr6="6", chr7="7", chr8="8", chr9="9", chr10="10", chr11="11", chr12="12", chr13="13", chr14="14", chr15="15", chr16="16", chr17="17", chr18="18", chr19="19", chr20="20", chrX="X", chrY="Y")) TSSAnno <- renameSeqlevels(TSSAnno, c(chr1="1", chr2="2", chr3="3", chr4="4", chr5="5", chr6="6", chr7="7", chr8="8", chr9="9", chr10="10", chr11="11", chr12="12", chr13="13", chr14="14", chr15="15", chr16="16", chr17="17", chr18="18", chr19="19", chr20="20", chrX="X", chrY="Y"))
geneAnnotationRN6.2 <- createGeneAnnotation( genome = NULL, TxDb = NULL, OrgDb = NULL, genes = geneAnno, exons = exonAnno, TSS = TSSAnno, annoStyle = NULL )
then i copied the peaktogene to run it locally (because I don't known how to modify it directly) with some modifications: -I had "rn6" to the genome list and the annotation : if (genome == "rn6") { gene_anno <- geneAnnotationRN6.2
then I run SelectGenes also locally, with some modifications : I replaced RNA assy with SoupXRNA because I had to run SoupX to clean the data I added genome = "rn6" in the selectgenes function and in the calling of the peaktogene function
And I have to admit that, to my surprise, it did worked :)
PS I had to implement the motif names call solution proposed by @AmandaKedaigle in the issue " Motif matching when using mouse data https://github.com/CostaLab/scMEGA/issues/18 " every time needed (in the GetTFGeneCorrelation function, in the colnames(motif.matching)) if not the different format of the motif names blocked the analysis.
Moreover, I tried to used the jaspar2022 instead of the Jasper2020 and there is a problem with mofit name redundancy
I hope it will help and also that I did not do something stupid, but the results seems consistent with my prior knownledge of the biological system so....
Best David
Hello, according to your method, I also created a gene annotation for macaques, and SelectGenes by using the self-defined gene annotation file. However, after that, I don't know how to calculate and define the rest of this function, especially how to use the gene annotation information. Only the selection logic of genome annotation is written, but the operation of selecting genes is not carried out.
sel.tfs <- SelectTFs(object = pbmc.t.cells,
return.heatmap = TRUE,
cor.cutoff = 0.4)
df.cor <- sel.tfs$tfs ht <- sel.tfs$heatmap pdf("./heatmap_SelectTFs.pdf") t = draw(ht) dev.off()
annotation <- genes(EDB, return.type = "GRanges") exon_annotation <- exons(EDB, return.type = "GRanges") TSS_annotation <- promoters(EDB, upstream=0, downstream=1, return.type = "GRanges")
annotation <- genes(EDB, return.type = "GRanges") exon_annotation <- exons(EDB, return.type = "GRanges") TSS_annotation <- promoters(EDB, upstream=0, downstream=1, return.type = "GRanges")
geneAnno <- annotation exonAnno <- exon_annotation TSSAnno <- TSS_annotation
ucsc.levels <- str_replace(string = paste("chr", seqlevels(annotation), sep = ""), pattern = "chrMT", replacement = "chrM") seqlevels(annotation) <- ucsc.levels seqlevels(geneAnno) <- ucsc.levels seqlevels(exonAnno) <- ucsc.levels seqlevels(TSSAnno) <- ucsc.levels
geneAnnotationMmu <- createGeneAnnotation( genome = NULL, TxDb = NULL, OrgDb = NULL, genes = geneAnno, exons = exonAnno, TSS = TSSAnno, annoStyle = NULL ) SelectGenes <- function(object, atac.assay = "ATAC", rna.assay = "RNA", var.cutoff.gene = 0.9, trajectory.name = "Trajectory", distance.cutoff = 2000, groupEvery = 1, cor.cutoff = 0, fdr.cutoff = 1e-04, return.heatmap = TRUE, labelTop1 = 10, labelTop2 = 10, genome = "mihou") { if (is.character(genome)) { if (genome == "mihou") { gene_anno <- geneAnnotationMmu } else { stop("Unsupported genome annotation") } } else if (is(genome, "GRanges")) { gene_anno <- genome } else { stop("Unsupported genome annotation format") }
gene = c(),
cor = c(0.5, 0.6, 0.7)
) heatmap_data <- matrix(rnorm(100), nrow = 10, ncol = 10) rownames(heatmap_data) <- paste0("Gene", 1:10) colnames(heatmap_data) <- paste0("Cell", 1:10)
heatmap <- Heatmap(heatmap_data, name = "Gene Expression", show_row_names = TRUE, show_column_names = TRUE) return(list(p2g = tfs, heatmap = heatmap)) }
object = pbmc.t.cells,
atac.assay = "ATAC",
rna.assay = "RNA",
var.cutoff.gene = 0.9,
trajectory.name = "Trajectory",
distance.cutoff = 2000,
groupEvery = 1,
cor.cutoff = 0,
fdr.cutoff = 1e-04,
return.heatmap = TRUE,
labelTop1 = 10,
labelTop2 = 10,
genome = "mihou"
)
df.p2g <- sel.genes$p2g
ht <- sel.genes$heatmap
In addition, I also want to ask, when I am executing GetTFGeneCorrelation:
tf.gene.cor <- GetTFGeneCorrelation(object = pbmc.t.cells,
tf.use = df.cor$tfs,
gene.use = unique(df.p2g$gene),
tf.assay = "chromvar",
gene.assay = "RNA",
trajectory.name = "Trajectory")
there will be an error:
Find 553 shared features!
Creating Trajectory Group Matrix..
Some values are below 0, this could be the Motif activity matrix in which scaleTo should be set = NULL.
Continuing without depth normalization!
Smoothing...
Creating Trajectory Group Matrix..
Smoothing...
Error in tf_activity[tf.use, ] : subscript out of bounds
Calls: GetTFGeneCorrelation
My question may be stupid, and I look forward to your answer.Thank you!
Hi! i noticed that
SelectGenes
Available genome are: hg19, hg38, mm9, and mm10 can i use the genomes of other species? If so, how exactly should I do it