Closed anirudhpatir closed 2 years ago
Hi @anirudhpatir
The steps you described are almost the same as the steps that were done in the manuscript.
Only step 3 was not done, the regulons were not combined only the gene names were converted to their homolog gene name.
I cite from the methods
"The 259 human regulons from the Darmanis et al.12 data set and the human homologs of the mouse regulons were evaluated on this merged matrix to obtain the binary regulon activity containing 410 regulons."
For example if you look at this figure (from the manuscript), both the mouse and human DLX1 regulon are shown seperatly (they are in fact two seperate regulons indeed). .
As for the question on how to convert gene names to their homologs name in a different species. We don't provide a script for that. But this information is stored in gene trees databases for example in Ensembl (for more info see: https://www.ensembl.org/Help/View?id=137) and can be accessed using biomart. Biomart has an api which can be accessed both in R (biomaRt: https://bioconductor.org/packages/release/bioc/html/biomaRt.html) and python (pybiomart: https://github.com/jrderuiter/pybiomart).
When you start this analysis you'll quickly notice that there not always a one to one mapping of genes in different species (one human gene might map to multiple genes in mouse and vice verca). You could opt to only stick to the one-to-one mapping genes or use a more complicated approach, that's op to you.
I hope this helps.
Best,
Seppe
Below you can find the actual code which was used to generate these figures in the manuscript.
Note! This code is very old, so might be a bit outdated.
#####################################Create figures for Human Brain data ############################################
date()
setwd("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0")
load("new_esetHumanBrain_SC.RData")
setwd("/media/seq-srv-06/lcb/cbravo/Comparison_HM_all")
library(SCENIC)
library(data.table)
#####################################Comparison#############################################
#1. Load data
load("/media/seq-srv-06/lcb/saibar/1.scRNAseq/2016-04_SCENIC/00_WorkflowApplication/SCENIC_Workflow_MouseBrain/int/4.1_tsneModuleMatrix_PCA.RData")
Mousetsne <- tsneModuleMatrixPCA
rm(tsneModuleMatrixPCA)
load("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0/Processing/int/4.2_tsneAUC_PCA.RData")
Humantsne <- tsneAUCPCA
rm(tsneAUCPCA)
load("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0/Figure production/HumanVSMouse_Data/Mouse_cellsAUC.RData")
MouseAUC <- cellsAUC
rm(cellsAUC)
load("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0/Figure production/HumanVSMouse_Data/Human_cellsAUC.RData")
HumanAUC <- cellsAUC
rm(cellsAUC)
load("/media/seq-srv-06/lcb/saibar/1.scRNAseq/2016-04_SCENIC/00_WorkflowApplication/SCENIC_Workflow_MouseBrain/int/3.1_tfModules.RData")
MousetfModules <- tfModules
rm(tfModules)
load("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0/AUCell/int/3.1_tfModules.RData")
HumantfModules <- tfModules
rm(tfModules)
#2. Check with TFs have homologs for each
library(biomaRt)
human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
Mousetfs <- gsub( "\\s\\(\\d+g)","",colnames(MouseAUC))
Humantfs <- gsub( "\\s\\(\\d+g)","",colnames(HumanAUC))
Homog_tfs_MH <- getLDS(attributes = "mgi_symbol", filters = "mgi_symbol", values = Mousetfs, mart = mouse, attributesL = "hgnc_symbol", martL = human)
Homog_tfs_MH <- Homog_tfs_MH[-c(74, 13),] #Taf1 and Zfp160 have more than one homolog.
rownames(Homog_tfs_MH) <- Homog_tfs_MH[,1]
Homog_tfs_HM <- getLDS(attributes = "hgnc_symbol", filters = "hgnc_symbol", values = Humantfs, mart = human, attributesL = "mgi_symbol", martL = mouse)
#Homog_tfs_HM <- Homog_tfs_HM[-c(93,30,31,32,33,34,35,36,37,78,79,80,123,124,42,43,69),]
Homog_tfs_HM <- Homog_tfs_HM[-c(75,93,26:33,63,64,65,114,115,48,49),]
#SSX3,GTF3C2,ZFY and ZNF354A have more than one homolog.
#Homog_tfs_HM[Homog_tfs_HM[,1] == "ZNF354A",]
rownames(Homog_tfs_HM) <- Homog_tfs_HM[,1]
#When we have more than one homolog, we take one as representative.
#3. Create TF modules
# Homolog to mouse modules in human (mouse modules in human)
tfModulesM_H <- list()
for (name in Mousetfs){
namex <- paste(name, "")
Mouse_module <- colnames(MouseAUC)[grep(namex, colnames(MouseAUC))]
Targets_mouse <- MousetfModules[[Mouse_module]]
M_H_homog <- getLDS(attributes = "mgi_symbol", filters = "mgi_symbol", values = Targets_mouse, mart = mouse, attributesL = "hgnc_symbol", martL = human)
nrgenes <- paste("(",length(unique(M_H_homog[,2])), sep="")
nrgenes <- paste(nrgenes, "g)", sep="")
text <- paste(Homog_tfs_MH[name, 2], nrgenes)
tfModulesM_H[[text]] <- unique(M_H_homog[,2])
}
length(tfModulesM_H) == length(Mousetfs)
names(tfModulesM_H)
save(tfModulesM_H, file="int/tfModulesM_H.RData")
# Homolog to human modules in mouse (human modules in mouse)
tfModulesH_M <- list()
for (name in Humantfs){
namex <- paste("^", name,sep="")
namex <- paste(namex, "")
Human_module <- colnames(HumanAUC)[grep(namex, colnames(HumanAUC))]
Targets_human <- HumantfModules[[Human_module]]
H_M_homog <- getLDS(attributes = "hgnc_symbol", filters = "hgnc_symbol", values = Targets_human, mart = human, attributesL = "mgi_symbol", martL = mouse)
nrgenes <- paste("(",length(unique(H_M_homog[,2])), sep="")
nrgenes <- paste(nrgenes, "g)", sep="")
text <- paste(Homog_tfs_HM[name, 2], nrgenes)
tfModulesH_M[[text]] <- unique(H_M_homog[,2])
}
length(tfModulesH_M) == length(Humantfs)
names(tfModulesH_M)
save(tfModulesH_M, file="int/tfModulesH_M.RData")
#3. AUCell
load("/media/seq-srv-06/lcb/cbravo/HumanBrain_2.0/AUCell/int/3.2_cellsRankings.RData")
cellsRankings_human <- cellsRankings
rm(cellsRankings)
load("/media/seq-srv-06/lcb/saibar/1.scRNAseq/2016-04_SCENIC/00_WorkflowApplication/SCENIC_Workflow_MouseBrain/int/3.2_cellsRankings.RData")
cellsRankings_mouse <- cellsRankings
rm(cellsRankings)
AUC_MH <- AUCell.calcAUC(tfModulesM_H, cellsRankings_human, nCores=10)
# Using 10 cores.
# Genes in the gene sets NOT available in the dataset:
# FOS (703g): 65 (9% of 703)
# ETS1 (676g): 79 (12% of 676)
# FLI1 (650g): 107 (16% of 650)
# SF1 (607g): 35 (6% of 607)
# FOSB (565g): 58 (10% of 565)
# SOX10 (539g): 36 (7% of 539)
# KMT2A (496g): 41 (8% of 496)
# ELK3 (451g): 40 (9% of 451)
# EGR4 (436g): 38 (9% of 436)
# IRF1 (431g): 74 (17% of 431)
# JUN (374g): 30 (8% of 374)
# NR3C1 (365g): 24 (7% of 365)
# UBE2K (343g): 23 (7% of 343)
# ATF2 (332g): 20 (6% of 332)
# HLF (328g): 19 (6% of 328)
# SOX8 (318g): 18 (6% of 318)
# TEF (317g): 17 (5% of 317)
# NFE2L2 (305g): 55 (18% of 305)
# ELF1 (283g): 37 (13% of 283)
# NFKB1 (287g): 43 (15% of 287)
# BPTF (275g): 17 (6% of 275)
# BCLAF1 (256g): 18 (7% of 256)
# GATA2 (268g): 41 (15% of 268)
# EBF1 (260g): 39 (15% of 260)
# RFX3 (210g): 28 (13% of 210)
# REL (222g): 55 (25% of 222)
# IRF2 (191g): 15 (8% of 191)
# ATF3 (177g): 27 (15% of 177)
# CREB1 (181g): 17 (9% of 181)
# STAT6 (173g): 48 (28% of 173)
# ZIC3 (166g): 10 (6% of 166)
# JUNB (143g): 20 (14% of 143)
# MAF (145g): 25 (17% of 145)
# CEBPB (140g): 41 (29% of 140)
# ID2 (125g): 17 (14% of 125)
# PRDM1 (122g): 16 (13% of 122)
# KLF4 (118g): 12 (10% of 118)
# LEF1 (118g): 7 (6% of 118)
# DLX2 (109g): 6 (6% of 109)
# PRDM16 (112g): 22 (20% of 112)
# IRF9 (101g): 16 (16% of 101)
# GMEB1 (107g): 14 (13% of 107)
# STAT1 (101g): 19 (19% of 101)
# DLX5 (97g): 10 (10% of 97)
# FOXN3 (95g): 10 (11% of 95)
# NFATC2 (94g): 20 (21% of 94)
# FOSL2 (91g): 11 (12% of 91)
# ATF4 (90g): 8 (9% of 90)
# NFIC (102g): 16 (16% of 102)
# E2F1 (86g): 8 (9% of 86)
# FOXO1 (84g): 10 (12% of 84)
# EGR1 (75g): 9 (12% of 75)
# NEUROD1 (76g): 4 (5% of 76)
# RORA (76g): 4 (5% of 76)
# KLF2 (76g): 7 (9% of 76)
# ZBTB7A (73g): 6 (8% of 73)
# DLX1 (68g): 7 (10% of 68)
# ZNF263 (66g): 4 (6% of 66)
# STAT3 (62g): 10 (16% of 62)
# HIF1A (59g): 6 (10% of 59)
# ARNT2 (57g): 2 (4% of 57)
# SOX2 (56g): 6 (11% of 56)
# NFATC1 (55g): 12 (22% of 55)
# ZMAT4 (54g): 4 (7% of 54)
# DEAF1 (52g): 4 (8% of 52)
# NFYB (49g): 3 (6% of 49)
# GABPA (51g): 4 (8% of 51)
# POU3F1 (48g): 2 (4% of 48)
# JUND (44g): 4 (9% of 44)
# ZNF462 (48g): 3 (6% of 48)
# SMC3 (42g): 3 (7% of 42)
# CEBPA (43g): 4 (9% of 43)
# PBX1 (41g): 4 (10% of 41)
# FOXC1 (42g): 3 (7% of 42)
# SOX17 (42g): 3 (7% of 42)
# RORB (38g): 1 (3% of 38)
# TCF3 (36g): 3 (8% of 36)
# BHLHE41 (38g): 4 (11% of 38)
# CHD2 (35g): 2 (6% of 35)
# ELK1 (34g): 4 (12% of 34)
# ETS2 (36g): 3 (8% of 36)
# MAX (49g): 15 (31% of 49)
# SMAD1 (35g): 4 (11% of 35)
# CREB3 (32g): 6 (19% of 32)
# GLI3 (29g): 2 (7% of 29)
# TIA1 (31g): 2 (6% of 31)
# TBR1 (29g): 4 (14% of 29)
# DBP (26g): 1 (4% of 26)
# CEBPZ (24g): 1 (4% of 24)
# NR2E1 (24g): 3 (12% of 24)
# MZF1 (23g): 5 (22% of 23)
# NR1D1 (25g): 4 (16% of 25)
# PKNOX2 (25g): 2 (8% of 25)
# SP1 (25g): 3 (12% of 25)
# SREBF2 (23g): 1 (4% of 23)
# JDP2 (24g): 1 (4% of 24)
# CUX1 (22g): 1 (5% of 22)
# EGR2 (22g): 4 (18% of 22)
# NA (18g): 2 (11% of 18)
# POU2F1 (19g): 5 (26% of 19)
# RELB (21g): 3 (14% of 21)
# ZEB1 (20g): 4 (20% of 20)
# ESRRA (19g): 1 (5% of 19)
# KLF5 (19g): 2 (11% of 19)
# RUNX1 (20g): 4 (20% of 20)
# E2F4 (15g): 1 (7% of 15)
# ERF (16g): 4 (25% of 16)
# MYNN (17g): 4 (24% of 17)
# PBX3 (16g): 3 (19% of 16)
# TEAD1 (16g): 1 (6% of 16)
# THRA (17g): 1 (6% of 17)
# FOXO6 (16g): 2 (12% of 16)
# GLIS2 (17g): 1 (6% of 17)
# HINFP (16g): 2 (12% of 16)
# IKZF2 (16g): 2 (12% of 16)
# TCF4 (14g): 1 (7% of 14)
# ZNF664 (17g): 2 (12% of 17)
# GTF2A1 (13g): 1 (8% of 13)
# SP3 (15g): 1 (7% of 15)
# MKX (14g): 4 (29% of 14)
# POU3F3 (15g): 1 (7% of 15)
# TCF7L2 (14g): 2 (14% of 14)
# NKX2-2 (13g): 1 (8% of 13)
# USF2 (14g): 1 (7% of 14)
# ZNF788 (14g): 4 (29% of 14)
# ACAA1 (13g): 1 (8% of 13)
# HMGA1 (13g): 1 (8% of 13)
# NR1H2 (13g): 3 (23% of 13)
# SP4 (12g): 1 (8% of 12)
# FOXN2 (12g): 1 (8% of 12)
# GTF2IRD1 (11g): 3 (27% of 11)
# EXO5 (11g): 3 (27% of 11)
# MAFK (11g): 3 (27% of 11)
# RFX7 (10g): 1 (10% of 10)
# SMAD3 (9g): 1 (11% of 9)
save(AUC_MH, file="int/cellsAUC_MH.RData")
AUC_HM <- AUCell.calcAUC(tfModulesH_M, cellsRankings_mouse, nCores=10)
# Using 10 cores.
# Genes in the gene sets NOT available in the dataset:
# Egr1 (781g): 24 (3% of 781)
# Egr3 (518g): 14 (3% of 518)
# Hlf (293g): 3 (1% of 293)
# Mga (221g): 7 (3% of 221)
# Zmat4 (188g): 2 (1% of 188)
# Klf13 (169g): 2 (1% of 169)
# Celf4 (120g): 2 (2% of 120)
# Nfe2l1 (81g): 2 (2% of 81)
# Hnf4a (87g): 17 (20% of 87)
# Esrrg (76g): 2 (3% of 76)
# Hnf4g (75g): 9 (12% of 75)
# Prdm16 (67g): 5 (7% of 67)
# Arid3a (67g): 6 (9% of 67)
# Egr4 (77g): 10 (13% of 77)
# Ppargc1a (64g): 2 (3% of 64)
# Vdr (59g): 3 (5% of 59)
# Melk (59g): 1 (2% of 59)
# Nkx2-5 (57g): 8 (14% of 57)
# Spi1 (60g): 4 (7% of 60)
# Dlx2 (58g): 2 (3% of 58)
# Mafb (57g): 2 (4% of 57)
# Nr2c2 (53g): 2 (4% of 53)
# Tef (65g): 13 (20% of 65)
# Ar (47g): 2 (4% of 47)
# Pgr (51g): 12 (24% of 51)
# Gtf3c2 (40g): 1 (2% of 40)
# Runx1 (38g): 1 (3% of 38)
# Clock (39g): 3 (8% of 39)
# Trp53 (43g): 10 (23% of 43)
# Rel (34g): 2 (6% of 34)
# Bcl6 (32g): 1 (3% of 32)
# Gata3 (36g): 3 (8% of 36)
# Nr1h3 (34g): 3 (9% of 34)
# Yy1 (28g): 1 (4% of 28)
# Irf2 (30g): 3 (10% of 30)
# Elk1 (26g): 1 (4% of 26)
# Gata2 (28g): 5 (18% of 28)
# Tead4 (26g): 1 (4% of 26)
# Gata5 (23g): 4 (17% of 23)
# Nfkb1 (23g): 1 (4% of 23)
# Gata4 (30g): 13 (43% of 30)
# Ppard (24g): 2 (8% of 24)
# Pou5f1 (23g): 2 (9% of 23)
# Foxj3 (23g): 1 (4% of 23)
# Tfcp2l1 (44g): 16 (36% of 44)
# Zfy1 (24g): 5 (21% of 24)
# Arnt (21g): 1 (5% of 21)
# Polr2a (21g): 1 (5% of 21)
# Rxrb (21g): 1 (5% of 21)
# Egr2 (32g): 9 (28% of 32)
# Elf4 (20g): 3 (15% of 20)
# Foxo3 (18g): 1 (6% of 18)
# Grhl2 (18g): 2 (11% of 18)
# Lhx4 (19g): 1 (5% of 19)
# Tbx21 (19g): 3 (16% of 19)
# Zfp777 (18g): 1 (6% of 18)
# Ctcfl (18g): 5 (28% of 18)
# Myb (18g): 1 (6% of 18)
# Pou2f1 (17g): 1 (6% of 17)
# Smad1 (17g): 1 (6% of 17)
# Snai2 (15g): 1 (7% of 15)
# Zfp354a (17g): 2 (12% of 17)
# Mecom (14g): 1 (7% of 14)
# Nfya (16g): 2 (12% of 16)
# Ssx9 (27g): 9 (33% of 27)
# Stat5a (13g): 1 (8% of 13)
# E2f2 (27g): 13 (48% of 27)
# Elf1 (16g): 1 (6% of 16)
# Srebf1 (16g): 2 (12% of 16)
# Dlx1 (13g): 1 (8% of 13)
# Esr2 (16g): 2 (12% of 16)
# Gfi1b (13g): 2 (15% of 13)
# Sirt6 (14g): 1 (7% of 14)
# Figla (9g): 2 (22% of 9)
save(AUC_HM, file="int/cellsAUC_HM.RData")
Mousemod <- data.frame(colnames(MouseAUC), colnames(AUC_MH))
colnames(Mousemod) <- c("Mouse", "Human homolog")
WriteXLS(Mousemod, ExcelFileName="int/MouseModules.xls")
Humanmod <- data.frame(colnames(HumanAUC), colnames(AUC_HM))
colnames(Humanmod) <- c("Human", "Mouse homolog")
WriteXLS(Humanmod, ExcelFileName="int/HumanModules.xls")
#4. Mouse modules plots
rownames(Mousemod) <- Mousemod[,1]
colorPal <- grDevices::colorRampPalette(c("lightgrey", "red","darkred"))
pdf("MouseModules.pdf", width=11,height=5)
par(mfrow=c(1,2))
for (name in rownames(Mousemod)){
cellColorMouse <- setNames(colorPal(5)[as.numeric(cut((MouseAUC[,name]), breaks=5, right=F,include.lowest=T))], rownames(MouseAUC))
colMouse <- adjustcolor(cellColorMouse[rownames(Mousetsne$Y)], alpha=0.5)
name_MH <- as.character(Mousemod[name, 2])
cellColorHuman <- setNames(colorPal(5)[as.numeric(cut((AUC_MH[,name_MH]), breaks=5, right=F,include.lowest=T))], rownames(AUC_MH))
colHuman <- adjustcolor(cellColorHuman[rownames(Humantsne$Y)], alpha=0.5)
plot(Mousetsne$Y, col=colMouse, pch=16, xlab="tsne1", ylab="tsne2", main=name)
plot(Humantsne$Y, col=colHuman, pch=16, xlab="tsne1", ylab="tsne2", main=name_MH)
}
dev.off()
rownames(Humanmod) <- Humanmod[,1]
pdf("HumanModules.pdf", width=11,height=5)
par(mfrow=c(1,2))
for (name in rownames(Humanmod) ){
if (sum(HumanAUC[,name]) != 0){
cellColorHuman <- setNames(colorPal(5)[as.numeric(cut((HumanAUC[,name]), breaks=5, right=F,include.lowest=T))], rownames(HumanAUC))
}
else{
cellColorHuman <- setNames(rep("grey", length(rownames(HumanAUC))), rownames(HumanAUC))
}
colHuman <- adjustcolor(cellColorHuman[rownames(Humantsne$Y)], alpha=0.5)
name_HM <- as.character(Humanmod[name, 2])
cellColorMouse <- setNames(colorPal(5)[as.numeric(cut((AUC_HM[,name_HM]), breaks=5, right=F,include.lowest=T))], rownames(AUC_HM))
colMouse <- adjustcolor(cellColorMouse[rownames(Mousetsne$Y)], alpha=0.5)
plot(Humantsne$Y, col=colHuman, pch=16, xlab="tsne1", ylab="tsne2", main=name)
plot(Mousetsne$Y, col=colMouse, pch=16, xlab="tsne1", ylab="tsne2", main=name_HM)
}
dev.off()
modules <- Humantfs[which(Humantfs %in% Homog_tfs_MH[,2])] #We can do the same in mouse
pdf("ModulesInBoth.pdf", width=11,height=11)
par(mfrow=c(2,2))
for (name in modules){
namex <- paste("^", name, sep="")
namex <- paste(namex, "")
Human_module <- colnames(HumanAUC)[grep(namex, colnames(HumanAUC))]
MH_module <- colnames(AUC_MH)[grep(namex, colnames(AUC_MH))]
name_m <- Homog_tfs_HM[name,2]
name_m <- paste("^", name_m, sep="")
name_m <- paste(name_m, "")
Mouse_module <- colnames(MouseAUC)[grep(name_m, colnames(MouseAUC))]
HM_module <- colnames(AUC_HM)[grep(name_m, colnames(AUC_HM))]
if (sum(HumanAUC[,Human_module]) != 0){
cellColorHuman <- setNames(colorPal(5)[as.numeric(cut((HumanAUC[,Human_module]), breaks=5, right=F,include.lowest=T))], rownames(HumanAUC))
}
else{
cellColorHuman <- setNames(rep("grey", length(rownames(HumanAUC))), rownames(HumanAUC))
}
colHuman <- adjustcolor(cellColorHuman[rownames(Humantsne$Y)], alpha=0.5)
cellColorMH <- setNames(colorPal(5)[as.numeric(cut((AUC_MH[,MH_module]), breaks=5, right=F,include.lowest=T))], rownames(AUC_MH))
colMH <- adjustcolor(cellColorMH[rownames(Humantsne$Y)], alpha=0.5)
cellColorMouse <- setNames(colorPal(5)[as.numeric(cut((MouseAUC[,Mouse_module]), breaks=5, right=F,include.lowest=T))], rownames(MouseAUC))
colMouse <- adjustcolor(cellColorMouse[rownames(Mousetsne$Y)], alpha=0.5)
cellColorHM <- setNames(colorPal(5)[as.numeric(cut((AUC_HM[,HM_module]), breaks=5, right=F,include.lowest=T))], rownames(AUC_HM))
colHM <- adjustcolor(cellColorHM[rownames(Mousetsne$Y)], alpha=0.5)
plot(Humantsne$Y, col=colHuman, pch=16, xlab="tsne1", ylab="tsne2", main=Human_module)
plot(Mousetsne$Y, col=colHM, pch=16, xlab="tsne1", ylab="tsne2", main=HM_module)
plot(Mousetsne$Y, col=colMouse, pch=16, xlab="tsne1", ylab="tsne2", main=Mouse_module)
plot(Humantsne$Y, col=colMH, pch=16, xlab="tsne1", ylab="tsne2", main=MH_module)
}
dev.off()
Hi @SeppeDeWinter,
Thank you for the prompt reply. I have three questions following from this regarding the gene conversion for pyscenic outputs and procedures followed for joint clustering. Question 3 is something I wished to test so I understand if this may not possible to troubleshoot.
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40">
| | Enrichment | Enrichment | Enrichment | Enrichment | Enrichment | Enrichment | Enrichment | Enrichment -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | | AUC | NES | MotifSimilarityQvalue | OrthologousIdentity | Annotation | Context | TargetGenes | RankAtMax TF | MotifID | | | | | | | | AU041133 | dbcorrdb__MYC__ENCSR000EGJ_1__m1 | 0.06134957 | 3.06712857 | 0.000225 | 0.47081 | motif similar to transfac_pro__M06039 ('V$ZNF878_02: ZNF878'; q-value = 0.000225) which is annotated for orthologous gene ENSG00000257446 in H. sapiens (identity = 47%) | frozenset({'activating', 'weight>75.0%', 'mm9-500bp-upstream-7species.mc9nr'}) | [('Utp11l', 0.4190139865647753), ('Ppfia4', 0.59622490938284), ('Sugp2', 4.864268360304388), ('Pabpc4', 0.8218154311808581), ('Gprc5c', 0.6436159200634183), ('Rasal3', 0.7505263082542397), ('B3galt2', 1.0929702612979697), ('Usp2', 0.6181841109820803), ('Nr2c2', 0.5166169646063524), ('Ceacam1', 2.6686498467913062), ('Znrf2', 0.4630347193433489), ('Prdx6', 1.3107060488146285), ('Hrsp12', 0.5976996293267264), ('Creld1', 0.4950311219269024), ('Atf1', 0.619279007557783), ('Poglut1', 0.9216056647216148), ('Akr1b3', 0.4407775175009705), ('Pcm1', 0.4845314890416012), ('Bsg', 0.5310128429183464), ('Hmga2', 0.549179397838494), ('Esr2', 0.4999150787651524), ('Rps27a', 0.5823538269371533), ('Akap12', 0.5700757677524456), ('Ier3', 10.064569864928544), ('Egr2', 2.797270011931476), ('Stx18', 0.5949288146950027), ('Kif20a', 1.2278466610463123), ('Iqgap2', 0.6870818592561624), ('Slc37a3', 2.022582946766236), ('Nup85', 1.486792037703781), ('Mtpap', 0.514939998424916), ('Thrap3', 0.43361467678054), ('Arl2', 0.8234712758195339), ('Acvr1b', 0.6375409967168627), ('Chchd5', 0.4983861522005255), ('Trmt61a', 0.5803355011023089), ('Pigv', 0.9641329034044874), ('Ddx17', 0.5778319656378248), ('Paip2', 0.4465254749155481), ('Rgs8', 0.8680721784498453), ('Atxn1l', 0.7457446405851063), ('Entpd1', 0.5684530490183239), ('Mgmt', 0.7636867967415881), ('Wsb1', 0.574940011678569), ('9130401M01Rik', 0.7038900187998567), ('Dohh', 1.5529429460529105), ('Nudt13', 0.5750153491612069), ('Pcnxl2', 0.8940997879634238), ('4933407L21Rik', 0.9370037182704052), ('Tram1', 0.4258458247668389), ('Erlin1', 1.2351225637266905), ('Tmem219', 0.6822851670138835), ('Lmo2', 0.5983497621285764), ('Dcaf8', 0.9124871026174316), ('Mdk', 0.7585350357234537)] | 3979 AU041133 | dbcorrdb__MAX__ENCSR000ECN_1__m1 | 0.06097656 | 3.03346899 | 0.000394 | 0.47081 | motif similar to transfac_pro__M06039 ('V$ZNF878_02: ZNF878'; q-value = 0.000394) which is annotated for orthologous gene ENSG00000257446 in H. sapiens (identity = 47%) | frozenset({'activating', 'weight>75.0%', 'mm9-500bp-upstream-7species.mc9nr'}) | [('Rasal3', 0.7505263082542397), ('Utp11l', 0.4190139865647753), ('Ppfia4', 0.59622490938284), ('Sugp2', 4.864268360304388), ('Prdx6', 1.3107060488146285), ('Ceacam1', 2.6686498467913062), ('Atf1', 0.619279007557783), ('B3galt2', 1.0929702612979697), ('Pabpc4', 0.8218154311808581), ('Ier3', 10.064569864928544), ('Akap12', 0.5700757677524456), ('Acy1', 1.056774664398637), ('Pcm1', 0.4845314890416012), ('Usp2', 0.6181841109820803), ('Znrf2', 0.4630347193433489), ('Creld1', 0.4950311219269024), ('Gprc5c', 0.6436159200634183), ('Hrsp12', 0.5976996293267264)] | 794
Hi,
I don't think I've come across any threads for this. I was wondering if there are any tutorials/protocols for cross-species clustering conducted in the manuscript. I wanted to test this for multiple datasets with the same species and cross-species as well.
Please correct me if I am wrong, for this type of analysis (based on the manuscript):
If this is the procedure, I was wondering if there are any functions for steps 3:
Thanks!