Closed Goultard59 closed 1 year ago
If you want to reproduce Figure 6 with gkmSVM and gkmexplain importance scores, you will need to repeat the model training and gkmexplain scoring. There are several scripts used for this which are available in 'scScalpChromatin/GWAS/gkmSVM/'. In general you need to first generate null training sequences with 'gen_null_seqs.R', train gkmSVM models with 'gkmsvm_train.sh', generate SNP ref/alt sequences using 'gen_snp_seqs.R', use gkmexplain to generate base-resolution scores 'gkmsvm_explain_snps.sh', reformat scores for plotting with 'score_gkmexplain_snp_centered.R', and finally plot results with 'link_genes_gkmexplain.R'.
If you just want to plot sequence logos of particular motifs (will have no bearing on your actual data), you can use the seqlogo package and cisBP motifs like so:
# Plot all seqlogos
library(chromVARmotifs)
library(seqLogo)
library(ggseqlogo)
library(ggplot2)
library(dplyr)
library(tidyr)
# Get all cisBP motifs
data("human_pwms_v2")
cisBP_motifs <- human_pwms_v2
plotDir <- "/my/output/directory"
cisBP_dir <- paste0(plotDir, "/cisBP_human")
dir.create(cisBP_dir, showWarnings = FALSE, recursive = TRUE)
revcomp <- c(A="T",G="C",C="G",T="A")
# cisBP SeqLogos:
for(i in seq_along(cisBP_motifs)){
mt <- cisBP_motifs[[i]]
motifName <- mt@name
message(sprintf("Plotting motif %s, %s of %s", motifName, i, length(cisBP_motifs)))
# To convert PWMs back to position frequency matrices, do the reverse of what was described here:
# https://github.com/GreenleafLab/chromVARmotifs
pwm <- makePWM(exp(mt@profileMatrix)*0.25)@pwm
width <- ncol(pwm)*0.5 + 2
pdf(paste0(cisBP_dir, sprintf("/%s_seqLogo.pdf", motifName)), width=width, height=4)
# Forward
p <- (
ggseqlogo(pwm, seq_type="dna", method='bits')
+ ggtitle(motifName)
)
print(p)
# Reverse complement
rcpwm <- pwm[,c(ncol(pwm):1)]
rownames(rcpwm) <- revcomp[rownames(rcpwm)]
p <- (
ggseqlogo(rcpwm[rownames(pwm),], seq_type="dna", method='bits')
+ ggtitle(paste0(motifName, "-rev"))
)
print(p)
dev.off()
}
Hi,
I would like to reproduce Figure 6 of your paper with the visualisation of a motif in a particular peak.
Thanks for your helps, Adrien Dufour.