Simon-Coetzee / motifBreakR

A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites.

motifbreakR error SNPs from BED formatted file #41

Open Seleit opened 2 years ago

Seleit commented 2 years ago

Hello,

I am trying to use motifbreakR to annotate some SNPs for medaka fish. I am using a custom, already forged BSgenome for medaka, which loads fine:

library(BSgenome.medaka.ens94)

read.table(snps.bedfile.nors, header = FALSE)
  V1       V2       V3                V4 V5 V6
1  3 31880001 31885000 chr3:31882398:T:A  0  +
2  3 31880001 31885000 chr3:31882435:C:T  0  +
3  3 31985001 31990000 chr3:31987045:G:A  0  +
4  3 32005001 32010000 chr3:32005290:C:T  0  +
5  3 32005001 32010000 chr3:32005319:C:A  0  +

snps.mb.frombed <- snps.from.file(file = snps.bedfile.nors, search.genome = BSgenome.medaka.ens94, format = "bed")
Error in mapSeqlevels(sequence, seqlevelsStyle(search.genome)) :
  the supplied seqlevels style must be a single string
In addition: Warning message:
In snps.from.file(file = snps.bedfile.nors, search.genome = BSgenome.medaka.ens94, :
  User selected reference allele differs from the sequence in BSgenome.medaka.ens94 continuing with genome specified reference allels there are 46872 differences

help please! :)

Thank you so much Ali

Simon-Coetzee commented 2 years ago

I believe the problem here is that each interval in your BED file spans 5000 bp, whereas it should cover only the SNV itself. For example, your first entry should be something like:

chr3 31882397 31882398 chr3:31882398:T:A

Make sure the file doesn't have a header and that all the chromosomes are named consistently, either all like 3 or all like chr3, not a mix of the two. If that doesn't work, perhaps you can share a bit of your BED file and we can see what else the problem could be.
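The interval fix described above can be sketched in base R. This is only an illustration, not part of motifbreakR: the helper name fix_bed_intervals is hypothetical, and it assumes every name in column 4 has the form chrom:pos:ref:alt with a 1-based position, as in the rows shown earlier. It also assumes that taking the chromosome from the name is the naming you want; adjust it if your genome uses a different style.

```r
# Hypothetical helper (not a motifbreakR function): rebuild 1-bp BED
# intervals from SNP names of the form chrom:pos:ref:alt.
fix_bed_intervals <- function(bed) {
  parts <- strsplit(as.character(bed$V4), ":", fixed = TRUE)
  chrom <- vapply(parts, `[`, character(1), 1)
  pos   <- as.integer(vapply(parts, `[`, character(1), 2))
  data.frame(
    V1 = chrom,     # chromosome taken from the name, so both agree
    V2 = pos - 1L,  # BED start is 0-based
    V3 = pos,       # BED end is exclusive, so the interval is 1 bp wide
    V4 = bed$V4,
    V5 = bed$V5,
    V6 = bed$V6,
    stringsAsFactors = FALSE
  )
}

# Two rows copied from the table in the original post
bed <- data.frame(
  V1 = c(3, 3),
  V2 = c(31880001, 31880001),
  V3 = c(31885000, 31885000),
  V4 = c("chr3:31882398:T:A", "chr3:31882435:C:T"),
  V5 = c(0, 0),
  V6 = c("+", "+"),
  stringsAsFactors = FALSE
)

fixed <- fix_bed_intervals(bed)
print(fixed)
```

The result could then be written back out with write.table(fixed, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE) so that the file has no header.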

Seleit commented 2 years ago

Hello Simon,

Thanks a lot for your help. I attach the latest .bed file that I used (I changed the bins from 5000 bp to just 1 bp, the SNP location, and removed the header), but it still did not work.

Could it also be that the .bed file and the medaka BSgenome package are incompatible? (The medaka BSgenome is an older release, and the ATAC data is quite recent.) I am just speculating here. You can download the BSgenome package for medaka from here: http://tulab.genetics.ac.cn/medaka_omics/
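One way to probe the compatibility question is to compare the chromosome names in the BED file with the sequence names of the genome. The sketch below uses base R only. The helper check_chrom_names is hypothetical, and genome_chroms is a placeholder vector assumed for illustration; in practice the names would come from the genome object itself, for example via seqnames(BSgenome.medaka.ens94).

```r
# Hypothetical helper: report BED chromosome names that the genome lacks
# and whether the BED file mixes "chr"-prefixed and unprefixed names.
check_chrom_names <- function(bed_chroms, genome_chroms) {
  bed_chroms <- unique(as.character(bed_chroms))
  has_prefix <- grepl("^chr", bed_chroms)
  list(
    missing      = setdiff(bed_chroms, genome_chroms),
    mixed_styles = any(has_prefix) && !all(has_prefix)
  )
}

# Placeholder genome names, assumed here; replace with the real seqnames
genome_chroms <- as.character(1:24)

result <- check_chrom_names(c("3", "chr3", "3"), genome_chroms)
print(result)
```

An empty missing element and mixed_styles equal to FALSE would suggest the naming is consistent, although this alone does not rule out other differences between genome releases.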

snps_intercept_motifbreakrnoheader.bed.zip