grunwaldlab / effectR

An R package to call oomycete effectors

custom pattern not giving domain positions #22

Open savipuray opened 5 years ago

savipuray commented 5 years ago

Hi,

EffectR is a brilliant package for identifying effectors. I just had a question regarding the identification of proteins with custom motifs. For some reason, when I use a custom pattern, the resulting table does not give the domain numbers and positions. I was wondering if there is some way to fix this. Thank you!

Best, Savithri

Tabima commented 5 years ago

Hi @savipuray, thanks for the comments!

Have you included the custom motif of interest in the effector.summary function?

Here's an excerpt of how to do the effector summary from the help page of the effector.summary function:

# Custom motifs
reg.pat <- "^\\w{50,60}[w,v]"
REGEX <- regex.search(sequence = ORF, motif = "custom", reg.pat = reg.pat)
candidate.custom <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
effector.summary(candidate.custom, motif = "custom", reg.pat = reg.pat)

Hope this helps!

Javier.

savipuray commented 5 years ago

Hi Javier,

Thanks for your message. I tried what you suggested. For example, I ran the following commands with your test data:

library(effectR)
fasta.file <- system.file("extdata", "test_infestans.fasta", package = "effectR")
ORF <- seqinr::read.fasta(fasta.file)
reg.pat <- "^\\w{50,60}[w,v]"
REGEX <- regex.search(sequence = ORF, motif = "custom", reg.pat = reg.pat)
candidate.custom <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
custom.effectors <- effector.summary(candidate.custom, motif = "custom", reg.pat = reg.pat)
write.table(custom.effectors[["motif.table"]], file = "example.xls", append = FALSE, quote = FALSE, sep = "\t ", eol = "\r", na = "NA", dec = ".", row.names = TRUE, col.names = TRUE, qmethod = c("escape", "double"))

The custom motif position in the resulting table is a default 1 (file attached: example.txt).
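One possible explanation for the constant 1, stated as an assumption and not as a description of effectR's internals: a pattern anchored with `^` can only match from the first residue, so any position taken from the start of the whole match will be 1. The base R sketch below shows this with `regexpr`; `toy.seq` is a made-up sequence, not part of the test data.

```r
# Hypothetical toy sequence: 55 "a", one "w", then 10 "g" (66 residues in total)
toy.seq <- paste0(strrep("a", 55), "w", strrep("g", 10))

# Same custom pattern as in the commands above; "^" pins the match to residue 1
reg.pat <- "^\\w{50,60}[w,v]"

# regexpr() reports where the whole pattern starts and how long the match is
hit <- regexpr(reg.pat, toy.seq, perl = TRUE)

match.start  <- as.integer(hit)            # start of the whole match
match.length <- attr(hit, "match.length")  # number of residues matched

print(c(start = match.start, length = match.length))
```

If the summary table reports the start of the whole match, an anchored pattern would give 1 for every sequence that matches, whatever comes after the anchor.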

I tried the same with my data with the REGEX pattern "^\w{10,40}\w{1,96}Q\wLR\w{1,40}[ED][ED][RK]". It still gives '1' as the motif position and does not separate the positions and numbers of the QXLR and EER-like motifs like it does for the "RxLR" function.
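As a possible workaround outside effectR, the two sub-motifs could be located separately with base R. This is only a sketch: `motif.positions` is a hypothetical helper (not an effectR function), `toy.seq` is a made-up sequence, and the sub-patterns are simply the QxLR and EER-like parts of the pattern quoted above.

```r
# Hypothetical helper (not part of effectR): 1-based start positions of every
# non-overlapping occurrence of `sub.pat` in a single sequence string
motif.positions <- function(sequence, sub.pat) {
  hits <- gregexpr(sub.pat, sequence, perl = TRUE)[[1]]
  if (hits[1] == -1) integer(0) else as.integer(hits)
}

# Made-up sequence containing one QxLR-like and one EER-like motif
toy.seq <- "MKLSAVTQALRGGSPTDEERAAK"

qxlr.pos <- motif.positions(toy.seq, "Q\\wLR")        # QxLR-like motif
eer.pos  <- motif.positions(toy.seq, "[ED][ED][RK]")  # EER-like motif

# One row per motif: how many hits were found and where they start
motif.table <- data.frame(
  motif    = c("QxLR", "EER"),
  number   = c(length(qxlr.pos), length(eer.pos)),
  position = c(paste(qxlr.pos, collapse = ","), paste(eer.pos, collapse = ","))
)
print(motif.table)
```

Because the sub-patterns carry no `^` anchor, each one reports its own start position instead of the start of the full match.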

Sorry for the long message, Savithri