Closed chenghongdeng closed 1 year ago
Hey, yeah, I think this is just confusion about the function input. ORFik uses a case-sensitive start codon search, so what you describe cannot happen.
startCodon = "atg" and startCodon = startDefinition(1) will never give the same answer.
So if you want only ORFs that start with an uppercase ATG, do: findORFsFasta("selected.fa", startCodon = "ATG")
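To make the case-sensitivity point concrete, here is a minimal base-R sketch. It does not call ORFik at all; it only illustrates how a case-sensitive pattern search behaves on an uppercase sequence. The toy sequence and the helper `find_hits()` are made up for illustration, and the multi-codon pattern "ATG|TTG|CTG" is an assumption about what startDefinition(1) expands to (print startDefinition(1) in your own session to check).

```r
# Toy uppercase sequence, invented for this illustration only
toy <- "CCCATGAAATAACTGAAATGA"

# Hypothetical helper: start positions of all case-sensitive matches of `pattern`
find_hits <- function(pattern, x) {
  m <- gregexpr(pattern, x)[[1]]      # gregexpr is case-sensitive by default
  if (m[1] == -1) integer(0) else as.integer(m)
}

find_hits("ATG", toy)          # 4 18     (uppercase ATG only)
find_hits("atg", toy)          # integer(0), lowercase never matches uppercase
find_hits("ATG|TTG|CTG", toy)  # 4 13 18  (assumed multi-codon pattern also picks up CTG)
```

If the real search behaves the same way, a lowercase "atg" should find nothing in an uppercase sequence, while a pattern that lists CTG will report CTG starts as well.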
Let me know if that gives you what you want :)
Hi,
I am trying to run findORFsFasta() on a local fasta file.
I loaded my fasta file into a data frame:
    fastaFile <- readDNAStringSet("selected.fa")
    seq_name = names(fastaFile)
    sequence = paste(fastaFile)
    df <- data.frame(seq_name, sequence)
    seq <- DNAStringSet(df$sequence)
    names(seq) <- df$seq_name
Then I ran the findORFsFasta() function with the following command:
    orfs <- findORFsFasta(
      seq,
      startCodon = 'atg',
      stopCodon = "TAA|TAG|TGA",
      # https://rdrr.io/bioc/ORFik/src/R/find_ORFs.R
      longestORF = TRUE,
      minimumLength = 0,
      is.circular = FALSE
    )
I also tried to specify the start codon and stop codon using the following arguments:
    startCodon = startDefinition(1)
    stopCodon = stopDefinition(1)
Both ways give me the same output. Taking a close look at my output file, I find that it recognizes not only ATG as a start codon, but also CTG. The sequences highlighted in color are some of the ORFs identified by findORFsFasta().
I am really confused right now and not sure how to solve this problem. Thanks in advance, Chenghong