Closed by Neato-Nick 5 years ago
To see the new behavior in the pull request, put the test_infestans.fasta
dataset in your working directory and run the following:
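```r
library(seqinr)
library(effectR)

# Relative path to the copy in the working directory, absolute path to the copy installed with effectR
relative_path <- "test_infestans.fasta"
absolute_path <- system.file("extdata", "test_infestans.fasta", package = "effectR")

# Read the sequences and extract the regex candidates
data <- read.fasta(relative_path)
regex <- regex.search(data)

# Run the HMM step with each kind of path, then once more saving the alignment
hmm <- hmm.search(original.seq = absolute_path, regex.seq = regex)
hmm <- hmm.search(original.seq = relative_path, regex.seq = regex)
hmm <- hmm.search(original.seq = relative_path, regex.seq = regex, save.alignment = TRUE)
```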
Edit: just to be clear, I implemented solution 2 as I described above
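For anyone who hasn't read the pull request, the general shape of solution 2 can be sketched roughly like this. It is only an illustration of the "warn and return what finished" pattern in base R, not the code in the pull request: `run_two_stage`, `align_step` and `search_step` are hypothetical names standing in for the MAFFT and hmmsearch stages, and the `Alignment`/`HMM` element names are borrowed from the issue text.

```r
# Hypothetical sketch, NOT effectR code: keep the finished alignment
# when the later search stage fails.
run_two_stage <- function(align_step, search_step) {
  # Stage 1: an error here still stops everything, since there is nothing to save yet.
  alignment <- align_step()

  # Stage 2: on failure, downgrade the error to a warning and return the partial result.
  tryCatch(
    list(Alignment = alignment, HMM = search_step(alignment)),
    error = function(e) {
      warning("Search step failed, returning the alignment only: ",
              conditionMessage(e), call. = FALSE)
      list(Alignment = alignment)
    }
  )
}

# Toy usage: the search stage fails, but the alignment is still returned.
partial <- suppressWarnings(run_two_stage(
  align_step  = function() c("seq1", "seq2"),
  search_step = function(aln) stop("cannot open sequence file")
))
names(partial)  # only the Alignment element is present
```

The point of the design is that the expensive first stage is never thrown away because of a mistake that only affects the second stage.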
Good evening sir, this is Ramakrishna. Regarding the statement that the high number of effector proteins predicted in the HMM step is a result of the low thresholds used by the package in order to obtain as many candidate effectors as possible: what threshold have you used in this package, and what does "low threshold" mean? One more question: how did you separate the non-redundant and redundant candidates?
@Ramakrishna0007 please open a new GitHub issue for this discussion, since it's unrelated to this one. We would be happy to answer your questions there.
Just started trying out the package. Very easy to use, I love it.
The first time I ran `hmm.search` on my data, MAFFT finished successfully but then hmmsearch errored out. This was my own fault: I gave a relative path instead of an absolute path to the `original.seq` parameter leading to my original ORFs. But because the `hmm.search` function returned an error, it didn't return an object that had the finished alignment of regex candidate RXLRs. The second time around running `hmm.search`, I gave the correct path to my original ORFs, and after another round of MAFFT I got my HMM candidate RXLRs.

Since MAFFT doesn't need the original ORFs file, it seems like users could save some time by not re-aligning their regex candidate RXLRs. A couple of ideas for solutions:
1) Before running MAFFT, validate that the path given to `original.seq` actually leads to a FASTA file. This is probably the easiest solution to help out people like me who make simple mistakes.
2) If the MAFFT alignment succeeds when running `hmm.search` but the actual HMM search fails, maybe give a warning and return an object with only the `Alignment` and `REGEX` elements (and obviously without the `HMM` and `HMM_Table` elements)?
3) Make another separate function to call MAFFT and save the result into an object or file for executing with `hmm.search`. I know I could do this in the terminal itself... I've obviously got MAFFT installed, so I could just run the alignment outside the R environment and use the import options you've already got. But it might be nice to integrate it all into the R session. I kind of like options 1 and 2 better than this one.
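
To make option 1 concrete, here is a rough base-R sketch of the kind of check I have in mind. `check_fasta_path` is a hypothetical helper, not something that exists in effectR, and the "first line starts with `>`" test is just a cheap heuristic I'm assuming is good enough to catch a wrong path (it would reject a FASTA file with leading blank lines).

```r
# Hypothetical helper, NOT part of effectR: fail fast on a bad path
# before any expensive alignment work starts.
check_fasta_path <- function(path) {
  if (!is.character(path) || length(path) != 1) {
    stop("Expected a single file path.", call. = FALSE)
  }
  if (!file.exists(path) || dir.exists(path)) {
    stop("No file found at '", path, "' (working directory: ", getwd(), ").",
         call. = FALSE)
  }
  # Heuristic only: assume a FASTA file starts with a '>' header line.
  first_line <- readLines(path, n = 1, warn = FALSE)
  if (length(first_line) == 0 || !startsWith(first_line, ">")) {
    stop("'", path, "' does not look like a FASTA file.", call. = FALSE)
  }
  # Return an absolute path so later steps do not depend on the working directory.
  normalizePath(path)
}

# Toy usage with a temporary FASTA file
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">orf1", "MRLAQVVVVIAASFLVATDA"), tmp)
checked <- check_fasta_path(tmp)
```

Returning the normalized path means a relative path that does exist gets converted to an absolute one up front, which might also avoid the mistake I made, assuming the failure really was caused by the working directory changing somewhere along the way.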