Closed nbutyrate closed 5 years ago
Hi. This error usually appears when the input file cannot be found. Is the HSM6XRQW_contigs_pro.fasta file in the same folder you are working in?
Another solution would be to use the shiny.effectR() function. It uses a graphical user interface for the prediction of candidate effectors.
Yes, the file is in the folder, as it's getting picked up as 'fasta.file'.
Is it possible to add a custom motif in the Shiny app?
Creating HMM profile
Working... done.
Pressed and indexed 1 HMMs (1 names).
Models pressed into binary file: hmmbuild.hmm.h3m
SSI index for binary model file: hmmbuild.hmm.h3i
Profiles (MSV part) pressed into: hmmbuild.hmm.h3f
Profiles (remainder) pressed into: hmmbuild.hmm.h3p
HMM profile created.
Starting HMM searches
Error: Failed to open sequence file HSM6XRQW_contigs_pro.fasta for reading
hmmsearch finished!
Error in hmm.search(original.seq = "HSM6XRQW_contigs_pro.fasta", regex.seq = REGEX) : HMM failed, please supply a valid absolute path to ORFs
Thanks for the quick reply. We are not planning on adding custom effector searches to the shiny app in the near future.
OK, the problem is that no absolute path was given for the HSM6XRQW_contigs_pro.fasta file. We require an absolute path so that HMMER runs are more reproducible.
Something that you can do is:
fasta.file <- "HSM6XRQW_contigs_pro.fasta"
fasta.file <- file.path(fasta.file)
ORF <- seqinr::read.fasta(fasta.file)
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
candidate.rxlr <- hmm.search(original.seq = fasta.file, motif = "custom", reg.pat = "PAAR")
The second line of code should provide you with the absolute path to the FASTA file, and then HMMER should be able to find it. Let me know if this works.
Thanks for the help. I tried this, and here is the current status:
fasta.file <- "HSM6XRQW_contigs_pro.fasta"
fasta.file <- file.path(fasta.file)
ORF <- seqinr::read.fasta(fasta.file)
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
candidate.rxlr <- hmm.search(original.seq = fasta.file, motif = "custom", reg.pat = "PAAR")
Error in hmm.search(original.seq = fasta.file, motif = "custom", reg.pat = "PAAR") : unused arguments (motif = "custom", reg.pat = "PAAR")
I'm sorry, I made a mistake in my last message (I copied the wrong test code). Replace line 5 with:
candidate.rxlr <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
I tried it
fasta.file <- "HSM6XRQW_contigs_pro.fasta"
fasta.file <- file.path(fasta.file)
ORF <- seqinr::read.fasta(fasta.file)
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
candidate.rxlr <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
No alignment file is provided. Starting alignment with MAFFT.
Starting MAFFT alignment.
Executing MAFFT
Please be patient
MAFFT alignment finished!
Starting HMM
Creating HMM profile
Working... done.
Pressed and indexed 1 HMMs (1 names).
Models pressed into binary file: hmmbuild.hmm.h3m
SSI index for binary model file: hmmbuild.hmm.h3i
Profiles (MSV part) pressed into: hmmbuild.hmm.h3f
Profiles (remainder) pressed into: hmmbuild.hmm.h3p
HMM profile created.
Starting HMM searches
Error: Failed to open sequence file HSM6XRQW_contigs_pro.fasta for reading
hmmsearch finished!
Error in hmm.search(original.seq = fasta.file, regex.seq = REGEX) : HMM failed, please supply a valid absolute path to ORFs
Can you show me what fasta.file prints at the prompt, please?
> fasta.file
[1] "HSM6XRQW_contigs_pro.fasta"
Alright, you still don't have the absolute path of the file. I thought file.path() would work.
Try this:
fasta.file <- "HSM6XRQW_contigs_pro.fasta"
fasta.file <- normalizePath(fasta.file)
ORF <- seqinr::read.fasta(fasta.file)
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
candidate.rxlr <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
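For anyone wondering why this version behaves differently from the file.path() attempt, here is a minimal sketch in base R. It assumes a throwaway file named example_pro.fasta (a hypothetical stand-in, written to a temporary directory) rather than the real data. file.path() only joins path components with a separator, so a single relative name comes back unchanged, while normalizePath() resolves the name against the current working directory.

```r
# Work in a temporary directory so nothing in a real project is touched.
old.wd <- setwd(tempdir())

# Hypothetical stand-in for the real FASTA file.
writeLines(c(">seq1", "MKPAARLV"), "example_pro.fasta")
fasta.file <- "example_pro.fasta"

# file.path() only joins its arguments; a single relative name is returned as-is.
joined <- file.path(fasta.file)

# normalizePath() resolves the name against the working directory.
# mustWork = TRUE turns a missing file into an immediate error.
resolved <- normalizePath(fasta.file, mustWork = TRUE)

print(joined)
print(resolved)

# Restore the original working directory.
setwd(old.wd)
```

Using mustWork = TRUE is a design choice here: a wrong file name then fails at this line instead of later inside the HMMER step.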
Success! Just one thing to confirm. The result table looks like this:
$motif.table
   Sequence.ID RxLR.number RxLR.position. EER.number EER.position
1 k105_2323_55           2      1005,1292          1          743
Just to confirm: we used PAAR, but the columns in the table are labeled as RxLR. Is that expected?
Could you please show me how to modify the commands if I need to identify something like [S/T]xExPx[I/V]?
Awesome, glad to hear it worked.
Remember to change motif to "custom" and reg.pat to your regular expression in your effector.summary command. In your case, it would be something like:
effector.summary(candidate.rxlr, motif = "custom", reg.pat = "[st].e.p.[iv]")
That regex will match sequences containing a motif that starts with S or T, followed by any letter, an E, any letter, a P, any letter, and either an I or a V. You can find more info on regular expressions in R here and here
Best of luck!
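To check a custom pattern before passing it to the package, it can be tried on a few toy strings with base R alone. The sketch below is only an illustration: the sequences are made up, and it uses grepl() and regexpr() directly rather than any effectR function. Two details worth knowing: "." matches any single character (not only letters), and a comma inside square brackets is a literal comma, so writing alternatives as [s,t] also matches ",".

```r
# Pattern for [S/T]xExPx[I/V]: a character class lists its members with no separators.
motif.pattern <- "[ST].E.P.[IV]"

# Hypothetical toy sequences, not taken from the data in this thread.
toy.seqs <- c(hit = "MKSAEAPLIQ", miss = "MKAAEAPLIQ")

# ignore.case = TRUE lets the same pattern match lower-case sequences as well.
has.motif <- grepl(motif.pattern, toy.seqs, ignore.case = TRUE)
print(has.motif)

# Position of the first match in the sequence that contains the motif.
first.pos <- as.integer(regexpr(motif.pattern, toy.seqs[["hit"]], ignore.case = TRUE))
print(first.pos)

# A comma inside brackets is a literal comma, so "[s,t]" also matches ",".
comma.matches <- grepl("[s,t]", ",")
print(comma.matches)
```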
Hi, just adding here since I have a similar query. I am running a trial with a bacterial AA FASTA file (GCA_001766235.1_ASM176623v1_protein.faa; later I will replace this with an oomycete AA FASTA file).
I have questions in these steps:
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
# Q: If you are only looking for RxLR and CRN effectors, do you need to provide this extra info (motif = "custom", reg.pat = "PAAR"), or do we exclude that part or replace PAAR with something else?
candidate.rxlr <- hmm.search(original.seq = fasta.file, regex.seq = REGEX)
#OK
effector.summary(candidate.rxlr)
#Q: Do we need to provide more info if we are only looking for RxLR and CRN effectors?
Hi Palc,
Q: If you are only looking for RxLR and CRN effectors, do you need to provide this extra info (motif = "custom", reg.pat = "PAAR"), or do we exclude that part or replace PAAR with something else?
No, just run it with the included "CRN" or "RxLR" options.
Q: Do we need to provide more info if we are only looking for RxLR and CRN effectors?
Same answer as before.
Thanks.
My contig file has 5000 sequences.
For RxLR and CRN effectors, I did the following
REGEX <- regex.search(ORF, motif='RxLR')
REGEX2 <- regex.search(ORF, motif='CRN')
For RxLR, it worked fine
candidate.rxlr <- hmm.search(original.seq = fasta.file, regex.seq = REGEX, num.threads = 16)
But for CRN, I get an error saying that at least 4 sequences are needed for the HMM step, and I am not sure why.
candidate.crn <- hmm.search(original.seq = fasta.file, regex.seq = REGEX2, num.threads = 16)
Error in hmm.search(original.seq = fasta.file, regex.seq = REGEX2, num.threads = 16) : Not enough sequences for HMM step. At least 4 sequences are required.
Thanks for the info.
How many sequences are reported for the REGEX2 object? If you have fewer than 4 sequences, then you cannot build the alignment via HMMER.
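A quick way to answer that question from the R console is sketched below. The regex.hits object is a hypothetical stand-in for a regex.search() result (a small named list of made-up sequences), and the threshold of 4 is taken from the error message quoted in this thread. The distinct count is only an extra diagnostic; whether the package treats duplicate sequences specially is not stated here.

```r
# Hypothetical stand-in for a regex.search() result: a named list of sequences.
regex.hits <- list(
  seq1 = "MKLFLEAKR",
  seq2 = "MRLFLEAKS",
  seq3 = "MKLFLEAKR"
)

# Threshold quoted in the error message in this thread.
min.required <- 4

# Collapse each entry to one string, so character vectors of residues work too.
seq.strings <- vapply(regex.hits, function(x) paste(x, collapse = ""), character(1))

n.total <- length(regex.hits)
n.distinct <- length(unique(seq.strings))
enough <- n.total >= min.required

cat("sequences:", n.total, "distinct:", n.distinct, "enough:", enough, "\n")
```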
It seems like REGEX2 has 157 sequences.
> length(REGEX)
[1] 20
> length(REGEX2)
[1] 157
Thanks for the info. Would you mind sharing a subset of your data for me to reproduce the error and find a solution? You can send it to my email.
I have sent you the file via email. Thanks for looking into it.
Hi, I am trying to use effectR and am getting the following error:
library(effectR)
fasta.file <- "HSM6XRQW_contigs_pro.fasta"
ORF <- seqinr::read.fasta(fasta.file)
REGEX <- regex.search(ORF, motif = "custom", reg.pat = "PAAR")
candidate.paar <- hmm.search(original.seq = "HSM6XRQW_contigs_pro.fasta", regex.seq = REGEX)
No alignment file is provided. Starting alignment with MAFFT.
Starting MAFFT alignment.
Executing MAFFT
Please be patient
MAFFT alignment finished!
Starting HMM
Creating HMM profile
Working... done.
Pressed and indexed 1 HMMs (1 names).
Models pressed into binary file: hmmbuild.hmm.h3m
SSI index for binary model file: hmmbuild.hmm.h3i
Profiles (MSV part) pressed into: hmmbuild.hmm.h3f
Profiles (remainder) pressed into: hmmbuild.hmm.h3p
HMM profile created.
Starting HMM searches
Error: Failed to open sequence file HSM6XRQW_contigs_pro.fasta for reading
hmmsearch finished!
Error in hmm.search(original.seq = "HSM6XRQW_contigs_pro.fasta", regex.seq = REGEX) : HMM failed, please supply a valid absolute path to ORFs
I have tried using fasta.file instead of the original file name, but I still get the same error.