MGEScan-nonLTR error - Githubissues

jphruska commented 5 years ago

Hi Jeremy--

Many thanks for helping me out with the uploading of files to the Virtual Box. I've managed to figure that out.

I was doing a preliminary run for nonLTRs using MGEScan-nonLTR and received the following error:

"FATAL: Failed to open sequence database file /home/Jeremy/galaxy/tools/Pipeline/MGEScan/WORK/output/f/output1.pep Usage: hmm search [-options] Available options are: -h : help; print brief help on version"

Would you happen to have any insight on what might be causing this error?

Thanks Jack

JBerthelier commented 5 years ago

Dear Jack,

You're welcome.

Someone else reported this same issue to me:

"Fatal error: Exit code 1 () FATAL: Failed to open sequence database file /home/jeremy/galaxy/tools/Pipeline/MGEScan/WORK/output/f/output1.pep Usage: hmmsearch [-options] Available options are: -h : help; print brief help on version and usage -A : sets alignment output limit to best domain alignments -E : sets E value cutoff (globE) to <= x -T : sets T bit threshold (globT) to >= x -Z : sets Z (# seqs) for E-value calculation"

I also get the same "error message" when I use this tool with the Arabidopsis genome, but at the end of the process I still get the detected sequences in the output.

This message is probably a false error caused by the use of an older version of HMMER (see issue #4).

This "error" does not interfere with the tool. If there are no sequences in the output, it is possible that MGEScan was not able to detect LINE sequences in the genome assembly.
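One quick way to tell a harmless message from a failed run is to count the sequences in the result file. This is only a minimal sketch, assuming the result is a plain FASTA file; the function name and the file name in the usage comment are hypothetical and not part of PiRATE or MGEScan.

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines (lines starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))

# Hypothetical usage, with a made-up file name:
# n = count_fasta_records("mgescan_nonltr_result.fasta")
# print(f"{n} sequences detected")
```

A count of zero together with the message above would suggest that the run produced nothing, rather than that the message can be ignored.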

Best,

Jérémy

jphruska commented 5 years ago

Hi Jeremy--

Great, thanks for the clarification. I indeed did get an output, so it appears to be working alright.

All the best Jack

jphruska commented 5 years ago

Hi Jeremy--

Unfortunately I've run into another error this time with MGEScan. The only difference this time around is that I used a larger dataset (141.8 MB vs 16.1 MB the first time around).

This time, I get the following error message:

"Fatal error: Exit code 1 () FATAL: Failed to open sequence database file home/jeremy/galaxy/tools/Pipeline/MGEScan/WORK/output/output1.pep Usage: hmm search [-opotions] Available options are: -h :help; "

Additionally no output is produced this time, unlike the first time around.

Any idea as to what may be causing this result? It seems unlikely that it didn't detect any LINEs, given that we expected them to be the most common TE in this genome.

Best Jack

JBerthelier commented 5 years ago

Hi Jack,

Thanks for the report. Unfortunately, I do not have any idea how to fix the problem. I get the exact same "error" message with the Arabidopsis genome, but I still obtain results in the output. For the next version of PiRATE (the 2020 update), we will run some tests to figure out why this message appears, and test with different dataset sizes to try to fix this issue.

If the efficiency of the tool depends on the dataset size, you can try cutting the file in two and running the two parts separately. MGEScan's detection is not based on sequence repetition, so splitting the input should not affect the results.
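Splitting the file can be sketched as follows. This is a minimal example under stated assumptions, not part of PiRATE or MGEScan: it assumes the genome is a multi-record FASTA file, and the names `read_fasta_records`, `split_fasta` and the `genome_part` output prefix are made up for illustration. Records are never cut in the middle; each one goes to whichever part is currently smallest, so the parts end up roughly balanced.

```python
from pathlib import Path


def read_fasta_records(path):
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record = []
    with open(path) as handle:
        for line in handle:
            if not line.endswith("\n"):
                line += "\n"  # make sure the last line of the file is terminated
            if line.startswith(">") and record:
                yield "".join(record)
                record = []
            record.append(line)
    if record:
        yield "".join(record)


def split_fasta(path, n_parts=2, out_prefix="genome_part"):
    """Write whole records into n_parts files, filling the smallest part first."""
    out_paths = [Path(f"{out_prefix}{i + 1}.fasta") for i in range(n_parts)]
    sizes = [0] * n_parts
    handles = [open(p, "w") for p in out_paths]
    try:
        for record in read_fasta_records(path):
            i = sizes.index(min(sizes))  # index of the currently smallest part
            handles[i].write(record)
            sizes[i] += len(record)
    finally:
        for handle in handles:
            handle.close()
    return out_paths

# Hypothetical usage, with a made-up input file name:
# split_fasta("my_genome.fasta", n_parts=2)
```

Because the split happens at record boundaries, it cannot shrink an assembly that consists of a single very long sequence, and it is not confirmed that file size is the actual cause of the failure.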

Sorry that I cannot help you more,

Jérémy

jphruska commented 5 years ago

Hi Jeremy--

No problem, thanks for the help anyhow. Do you think it would be possible to run MGEScan outside of the virtual box and re-incorporate the output (if it runs) into PiRATE for further processing downstream (clustering, classification, etc.)?

Thanks Jack

jphruska commented 5 years ago

Hi Jeremy--

Just wanted to follow up on this. I got MGEScan to work by dividing up my 'genome' into two parts (54 and 88 MB in size, each). It looks like MGEScan is having an issue with larger files.

Best Jack

JBerthelier commented 5 years ago

Hi Jack,

Sorry for the delay.

Yes, you can run MGEScan-nonLTR on another Linux machine, then transfer the resulting FASTA file (with the detected sequences) into the VM and load it into PiRATE-Galaxy.

About the genome size: it's great that this trick (dividing the genome assembly) solved your issue. We will try to understand this problem and fix this error in the new version.

Thanks for your feedback,

Best,

lyy005 commented 5 years ago

Hi Jeremy, Just wanted to share that I ran into the same error. I tried dividing the genome but still have that error and didn't get any output.

Best

YY

rob123king commented 5 years ago

I'm getting the error below, so I will try dividing the genome to see if that works. I also tried to cat this file, but it is not present (the cat message is translated from French).

Fatal error: Exit code 1 ()

FATAL: Failed to open sequence database file /home/jeremy/galaxy/tools/Pipeline/MGEScan/WORK/output/f/output1.pep Usage: hmmsearch [-options] Available options are: -h : help; print brief help on version and usage -A : sets alignment output limit to best domain alignments -E : sets E value cutoff (globE) to <= x -T : sets T bit threshold (globT) to >= x -Z : sets Z (# seqs) for E-value calculation

cat: ////*.dna: No such file or directory

JBerthelier commented 5 years ago

Thank you, Dr King, for your message.

Please let me know if this trick also solves your problem. We will take this point into account for the new version.

Best regards,

JBerthelier / PiRATE

MGEScan-nonLTR error #8