oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Error with -nonTGCA option #10

Closed by kaylahardwick 6 years ago

kaylahardwick commented 6 years ago

Hello,

I'm trying to run LTR_retriever with an LTRharvest output file, where I ran LTRharvest without the -motif TGCA option.

Here's my code:

LTR_retriever -genome ./PEP_scaffolder_300bp.fasta -nonTGCA ./PEP_scaffolder_300bp.ltrharvest.out -threads 25

and here's the error I'm getting:

##########################
LTR_retriever v1.6
##########################

Contributors: Shujun Ou, Ning Jiang

Please cite: S. Ou and N. Jiang (2017) LTR_retriever: a highly accurate and sensitive program for identification of long terminal-repeat retrotransposons. Plant Physiology, pp.01310.2017; DOI: 10.1104/pp.17.01310

Parameters: -genome ./PEP_scaffolder_300bp.fasta -nonTGCA ./PEP_scaffolder_300bp.ltrharvest.out -threads 25

Mon Jan 29 11:26:10 PST 2018 Dependency checking: All passed!
Mon Jan 29 11:26:40 PST 2018 The longest sequence ID in the genome contains 23 characters, which is longer than the limit (15) Trying to reformat seq IDs... Attempt 1...
Mon Jan 29 11:26:46 PST 2018 Seq ID conversion successful!

Mon Jan 29 11:26:46 PST 2018 Start to convert inputs...
grep: ./PEP_scaffolder_300bp.fasta.mod.retriever.scn: No such file or directory
Argument "" isn't numeric in numeric gt (>) at /mnt/lfs2/schaack/src/LTR_retriever/LTR_retriever line 327.

ERROR: No candidate is found in the file(s) you specified.

From a cursory glance at your Perl code, it seems this error only comes up when processing the -inharvest file. Can you run LTR_retriever with just a -nonTGCA file for the LTR candidates, or is the -inharvest file required?

Thanks!

oushujun commented 6 years ago

Hi Kayla,

Thank you for using LTR_retriever!

To answer your question: yes, you need at least one regular input (-inharvest, -infinder, or -inmgescan) so that the program can search for regular LTRs first. Identification of non-TGCA LTRs is an add-on function that tries to find additional LTRs from a relaxed input (-nonTGCA). In fact, the regular input already yields most of the non-TGCA LTRs (along with ~99 times more TGCA LTRs, of course), because LTR_retriever tries to identify the correct motif from the inputs, so both TGCA and non-TGCA LTRs will be found. In practice, providing the -nonTGCA input helps find a few more of the non-TGCA kind.

Hope this helps!

Best, Shujun
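
The rule in the reply above (at least one of -inharvest, -infinder, or -inmgescan, with -nonTGCA as an optional add-on) can be checked before launching a long run. The following is only a minimal sketch: the helper names has_regular_input and run_ltr_retriever are hypothetical and not part of LTR_retriever, and the check only looks at which flags appear in the argument list, not at whether the files exist.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; NOT part of LTR_retriever itself.
# Returns 0 only if the argument list contains at least one of the
# regular candidate inputs named in the reply above.
has_regular_input() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            -inharvest|-infinder|-inmgescan) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical wrapper: refuse to start when only a relaxed (-nonTGCA)
# input is given; otherwise pass every argument through unchanged.
run_ltr_retriever() {
    if ! has_regular_input "$@"; then
        echo "refusing to run: add -inharvest, -infinder, or -inmgescan" >&2
        return 2
    fi
    LTR_retriever "$@"
}
```

With this sketch, the command from the original report (only -genome, -nonTGCA, and -threads) would be rejected immediately with status 2, while a call such as run_ltr_retriever -genome genome.fa -inharvest genome.harvest.scn -nonTGCA genome.nonTGCA.scn -threads 25 would be passed through to LTR_retriever. The file names in that call are placeholders.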

kaylahardwick commented 6 years ago

Great, thanks for your response! I'll run LTRharvest with the -motif TGCA option, and then try running LTR_retriever. I also wanted to mention that I found all the information in your manual about the structure of LTRs super helpful. Thanks again!