oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons; The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

No tbl file issue #149

Closed wrengs closed 1 year ago

wrengs commented 1 year ago

Dear @oushujun ,

Although I have previously run LTR_retriever successfully on multiple genomes, I am currently experiencing an issue similar to Issue #134.

All seems to run fine up to a point where the following error occurs: grep: Genome_ch01-12.fasta.tbl: No such file or directory

Afterwards, no LAI is generated. Attached is the nohup file 20230309_nohup_LTR_retriever.txt

I also noticed that a RepeatMasker-generated folder remained present, which was not the case for the genomes I ran previously.

Any suggestions how to move forward from here?

Many thanks in advance! Kind regards, Willem

oushujun commented 1 year ago

RepeatMasker did not finish annotating your genome with the LTR library. You may annotate it on your own and calculate LAI with the LAI script.

Alternatively, you may try giving fewer CPUs to LTR_retriever, since RepeatMasker tends to overutilize CPU and may trigger job termination.

Shujun
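The manual route suggested above could look roughly like the following sketch. The LAI arguments mirror the `perl LAI` call quoted later in this thread. The RepeatMasker parameter values and the library name `<genome>.LTRlib.fa` are assumptions, so check them against the LTR_retriever README and your own output files. The `run` helper is hypothetical and only prints the commands unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Sketch: annotate the genome with the LTR library by hand, then compute LAI.
# File names are assumptions based on the commands quoted in this thread.
genome="${GENOME:-genome_ch01-12.fasta}"
lib="${LIB:-$genome.LTRlib.fa}"   # assumed name of the library written by LTR_retriever
threads="${THREADS:-4}"           # deliberately low to avoid CPU overutilization

# Hypothetical helper: print the command unless RUN=1 is set in the environment.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Step 1: RepeatMasker writes $genome.out and $genome.tbl next to the genome.
# The parameter values here are assumptions, not confirmed in this thread.
run RepeatMasker -e ncbi -pa "$threads" -q -no_is -norna -nolow -div 40 \
  -lib "$lib" -cutoff 225 "$genome"

# Step 2: LAI with the same style of arguments seen in the process list.
run LAI -genome "$genome" -intact "$genome.pass.list" -all "$genome.out" \
  -t "$threads" -q
```

Running the script as-is only prints the two commands for review. Setting `RUN=1` executes them, assuming `RepeatMasker` and `LAI` are on `PATH`.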

wrengs commented 1 year ago

Dear Shujun,

Thank you for the reply. I am currently re-running LTR_retriever with fewer CPUs. It has been running for 1-2 days and now seems to be stuck at blastn (STAT = Sl):

blastn -word_size 20 -outfmt 6 -evalue 0.0001 -num_alignments 10 -num_threads 8 -query genome_ch01-12.fasta.out.20000.LAI.LTR.fa -db genome_ch01-12.fasta.out.20000.LAI.LTR.fa -out genome_ch01-12.fasta.out.20000.LAI.LTR.ava.out

The latest file produced is genome_ch01-12.fasta.out.20000.LAI.LTR.ava.out

Other parallel commands are visible with STAT = S:

perl LTR_retriever -genome genome_ch01-12.fasta -inharvest genome.fa.rawLTR.scn -threads 8
perl LAI -genome genome_ch01-12.fasta -intact genome_ch01-12.fasta.pass.list -all genome_ch01-12.fasta.out -t 8 -q -blast ./
perl Age_est.pl -RMout genome_ch01-12.fasta.out -genome genome_ch01-12.fasta -blast ./ -t 8 -iden_cut 100 -q

As it is already running with the option -q ( #112 ), I am wondering whether I need to re-run ( #109 ) the program. If so, do you have any advice on alternative parameters? The genome is about 1.1 Gb and is expected to contain many LTR-RTs.

Thanks again!

Kind regards, Willem

oushujun commented 1 year ago

Hi Willem,

Did you see the file 20000.LAI.LTR.ava.out get any updates? With 8 cores it should take a while for a 1.1 Gb genome.

Shujun
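One way to answer that question is to check the modification time of the blastn output. The sketch below assumes a `find` that supports `-mmin` (GNU or BSD). The function name `is_stale` and the 12-hour threshold are illustrative choices, not part of LTR_retriever.

```shell
#!/usr/bin/env bash
# Return 0 if FILE exists but has not been modified within the last MINUTES
# minutes; return 1 if it is missing or was updated more recently.
is_stale() {
  local file="$1" minutes="$2"
  [ -e "$file" ] || return 1
  # find prints the path only when the mtime is older than the threshold.
  [ -n "$(find "$file" -maxdepth 0 -mmin "+$minutes" -print)" ]
}

out="genome_ch01-12.fasta.out.20000.LAI.LTR.ava.out"   # file name from this thread
if is_stale "$out" 720; then
  echo "$out has not changed for over 12 hours"
  ls -l "$out"
elif [ -e "$out" ]; then
  echo "$out was updated within the last 12 hours"
else
  echo "$out not found in $(pwd)"
fi
```

A file that has not changed for many hours while blastn sits in a sleeping state suggests the job has stalled rather than being merely slow.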

wrengs commented 1 year ago

Dear Shujun,

The run was started on the 14th and the last update to 20000.LAI.LTR.ava.out was on the 19th.

All the best, Willem

wrengs commented 1 year ago

Dear Shujun,

In parallel to the run that I started on the 14th of March (described above), I also restarted from scratch on the 17th on another cluster, using fewer threads as suggested. The run started on the 17th finished without issues (<5 days for a 1.1 Gb genome), whereas the run started on the 14th still seems to be stuck.

I guess the run from the 14th might have been stopped due to heavy usage of that specific cluster, possibly combined with CPU overutilization by RepeatMasker. Alternatively, some files might have been corrupted or might be causing issues as a result of the initial crash (the RepeatMasker folder remained present). This is perhaps something for future users to keep an eye on.

Anyway, thank you kindly for your responsiveness and suggestions. I will close the issue with this comment.

Kind regards, Willem
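For future users, the two symptoms reported in this thread (a missing `.tbl` file and a leftover RepeatMasker folder) can be checked before re-running. This is a minimal sketch: the function name `check_rm_leftovers` is hypothetical, and it assumes RepeatMasker writes its outputs next to the genome and names its working directories `RM_*`.

```shell
#!/usr/bin/env bash
# Look for signs of an interrupted RepeatMasker run next to GENOME.
# Returns 0 if nothing suspicious is found, 1 otherwise.
check_rm_leftovers() {
  local genome="$1" status=0 dir
  # A finished RepeatMasker run leaves a non-empty summary table.
  if [ ! -s "$genome.tbl" ]; then
    echo "missing or empty: $genome.tbl"
    status=1
  fi
  # Assumption: working directories are named RM_<something> and are
  # removed when RepeatMasker completes normally.
  for dir in "$(dirname "$genome")"/RM_*; do
    [ -d "$dir" ] || continue
    echo "leftover RepeatMasker directory: $dir"
    status=1
  done
  return "$status"
}

check_rm_leftovers "genome_ch01-12.fasta" ||
  echo "previous run looks incomplete; consider restarting in a clean directory"
```

Restarting from scratch in a fresh directory, as was done for the run on the 17th, avoids reusing partial files left behind by a crashed run.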