oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons; The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

cp: cannot stat ‘./sample.ssrmasked.fasta.retriever.scn.adj’: No such file or directory #104

Closed Biscuite-wzy closed 2 years ago

Biscuite-wzy commented 2 years ago

Hi Shujun, I ran into the following error when running LTR_retriever. The species has two versions of the genome, and I used one of them. I do not know how to resolve it.

cp: cannot stat ‘./sample.ssrmasked.fasta.retriever.scn.adj’: No such file or directory
CST No LTR-RT was found in your data.

Can you give me a hint on how to fix this error?

oushujun commented 2 years ago

You may try to rerun it on both of them. Please keep inputs and outputs in different folders.

Shujun


Biscuite-wzy commented 2 years ago

Isn't running separately in different folders the same as running only one?

oushujun commented 2 years ago

Yes, but it helps avoid mixing up inputs for different genomes.


oushujun commented 2 years ago

@Biscuite-wzy was the issue fixed?

Biscuite-wzy commented 2 years ago

Hi, I have not resolved it.

oushujun commented 2 years ago

Sorry to learn that the issue is not resolved. Can you confirm whether LTR_retriever is functioning properly on your server? You may use these small files to test it: https://github.com/oushujun/EDTA/tree/master/test

If the software works but fails on your genome, I would need to know which step causes the problem. A listing of the files created and their sizes would be helpful.

Shujun
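
For anyone gathering the file listing requested above, here is a minimal Python sketch that prints every file in a working directory with its size and flags the zero-byte ones. The function names `list_outputs` and `empty_files` are illustrative and not part of LTR_retriever, and the idea that an empty file points at the step that produced nothing is a heuristic assumed here, not something the program reports.

```python
import os


def list_outputs(directory):
    """Return (name, size_in_bytes) for each regular file in directory, sorted by name."""
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


def empty_files(entries):
    """Return the names of zero-byte files from a list_outputs() result."""
    return [name for name, size in entries if size == 0]


if __name__ == "__main__":
    # "." is a placeholder; point this at the LTR_retriever working directory.
    entries = list_outputs(".")
    for name, size in entries:
        print(f"{size:>12}  {name}")
    print("Empty files:", empty_files(entries))
```

Pasting the printed table into a report gives the same information as `ls -l`, with the empty intermediate files called out separately.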

thecgs commented 1 year ago

Hi Shujun,

I also ran into this error when running LTR_retriever.

The error is as follows:

CST Dependency checking: All passed!
CST LTR_retriever is starting from the Init step.
CST Start to convert inputs...
                Total candidates: 3714
                Total uniq candidates: 3279
CST Module 1: Start to clean up candidates...
                Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
                Sequences containing tandem repeats will be discarded.
CST 0 clean candidates remained
CST No LTR-RT was found in your data.
CST All analyses were finished!

cp: cannot stat ‘BL.top10.fa.retriever.scn.adj’: No such file or directory

My command line was as follows:

LTR_retriever -genome ../BL.top10.fa -infinder ../BL.top10.finder.scn -inharvest ../BL.top10.harvest.scn -v -threads 40

My results are as follows:

(base) [chenguisen@worker03 LTR_retriever]$ ls -lh ./
total 161M
-rw-r--r--. 1 chenguisen develop  44M Aug 17 2022 alluniRefprexp082813.676696
-rw-r--r--. 1 chenguisen develop  20K Aug 17 2022 alluniRefprexp082813.676696.pdb
-rw-r--r--. 1 chenguisen develop  15M Aug 17 2022 alluniRefprexp082813.676696.phr
-rw-r--r--. 1 chenguisen develop 801K Aug 17 2022 alluniRefprexp082813.676696.pin
-rw-r--r--. 1 chenguisen develop 1.2M Aug 17 2022 alluniRefprexp082813.676696.pot
-rw-r--r--. 1 chenguisen develop  36M Aug 17 2022 alluniRefprexp082813.676696.psq
-rw-r--r--. 1 chenguisen develop  16K Aug 17 2022 alluniRefprexp082813.676696.ptf
-rw-r--r--. 1 chenguisen develop 401K Aug 17 2022 alluniRefprexp082813.676696.pto
lrwxrwxrwx. 1 chenguisen develop   33 Aug 17 2022 BL.top10.fa -> ../../mapping/dotplot/BL.top10.fa
-rw-r--r--. 1 chenguisen develop  29M Aug 17 2022 BL.top10.fa.ltrTE.fa
-rw-r--r--. 1 chenguisen develop    0 Aug 17 2022 BL.top10.fa.ltrTE.stg1
-rw-r--r--. 1 chenguisen develop    0 Aug 17 2022 BL.top10.fa.nmtf.pass.list
-rw-r--r--. 1 chenguisen develop    0 Aug 17 2022 BL.top10.fa.prelib
-rw-r--r--. 1 chenguisen develop 302K Aug 17 2022 BL.top10.fa.retriever.scn
-rw-r--r--. 1 chenguisen develop 186K Aug 17 2022 BL.top10.fa.retriever.scn.full
-rw-r--r--. 1 chenguisen develop 428K Aug 17 2022 BL.top10.fa.retriever.scn.list
-rw-r--r--. 1 chenguisen develop  895 Aug 17 2022 LTR_retriever.e260261
-rw-r--r--. 1 chenguisen develop 1.4K Aug 17 2022 LTR_retriever.o260261
-rw-r--r--. 1 chenguisen develop 1.6M Aug 17 2022 Tpases020812DNA.676696
-rw-r--r--. 1 chenguisen develop  20K Aug 17 2022 Tpases020812DNA.676696.pdb
-rw-r--r--. 1 chenguisen develop 340K Aug 17 2022 Tpases020812DNA.676696.phr
-rw-r--r--. 1 chenguisen develop  19K Aug 17 2022 Tpases020812DNA.676696.pin
-rw-r--r--. 1 chenguisen develop  28K Aug 17 2022 Tpases020812DNA.676696.pot
-rw-r--r--. 1 chenguisen develop 1.4M Aug 17 2022 Tpases020812DNA.676696.psq
-rw-r--r--. 1 chenguisen develop  16K Aug 17 2022 Tpases020812DNA.676696.ptf
-rw-r--r--. 1 chenguisen develop 9.2K Aug 17 2022 Tpases020812DNA.676696.pto
-rw-r--r--. 1 chenguisen develop 2.0M Aug 17 2022 Tpases020812LINE.676696
-rw-r--r--. 1 chenguisen develop  20K Aug 17 2022 Tpases020812LINE.676696.pdb
-rw-r--r--. 1 chenguisen develop 306K Aug 17 2022 Tpases020812LINE.676696.phr
-rw-r--r--. 1 chenguisen develop  19K Aug 17 2022 Tpases020812LINE.676696.pin
-rw-r--r--. 1 chenguisen develop  28K Aug 17 2022 Tpases020812LINE.676696.pot
-rw-r--r--. 1 chenguisen develop 1.8M Aug 17 2022 Tpases020812LINE.676696.psq
-rw-r--r--. 1 chenguisen develop  16K Aug 17 2022 Tpases020812LINE.676696.ptf
-rw-r--r--. 1 chenguisen develop 9.2K Aug 17 2022 Tpases020812LINE.676696.pto

Can you give me a hint on how to fix this error?

Best regards, Chen Guisen

thecgs commented 1 year ago


Update:

This problem was caused by the trf software path being absent from the path file. My RepeatMasker is installed in a conda virtual environment, but LTR_retriever does not use the trf in the RepeatMasker path. Instead, it looks up the path directly from /bin/user/env.

I hope this helps.
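
A quick way to check which `trf`, if any, the current shell environment resolves is sketched below in Python. It is only an illustration: the helper name `find_tool` is made up for this note, the executable name `trf` is an assumption (some installs keep a versioned binary name), and the check shows what is on `PATH`, not what LTR_retriever itself reads from its configuration.

```python
import shutil


def find_tool(name):
    """Return the path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    # "trf" is the assumed executable name; pass a different name if your
    # install uses a versioned binary.
    path = find_tool("trf")
    if path is None:
        print("trf was not found on PATH in this environment")
    else:
        print("trf resolves to:", path)
```

Run it inside the same conda environment and job script that launch LTR_retriever, since `PATH` can differ between an interactive shell and a scheduler job.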

oushujun commented 1 year ago

Thanks for the update. This is strange because it passed the dependency check. If no trf were available, it should have yielded an error.

oushujun commented 1 year ago

Hi Guisen,

This is not an error per se, but it suggests the program could not find any intact element in this genome. You may check the defalse file for the status of each candidate.

Shujun


Zoe133 commented 4 months ago

Hi shujun,

I ran into this problem when running LTR_retriever, too. Has it been resolved? Part of the log is as follows:

Fri Mar 8 18:45:39 CST 2024 Dependency checking: All passed!
Fri Mar 8 18:46:29 CST 2024 LTR_retriever is starting from the Init step.
Fri Mar 8 18:46:29 CST 2024 Start to convert inputs...
                Total candidates: 19902
                Total uniq candidates: 19902

Fri Mar 8 18:46:36 CST 2024 0 clean candidates remained

cp: cannot stat 'seq.fa.retriever.scn.adj': No such file or directory
Fri Mar 8 18:46:36 CST 2024 No LTR-RT was found in your data.

Fri Mar 8 18:46:36 CST 2024 All analyses were finished!

I hope someone who has solved this problem can share the causes and solutions. Thanks a lot!

oushujun commented 3 months ago

Hello,

What version were you using? Did you install it through conda? You can also check it using a small genome like Arabidopsis.

Thanks, Shujun


Zoe133 commented 3 months ago


Thank you for your reply!

I installed LTR_retriever version 2.9.9 manually. I have run LTR_retriever on Arabidopsis thaliana, and the LTR_retriever.log showed the same result:

Parameters: -repeatmasker /home/program/RepeatMasker -blastplus /home/program/rmblast-2.14.0/bin -cdhit_path /home/program/cd-hit-v4.8.1-2019-0228 -trf_path /home/program/trf -genome seq.fa -inharvest /home/Identify_repeats/test/test-out/RM_3968110.MonMar111859002024/LTR_216630.TueMar120050012024/raw-struct-results.txt -noanno -threads 40

Tue Mar 12 00:52:41 CST 2024 Dependency checking: All passed!
Tue Mar 12 00:52:52 CST 2024 LTR_retriever is starting from the Init step.
Tue Mar 12 00:52:52 CST 2024 Start to convert inputs...
                Total candidates: 1787
                Total uniq candidates: 1787

Tue Mar 12 00:52:53 CST 2024 Module 1: Start to clean up candidates...
                Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
                Sequences containing tandem repeats will be discarded.

    Usage: perl cleanup.pl -f sample.fa [options] > sample.cln.fa 
Options:
    -misschar   n   Define the letter representing unknown sequences; case insensitive; default: n
    -Nscreen    [0|1]   Enable (1) or disable (0) the -nc parameter; default: 1
    -nc     [int]   Ambuguous sequence len cutoff; discard the entire sequence if > this number; default: 0
    -nr     [0-1]   Ambuguous sequence percentage cutoff; discard the entire sequence if > this number; default: 1
    -minlen     [int]   Minimum sequence length filter after clean up; default: 100 (bp)
    -cleanN     [0|1]   Retain (0) or remove (1) the -misschar taget in output sequence; default: 0
    -trf        [0|1]   Enable (1) or disable (0) tandem repeat finder (trf); default: 1
    -trf_path   path    Path to the trf program

Tue Mar 12 00:52:53 CST 2024 0 clean candidates remained

cp: cannot stat 'seq.fa.retriever.scn.adj': No such file or directory
Tue Mar 12 00:52:54 CST 2024 No LTR-RT was found in your data.

Tue Mar 12 00:52:54 CST 2024 All analyses were finished!

Actually, the command I ran was: nohup RepeatModeler -threads 40 -database ../Arabidopsis -LTRStruct > out.log &. I don't know what the problem is. Could you give me some advice? I would appreciate it very much!

oushujun commented 3 months ago

That seems to be a RepeatModeler problem. Please contact its authors for support; browsing their issue tracker may also help. You may also try installing LTR_retriever via conda.

Thanks! Shujun

JiyangChang commented 3 weeks ago

Hi Shujun,

I installed the latest version of LTR_retriever via bioconda and still have this problem...

Just an update: it's weird, because I ran the program three months ago and it worked great, but now it doesn't work on the new genome (all the parameters are the same)... I checked the link above but still can't find a solution...

############################
### LTR_retriever v2.9.9 ###
############################

Contributors: Shujun Ou, Ning Jiang

For LTR_retriever, please cite:

    Ou S and Jiang N (2018). LTR_retriever: A Highly Accurate and Sensitive Program for Identification of Long Terminal Repeat Retrotransposons. Plant Physiol. 176(2): 1410-1422.

For LAI, please cite:

    Ou S, Chen J, Jiang N (2018). Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 2018;46(21):e126.

Parameters: -genome aes_SubA.fa -inharvest aes_SubA.harvest_combined.scn -threads 4

Tue Jun 11 07:49:35 PM CEST 2024    Dependency checking: All passed!
Tue Jun 11 07:49:40 PM CEST 2024    LTR_retriever is starting from the Init step.
Tue Jun 11 07:49:41 PM CEST 2024    The longest sequence ID in the genome contains 17 characters, which is longer than the limit (13)
                Trying to reformat seq IDs...
                Attempt 1...
                Attempt 2...
Tue Jun 11 07:49:45 PM CEST 2024    Seq ID conversion successful!

Tue Jun 11 07:49:48 PM CEST 2024    Start to convert inputs...
                Total candidates: 11136
                Total uniq candidates: 11121

Tue Jun 11 07:49:51 PM CEST 2024    Module 1: Start to clean up candidates...
                Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
                Sequences containing tandem repeats will be discarded.

        Usage: perl cleanup.pl -f sample.fa [options] > sample.cln.fa 
    Options:
        -misschar   n   Define the letter representing unknown sequences; case insensitive; default: n
        -Nscreen    [0|1]   Enable (1) or disable (0) the -nc parameter; default: 1
        -nc     [int]   Ambuguous sequence len cutoff; discard the entire sequence if > this number; default: 0
        -nr     [0-1]   Ambuguous sequence percentage cutoff; discard the entire sequence if > this number; default: 1
        -minlen     [int]   Minimum sequence length filter after clean up; default: 100 (bp)
        -cleanN     [0|1]   Retain (0) or remove (1) the -misschar taget in output sequence; default: 0
        -trf        [0|1]   Enable (1) or disable (0) tandem repeat finder (trf); default: 1
        -trf_path   path    Path to the trf program

Tue Jun 11 07:49:51 PM CEST 2024    0 clean candidates remained

cp: cannot stat 'aes_SubA.fa.mod.retriever.scn.adj': No such file or directory
Tue Jun 11 07:49:51 PM CEST 2024    No LTR-RT was found in your data.

Tue Jun 11 07:49:51 PM CEST 2024    All analyses were finished!

Hope this could be fixed soon ; )
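
The logs in this thread share a recognizable pattern: the cleanup.pl usage text is printed during Module 1 and the run then reports 0 clean candidates. The Python sketch below scans a saved log for those symptoms. The function name `diagnose` and the tag names are invented for this illustration, and reading the usage text as a sign that the cleanup step did not actually run is an assumption drawn from the logs above, not documented LTR_retriever behavior.

```python
import re
import sys


def diagnose(log_text):
    """Return a list of symptom tags found in an LTR_retriever log (heuristic)."""
    symptoms = []
    # The usage banner appearing in the log suggests cleanup.pl printed help text.
    if "Usage: perl cleanup.pl" in log_text:
        symptoms.append("cleanup_usage_printed")
    # Match exactly zero candidates; the lookbehind avoids matching "10 clean ...".
    if re.search(r"(?<!\d)0 clean candidates remained", log_text):
        symptoms.append("zero_clean_candidates")
    # Message seen when sequence IDs exceed the length limit and get reformatted.
    if "longer than the limit" in log_text:
        symptoms.append("seq_ids_reformatted")
    return symptoms


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to a saved log file as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        print(diagnose(fh.read()))
```

Seeing the first two tags together would point toward the cleanup step or its inputs, while the third tag alone only notes that ID reformatting took place.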

thecgs commented 3 weeks ago

Hi shujun,

I ran into this problem again and found the cause: the sequence IDs in the genome file were longer than 13 characters. The software converted them, but the conversion does not seem to have been correct. I rewrote that part and raised the ID length limit, which appears to solve the problem. I still don’t understand why there is a limit on ID length. Does some dependency impose it?

Thanks! Guisen
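
To see in advance whether a genome would trigger the ID conversion, the FASTA headers can be checked against the 13-character limit reported in the log. The Python sketch below does that. The function name `long_ids` is made up for this note, and treating the ID as the first whitespace-delimited token after `>` is an assumption about how the limit is applied.

```python
import sys


def long_ids(fasta_lines, limit=13):
    """Return (seq_id, length) for every FASTA header ID longer than limit."""
    result = []
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumed ID definition: first whitespace-delimited token after ">".
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if len(seq_id) > limit:
                result.append((seq_id, len(seq_id)))
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the genome FASTA path as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for seq_id, length in long_ids(fh):
            print(f"{length}\t{seq_id}")
```

If the script prints anything, shortening those IDs before running the upstream LTR finders keeps the candidate files and the genome consistent and avoids relying on the automatic conversion.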

JiyangChang commented 3 weeks ago


Indeed, I was just about to post this, and it is great to see that Guisen has already reported this ID length issue ; )