Janz-Sicau closed this issue 6 months ago
Was this a surprising result? Some genomes do not have LTR elements.
Shujun
Dear Shujun,
Thank you for your reply. I have read #135 and #139, which you pointed me to, and I don't think my problem matches either of those cases. I suspect the problem is in the ltr_finder_parallel step, so I ran it again. The run log and result files are below; please help me find where the problem is. To make uploading easier, I changed the "scn" suffix to "txt" so that you can view the files.
Thank you so much!
Jian
Command line:
perl /wkdir/software/LTR_FINDER_parallel-master/LTR_FINDER_parallel -w 2 -D 15000 -d 1000 -L 7000 -l 100 -p 20 -threads 10 -C -M 0.9 -seq WNall.fa > WNall-ltrfinder.scn
Attached files: error.txt, WNall-ltrfinder.txt, WNall.fa.finder.combine.txt
Hello Jian,
Please test with another genome, such as Arabidopsis, to make sure the program is working properly. If it works on Arabidopsis, I'll need a small file that can reproduce your issue to debug.
Thanks, Shujun
Dear Shujun,
Following your suggestion, I tested the Arabidopsis ColCEN genome, and ltr_finder_parallel worked properly on it, which is interesting. The link to the genome I am using is below; when you have time, please try it and see what the problem is. I am also wondering what causes this. My guess is that my chromosome sequences are too large: each chromosome is more than 700Mb. I would also like to ask whether a large number of unassembled scaffold sequences in the genome file could cause errors.
WN genome link: https://www.ncbi.nlm.nih.gov/nuccore/JADQCU000000000
Looking forward to your reply. Thank you,
Jian
Oh, I see. Yes, the program is currently limited to sequences shorter than 100MB, and I haven't figured out why. Let me know if you find a solution.
Shujun
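One way to check whether a genome is affected by the size limit mentioned above is to list the sequences that reach it. The following is a minimal sketch, not part of LTR_FINDER_parallel or LTR_retriever. It assumes that "100MB" means 100,000,000 bases and that the input is a plain, uncompressed FASTA file; the function names `fasta_lengths` and `oversized` are hypothetical.

```python
import io


def fasta_lengths(handle):
    """Yield (sequence_id, length) for every record in a FASTA stream."""
    seq_id, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, length
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            length = 0
        elif seq_id is not None:
            length += len(line)
    if seq_id is not None:
        yield seq_id, length


def oversized(handle, limit=100_000_000):
    """Return records at or above `limit` bases (assumed reading of '100MB')."""
    return [(sid, n) for sid, n in fasta_lengths(handle) if n >= limit]


# Tiny in-memory demo; for a real genome, pass open("WNall.fa") instead.
demo = io.StringIO(">chr1 demo\nACGTACGT\n>scaffold1\nACG\n")
print(oversized(demo, limit=5))
```

If this reports any records for the real genome, those sequences are the ones to look at first, for example by testing them separately from the shorter scaffolds.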
Hello Shujun,
After switching to another genome, ltr_finder_parallel and ltr_harvest run normally. However, I encountered an error in ltr_retriever. The program is still running, and I do not know whether the error will affect the final result. Could you please take a look at what the problem is?
Thank you,
Jian
Command lines:
perl /wkdir/software/LTR_FINDER_parallel-master/LTR_FINDER_parallel -w 2 -D 15000 -d 1000 -L 7000 -l 100 -p 20 -t 10 -C -M 0.9 -seq Lo7.genome.fasta > Lo7.genome.fasta.finder.combine.scn
gt suffixerator -db ../Lo7.genome.fasta -indexname Lo7all-index -tis -suf -lcp -des -ssp -sds -dna
gt ltrharvest -index Lo7all-index -minlenltr 100 -maxlenltr 7000 -mintsd 4 -maxtsd 6 -motif TGCA -motifmis 1 -similar 90 -vic 10 -seed 20 -seqids yes > Lo7all.harvest.scn
/wkdir/zhoujian/LTR_retriever-2.9.5/LTR_retriever -genome ../Lo7.genome.fasta -inharvest ../ltr_harvest/Lo7all.harvest.scn -infinder ../ltr_finder/Lo7.genome.fasta.finder.combine.scn -threads 20
error log: ##########################
##########################
Contributors: Shujun Ou, Ning Jiang
For LTR_retriever, please cite:
Ou S and Jiang N (2018). LTR_retriever: A Highly Accurate and Sensitive Program for Identification of Long Terminal Repeat Retrotransposons. Plant Physiol. 176(2): 1410-1422.
For LAI, please cite:
Ou S, Chen J, Jiang N (2018). Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 2018;46(21):e126.
Parameters: -genome ../Lo7.genome.fasta -inharvest ../ltr_harvest/Lo7all.harvest.scn -infinder ../ltr_finder/Lo7.genome.fasta.finder.combine.scn -threads 20
Tue Aug 22 19:22:22 CST 2023 Dependency checking: All passed!
Tue Aug 22 19:22:34 CST 2023 LTR_retriever is starting from the Init step.
Tue Aug 22 19:23:29 CST 2023 Start to convert inputs...
Total candidates: 151525
Total uniq candidates: 151525
Tue Aug 22 19:24:27 CST 2023 Module 1: Start to clean up candidates...
Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
Sequences containing tandem repeats will be discarded.
Tue Aug 22 19:24:43 CST 2023 131323 clean candidates remained
Tue Aug 22 19:24:43 CST 2023 Modules 2-5: Start to analyze the structure of candidates...
The terminal motif, TSD, boundary, orientation, age, and superfamily will be identified in this step.
awk: fatal: cannot open file `Lo7.genome.fasta.retriever.scn.extend.fa.rexdb.cls.tsv' for reading: No such file or directory
Wed Aug 23 04:06:11 CST 2023 Intact LTR-RT found: 52997
Wed Aug 23 08:50:51 CST 2023 Module 6: Start to analyze truncated LTR-RTs...
Truncated LTR-RTs without the intact version will be retained in the LTR-RT library.
Use -notrunc if you don't want to keep them.
Wed Aug 23 08:50:52 CST 2023 15579 truncated LTR-RTs found
In fact, I encountered the same error when running the test file, but the results were still produced in the end. The run log for the test file is below: text_errorlog.txt
This bug should be fixed in the latest version. Please update and try again. Please reopen the issue if you find it not fixed. Thank you!
Shujun
Hello, I ran into the following problem when using the software. Could you help me find where the problem is? Thank you.
##########################
LTR_retriever v2.9.1
##########################
Contributors: Shujun Ou, Ning Jiang
For LTR_retriever, please cite:
For LAI, please cite:
Parameters: -genome Weining.genome.fa -infinder WNall-ltrfinder.scn -threads 20
Fri Aug 18 22:10:15 CST 2023 Dependency checking: All passed!
Fri Aug 18 22:10:39 CST 2023 LTR_retriever is starting from the Init step.
Fri Aug 18 22:11:15 CST 2023 The longest sequence ID in the genome contains 80 characters, which is longer than the limit (15)
Trying to reformat seq IDs...
Attempt 1...
Fri Aug 18 22:12:04 CST 2023 Seq ID conversion successful!
2023年 08月 18日 星期五 22:12:04 CST Start to convert inputs...
ERROR: No candidate is found in the file(s) you specified.
ltr_finder command:
perl /wkdir/software/LTR_FINDER_parallel-master/LTR_FINDER_parallel -w 2 -D 15000 -d 1000 -L 7000 -l 100 -p 20 -t 10 -C -M 0.9 -seq ../Weining.genome.fa > WNall-ltrfinder.scn
ltr_retriever command:
nohup /wkdir/software/LTR_retriever/LTR_retriever -genome Weining.genome.fa -infinder WNall-ltrfinder.scn -threads 20 > WN_ltrretriever.file 2>&1
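The log above reports two things that can be checked before LTR_retriever is run: sequence IDs longer than the limit (15), and an input file with no candidates. The sketch below shows one possible pre-check and is not how LTR_retriever itself does it. It rests on two assumptions: that the ID length is measured on the whole FASTA header line, and that candidate rows in a .scn file are the non-empty lines that do not start with "#". The function names are hypothetical, and the demo rows are placeholders rather than real output.

```python
import io


def long_headers(fasta_handle, limit=15):
    """Return FASTA header lines (without '>') longer than `limit` characters."""
    return [
        line[1:].strip()
        for line in fasta_handle
        if line.startswith(">") and len(line[1:].strip()) > limit
    ]


def count_candidates(scn_handle):
    """Count non-empty lines that do not start with '#' (assumed candidate rows)."""
    return sum(
        1
        for line in scn_handle
        if line.strip() and not line.lstrip().startswith("#")
    )


# In-memory demo with placeholder content; for real files, pass
# open("Weining.genome.fa") and open("WNall-ltrfinder.scn") instead.
fasta_demo = io.StringIO(">chr1\nACGT\n>a_very_long_sequence_identifier\nAC\n")
scn_demo = io.StringIO("# comment line\n\n1 2 3\n4 5 6\n")
print(long_headers(fasta_demo))
print(count_candidates(scn_demo))
```

A candidate count of zero for WNall-ltrfinder.scn would mean the problem is already present in the LTR_FINDER_parallel output, before LTR_retriever reads it.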