Open omar-almolla209 opened 3 years ago
Are you sure there are no more lines before the "This is a bug" line? I would need those to locate the issue, as they describe the error context. Also, would you be OK with sharing some of your input files to help reproduce the problem? Thanks!
Thanks for the reply. So far I have worked around the problem by giving LTRdigest the output obtained from EDTA version 1.6. The above error appeared only when using the EDTA version 1.9 outputs. Unfortunately, I am not allowed to make the input files public, as they are in the process of being published. Anyway, with these changes everything works:
`tRNAs="/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/Hmm_trna"
proteins="/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/Hmm_trna"
genome='/home/omar-almulla/Desktop/Prunus_TE_project/INPUT/genomes/Prunus_avium_NCBI'
EDTA_output_1_6_path="/home/omar-almulla/Desktop/Prunus_TE_project/OUTPUTS/EDTA_output/Prunus_avium_NCBI/EDTA_1.6_output/Prunus_avium_NCBI_genomic.fna.EDTA.raw"
output="/home/omar-almulla/Desktop/Prunus_TE_project/OUTPUTS/LTRdigest_output/Prunus_avium_NCBI"
gt -j 4 ltrdigest -outfileprefix Prunus_avium_NCBI_ltr -trnas $tRNAs/plants-tRNAcat.fa -hmms $proteins/hmm* -seqfile $genome/Prunus_avium_NCBI_genomic.fna -matchdescstart $EDTA_output_1_6_path/Prunus_avium_NCBI_genomic.fna.LTR.intact.faSORTED.1.6.gff3 > $output/Prunus_avium_NCBI_digest.gff`
I see. I'll keep this one open, but I cannot do much without the test data. Unfortunately I am not familiar with EDTA or LTRpred, but perhaps that tool creates an unusual GFF3 structure?
Anyway, could you please still share the line you got before the "this is a bug, please report" line, if that's OK for you? It should contain something like "Assertion failed: ..." and would at least help us place the error somewhere, and also make this issue searchable for others with a similar problem.
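One way to collect the lines requested above is to save the program's diagnostic output to a file while running the same command. The sketch below assumes the messages of interest are written to stderr; `capture_stderr` and the log file name are hypothetical names chosen for illustration, not part of GenomeTools.

```shell
# Hypothetical helper: run a command, save its stderr to a log file,
# and append the exit status so that crashes can be told apart.
capture_stderr() {
  log=$1
  shift
  status=0
  "$@" 2> "$log" || status=$?
  echo "exit status: $status" >> "$log"
  return "$status"
}

# Usage with the call from this thread (paths as in the original command):
# capture_stderr ltrdigest.stderr.log \
#   gt -j 4 ltrdigest -outfileprefix Prunus_avium_ltr \
#     -trnas ./INPUT/Hmm_trna/plants-tRNA_cat.fa -hmms ./INPUT/Hmm_trna/hmm_* \
#     -seqfile ./INPUT/genomes/Prunus_avium_NCBI/Prunus_avium_NCBI.fna \
#     -matchdescstart ./OUTPUTS/EDTA_output/Prunus_avium_NCBI/EDTA_1.9_output/Prunus_avium_NCBI.fna.mod.EDTA.raw/*SORTED.gff3 \
#     > Prunus_avium_digest.gff
```

Messages such as "Aborted (core dumped)" or "Segmentation fault (core dumped)" are printed by the shell rather than by the program, so they will not end up in the log. The recorded exit status still distinguishes the two cases on Linux: 134 corresponds to an abort (SIGABRT) and 139 to a segmentation fault (SIGSEGV).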
My script:
gt -j 4 ltrdigest -outfileprefix Prunus_avium_ltr -trnas ./INPUT/Hmm_trna/plants-tRNA_cat.fa -hmms ./INPUT/Hmm_trna/hmm_* -seqfile ./INPUT/genomes/Prunus_avium_NCBI/Prunus_avium_NCBI.fna -matchdescstart ./OUTPUTS/EDTA_output/Prunus_avium_NCBI/EDTA_1.9_output/Prunus_avium_NCBI.fna.mod.EDTA.raw/*SORTED.gff3 > Prunus_avium_digest.gff
I could not reproduce the same error. Now this appears instead:
Segmentation fault (core dumped)
CM024352.1 EDTA repeat_region 191737 201094 . ? . ID=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000657;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA target_site_duplication 191737 191741 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA long_terminal_repeat 191742 193449 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA LTR_retrotransposon 191742 201089 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000186;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA long_terminal_repeat 199383 201089 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA target_site_duplication 201090 201094 . ? . Parent=repeat_region1;name=CM024352.1:191742..201089;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9959;mathod=structural;motif=TGCA;tsd=TCCAT
CM024352.1 EDTA repeat_region 1617430 1629426 . ? . ID=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000657;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA target_site_duplication 1617430 1617434 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000434;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA long_terminal_repeat 1617435 1619599 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000286;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA Gypsy_LTR_retrotransposon 1617435 1629421 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0002265;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA long_terminal_repeat 1627258 1629421 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000286;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA target_site_duplication 1629422 1629426 . ? . Parent=repeat_region2;name=CM024352.1:1617435..1629421;classification=LTR/Gypsy;sequence_ontology=SO:0000434;ltr_identity=1.0000;mathod=structural;motif=TGCA;tsd=CCAAT
CM024352.1 EDTA repeat_region 1946186 1956558 . ? . ID=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000657;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA target_site_duplication 1946186 1946190 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA long_terminal_repeat 1946191 1948386 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA LTR_retrotransposon 1946191 1956553 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000186;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA long_terminal_repeat 1954358 1956553 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000286;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
CM024352.1 EDTA target_site_duplication 1956554 1956558 . ? . Parent=repeat_region3;name=CM024352.1:1946191..1956553;classification=LTR/unknown;sequence_ontology=SO:0000434;ltr_identity=0.9991;mathod=structural;motif=TGCA;tsd=GTAAT
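Every record in this excerpt has `?` in the strand column. Whether that has anything to do with the crash is not established in this thread, but a quick tally of feature types and strand values makes it easy to compare the EDTA 1.6 and 1.9 annotation files side by side. The sketch below assumes the files on disk are tab-separated, as GFF3 requires (the excerpt pasted here shows spaces); `strand_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: count how often each (feature type, strand) pair
# occurs in a tab-separated GFF3 file, skipping comment/directive lines.
strand_summary() {
  awk -F '\t' '
    !/^#/ && NF >= 9 { count[$3 "\t" $7]++ }
    END { for (k in count) print count[k] "\t" k }
  ' "$1" | sort -k2,2 -k3,3
}

# Usage: run it once on each annotation file and compare the two tables.
# strand_summary path/to/EDTA_1.6.gff3
# strand_summary path/to/EDTA_1.9.gff3
```

Each output line holds a count, the feature type (column 3), and the strand value (column 7). A difference between the two files in either column would be a concrete detail to add to this report.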
I am afraid the GFF3 file alone is not enough for me to replicate the issue; I would also need the other files (sequence FASTA and tRNA files). Basically I need a way to trigger the error on my side with your command line call. Thanks!
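If sharing a small part of the data becomes possible, or simply to narrow the problem down locally, one option is to cut the inputs down to a single sequence and its annotations and check whether the crash still occurs. This is only a sketch: `subset_by_seqid` and the output prefix are hypothetical names, and it assumes the sequence ID in the GFF3 file equals the first word of the FASTA header.

```shell
# Hypothetical helper: extract one sequence and its annotation records.
# Arguments: sequence ID, FASTA file, GFF3 file, output prefix.
subset_by_seqid() {
  seqid=$1; fasta=$2; gff=$3; prefix=$4

  # Keep the FASTA record whose header starts with the requested ID.
  awk -v id="$seqid" '
    /^>/ { split(substr($0, 2), h, /[ \t]/); keep = (h[1] == id) }
    keep
  ' "$fasta" > "$prefix.fna"

  # Keep the version directive and all records located on that sequence.
  awk -F '\t' -v id="$seqid" '/^##gff-version/ || $1 == id' "$gff" > "$prefix.gff3"
}

# Usage (file names are placeholders):
# subset_by_seqid CM024352.1 genome.fna annotation.gff3 mini
```

The resulting `mini.fna` and `mini.gff3` can then be passed to the same `gt ltrdigest` call via `-seqfile` and as the GFF3 input. If the reduced files still trigger the error, they are a much smaller test case to inspect or share.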
Problem description
While using LTRdigest, this error always pops up (it also appears in RStudio when running LTRdigest via the LTRpred package):
This is a bug, please report it at https://github.com/genometools/genometools/issues
Please make sure you are running the latest release which can be found at http://genometools.org/pub/
You can check your version number with `gt -version`.
Aborted (core dumped)
Exact command line call triggering the problem
What GenomeTools version are you reporting an issue for (as output by `gt -version`)?
gt (GenomeTools) 1.6.2
Copyright (c) 2003-2016 G. Gremme, S. Steinbiss, S. Kurtz, and CONTRIBUTORS
Copyright (c) 2003-2016 Center for Bioinformatics, University of Hamburg
See LICENSE file or http://genometools.org/license.html for license details.
Used compiler: cc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Compile flags: -g -Wall -Wunused-parameter -pipe -fPIC -Wpointer-arith -Wno-unknown-pragmas -O3 -Werror
What operating system (e.g. Ubuntu, Mac OS X), OS version (e.g. 15.10, 10.11) and platform (e.g. x86_64) are you using?
Ubuntu 20.04