In transcript g86.t1 two UTR/CDS features are overlapping. Not allowed by definition. at ~/software/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 759036.

wangjie07070910 commented 11 months ago

Hello, I am trying out this program. My command is as follows:

galba.pl --genome=${genome_file} --prot_seq=${protein_file} --threads 40

My error is as follows:

ERROR in file ~/software/GALBA/scripts/galba.pl at line 5340
Failed to execute: cat augustus.hints.gff | perl -ne 'if(m/\tAUGUSTUS\t/) {print $_;}' | perl ~/software/Augustus/scripts/gtf2gff.pl --printExon --out=augustus.hints.tmp.gtf 2> errors/gtf2gff.augustus.hints.gtf.stderr

And gtf2gff.augustus.hints.gtf.stderr shows:

In transcript g86.t1 two UTR/CDS features are overlapping. Not allowed by definition. at ~/software/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 759036.
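For anyone who wants to pull the transcript ID out of the stderr file programmatically, here is a minimal sketch. The pattern is modelled only on the message quoted in this thread, so the exact wording is an assumption, and `overlapping_transcripts` is a hypothetical helper name, not part of GALBA or AUGUSTUS.

```python
import re

# Pattern modelled on the message quoted in this issue; the exact wording
# is an assumption based on that example.
OVERLAP_RE = re.compile(
    r"In transcript (?P<tx>\S+) two UTR/CDS features are overlapping"
)


def overlapping_transcripts(stderr_text):
    """Return transcript IDs named in overlap messages, in order of appearance."""
    return [m.group("tx") for m in OVERLAP_RE.finditer(stderr_text)]


example = (
    "In transcript g86.t1 two UTR/CDS features are overlapping. "
    "Not allowed by definition. at ~/software/Augustus/scripts/gtf2gff.pl "
    "line 182, <STDIN> line 759036."
)
print(overlapping_transcripts(example))  # prints ['g86.t1']
```

In practice the text would be read from errors/gtf2gff.augustus.hints.gtf.stderr instead of the inline example string.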

KatharinaHoff commented 11 months ago

Could you provide your augustus.hints.gff? (Send me a link or the file via email to katharina.hoff at uni-greifswald.de.) I will look into it then.

KatharinaHoff commented 8 months ago

I hope this problem is solved by commit https://github.com/Gaius-Augustus/GALBA/commit/d8aaf4b93738da1a9afaa1035dfbd97bd18e9227

kullrich commented 5 months ago

Hi, I have the same problem with the latest galba.sif file.

less errors/gtf2gff.augustus.hints.gtf.stderr
In transcript g706.t1 two UTR/CDS features are overlapping. Not allowed by definition. at /opt/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 128068.

Is there a workaround?

Thank you in anticipation

Best regards

Kristian

kleinjoel commented 3 months ago

Hi all,

I also ran into the same issue for 2 of the genomes that I tried to annotate using GALBA:

In transcript g411.t1 two UTR/CDS features are overlapping. Not allowed by definition. at /opt/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 574857.

For 4 other genomes it runs just fine with exactly the same settings and protein input. I'm also wondering if there is a workaround, e.g., removing the offending transcript from the augustus.hints.gff file.
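The idea of removing the offending transcript could be sketched roughly as below. This is an untested workaround, not something the maintainers have confirmed: it assumes the file is tab-separated (as the `\tAUGUSTUS\t` filter in the failing command suggests), that the transcript line carries the bare ID in its last column, and that feature lines carry `transcript_id "..."`. `drop_transcript` is a hypothetical helper name. The gene line is kept, so a gene may be left without any transcript, and it is not clear from this thread whether gtf2gff.pl accepts that.

```python
import re


def drop_transcript(lines, transcript_id):
    """Yield GFF lines, skipping feature lines that belong to transcript_id.

    Comment lines pass through unchanged. The gene line is kept, which may
    leave a gene without transcripts.
    """
    attr_re = re.compile(r'transcript_id "%s";' % re.escape(transcript_id))
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        last = line.rstrip("\n").split("\t")[-1]
        # The transcript line has the bare ID in the last column; all other
        # features name the transcript in a transcript_id attribute.
        if last == transcript_id or attr_re.search(last):
            continue
        yield line


sample = [
    "# start gene g411\n",
    "CWNJ01000583\tAUGUSTUS\tgene\t1\t504\t0.56\t+\t.\tg411\n",
    "CWNJ01000583\tAUGUSTUS\ttranscript\t1\t504\t0.56\t+\t.\tg411.t1\n",
    'CWNJ01000583\tAUGUSTUS\tCDS\t1\t501\t0.56\t+\t0\ttranscript_id "g411.t1"; gene_id "g411";\n',
    'CWNJ01000584\tAUGUSTUS\tCDS\t668\t1711\t1\t-\t0\ttranscript_id "g412.t1"; gene_id "g412";\n',
]
kept = list(drop_transcript(sample, "g411.t1"))
print(len(kept))  # prints 3
```

For a real file, the lines would come from `open("augustus.hints.gff")` and the result would be written to a new file, keeping the original as a backup.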

Best Regards,

Joel

KatharinaHoff commented 3 months ago

Did you pull the singularity image within the last 3 months?

kleinjoel commented 3 months ago

Hi Katharina,

Thanks for your quick reply. I double-checked and got this information on the build:

$ singularity inspect --labels galba.sif
org.label-schema.build-arch: amd64
org.label-schema.build-date: Wednesday_15_May_2024_13:19:23_CEST
org.label-schema.schema-version: 1.0
org.label-schema.usage.singularity.deffile.bootstrap: docker
org.label-schema.usage.singularity.deffile.from: katharinahoff/galba-notebook:latest
org.label-schema.usage.singularity.version: 3.8.3

KatharinaHoff commented 3 months ago

Ok, then it’s an open problem. Thank you for clarifying!


kleinjoel commented 3 months ago

Dear Katharina,

Thanks for looking into it. In case it helps, I located the offending gene (g411) in the augustus.hints.gff file and copied its entry below, together with those of the 2 adjacent genes.

# start gene g410
CWNJ01000582    AUGUSTUS    gene    76  988 0.44    +   .   g410
CWNJ01000582    AUGUSTUS    transcript  76  988 0.44    +   .   g410.t1
CWNJ01000582    AUGUSTUS    start_codon 76  78  .   +   0   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    initial 76  389 0.48    +   0   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    internal    674 829 0.8 +   1   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    terminal    937 988 0.81    +   1   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    intron  390 673 0.85    +   .   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    intron  830 936 0.8 +   .   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    CDS 76  389 0.48    +   0   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    CDS 674 829 0.8 +   1   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    CDS 937 985 0.81    +   1   transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582    AUGUSTUS    stop_codon  986 988 .   +   0   transcript_id "g410.t1"; gene_id "g410";
# coding sequence = [atgggttgttcatttgcagatggaatatacatgatggaagttgaccgcattctaagacctggtggttattgggtgcttt
# cgggtcctcctattggttggaaggttcattacaaagcctggcagcgatctaaggaggaccttcaggaagaacagaataagattgaagagactgctaag
# ctcctttgctgggagaaggtctctgagaagaatgaaattgccatttggcaaaagagggtagactctgtttcatgtcgtcgtagacaaatagattccag
# tgtaaaattctgcaaatcaagggatgttgatgatgtctggtataagaaaatggaggcctgcattactcctggtcctaaaggttctggtcataatctga
# aaccttttccagagaggctatatgcaatccctcctagaattgctagtggctctgctcctggagtttctgtggagacataccaggatgacaacaagaac
# tattcaatctcccaagttatgggtcatgaatgttgtgccaactattgctga]
# protein sequence = [MGCSFADGIYMMEVDRILRPGGYWVLSGPPIGWKVHYKAWQRSKEDLQEEQNKIEETAKLLCWEKVSEKNEIAIWQKR
# VDSVSCRRRQIDSSVKFCKSRDVDDVWYKKMEACITPGPKGSGHNLKPFPERLYAIPPRIASGSAPGVSVETYQDDNKNYSISQVMGHECCANYC]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 0
# CDS exons: 0/3
# CDS introns: 0/2
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 1
#     RM:   1 
# end gene g410
# start gene g411
CWNJ01000583    AUGUSTUS    gene    1   504 0.56    +   .   g411
CWNJ01000583    AUGUSTUS    transcript  1   504 0.56    +   .   g411.t1
CWNJ01000583    AUGUSTUS    terminal    1   504 0.56    +   0   transcript_id "g411.t1"; gene_id "g411";
CWNJ01000583    AUGUSTUS    CDS 1   501 0.56    +   0   transcript_id "g411.t1"; gene_id "g411";
CWNJ01000583    AUGUSTUS    stop_codon  502 504 .   +   0   transcript_id "g411.t1"; gene_id "g411";
# coding sequence = [acaagtgaagctgtgaatgcatactattcagctgctttgatgggtatgtcatatggtgacagagaccttgttgcaattg
# gatcaacactgttagcattggaaatgaaagcagcacaaacatggtggcatgtgaaagatggggacagtaacatgtatggaaaagacttcacaaaggaa
# aacagaatagtgggaatcctgtgggctaacaagagagatagtgcactatggtgggcctcagctgagtgcagagagtgtaggcttagcattcagctatt
# gcctttgttgcctatttctgaagaactattttctaatgtggagtatgtgaagaagcttgtggaatggacagagcctgctactgaagaaggatggaagg
# gatttttgtatgcattggaagggatttatgataaagaggatgctttggagaagatcagaaagttgacagaatttgatgatggaaactcattcacaaat
# ctcttgtggtggattcatagcagagggggttga]
# protein sequence = [TSEAVNAYYSAALMGMSYGDRDLVAIGSTLLALEMKAAQTWWHVKDGDSNMYGKDFTKENRIVGILWANKRDSALWWA
# SAECRECRLSIQLLPLLPISEELFSNVEYVKKLVEWTEPATEEGWKGFLYALEGIYDKEDALEKIRKLTEFDDGNSFTNLLWWIHSRGG]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 0
# CDS exons: 0/1
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 1
#     RM:   1 
# end gene g411
# start gene g412
CWNJ01000584    AUGUSTUS    gene    665 1711    2.92    -   .   g412
CWNJ01000584    AUGUSTUS    transcript  665 1711    1   -   .   g412.t1
CWNJ01000584    AUGUSTUS    stop_codon  665 667 .   -   0   transcript_id "g412.t1"; gene_id "g412";
CWNJ01000584    AUGUSTUS    terminal    665 1711    1   -   0   transcript_id "g412.t1"; gene_id "g412";
CWNJ01000584    AUGUSTUS    CDS 668 1711    1   -   0   transcript_id "g412.t1"; gene_id "g412";
# coding sequence = [tgcagctatggcggccacataatgccacgcccacatgataagtgtctctgctatgtcggcggcgacacccgaatccttg
# tcgttgatcggcattcctctctcaaagacctttgttcacgtctgtcttgtaccctcctccatggaaggcccttcaacctcaagtaccagctacccaat
# gaagatctcgacaatctgatatcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaaaacc
# tgcacgtttgaggctttttctattcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttcgtgg
# atgctcttaacaactctgggattctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtgattct
# agcaacaatttggaggctcaggctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcacctat
# ggtggagaacagttcctcatacggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagtaggc
# tgcagcaagagcagaggcctgggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtactttgtct
# gctcctatgccatcaattcctactacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacgagag
# attagatcagggagcacctgctggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggtggag
# gctttggagctggtggcggttttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [CSYGGHIMPRPHDKCLCYVGGDTRILVVDRHSSLKDLCSRLSCTLLHGRPFNLKYQLPNEDLDNLISVSTDEDLDNMI
# EEHDRITAAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSATVGCLVNLDGVLASDSSNNLEAQAAESLADNTKQD
# KNLPDVHSMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTFGANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIV
# AGDNMNRVISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGGGAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
#      C:   1 
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 1
#      C:   1 (250025_250025)
# incompatible hint groups: 5
#      C:   1 (637779_637779)
#      P:   3 
#     RM:   1 
CWNJ01000584    AUGUSTUS    transcript  665 1519    0.99    -   .   g412.t2
CWNJ01000584    AUGUSTUS    stop_codon  665 667 .   -   0   transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584    AUGUSTUS    single  665 1519    0.99    -   0   transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584    AUGUSTUS    CDS 668 1519    0.99    -   0   transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584    AUGUSTUS    start_codon 1517    1519    .   -   0   transcript_id "g412.t2"; gene_id "g412";
# coding sequence = [ctgatatcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaa
# aacctgcacgtttgaggctttttctattcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttc
# gtggatgctcttaacaactctgggattctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtga
# ttctagcaacaatttggaggctcaggctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcac
# ctatggtggagaacagttcctcatacggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagt
# aggctgcagcaagagcagaggcctgggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtacttt
# gtctgctcctatgccatcaattcctactacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacg
# agagattagatcagggagcacctgctggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggt
# ggaggctttggagctggtggcggttttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [LISVSTDEDLDNMIEEHDRITAAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSAT
# VGCLVNLDGVLASDSSNNLEAQAAESLADNTKQDKNLPDVHSMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTF
# GANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIVAGDNMNRVISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGG
# GAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
#      C:   1 
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 4
#      C:   2 (250025_250025,637779_637779)
#      P:   2 
CWNJ01000584    AUGUSTUS    transcript  665 1690    0.93    -   .   g412.t3
CWNJ01000584    AUGUSTUS    stop_codon  665 667 .   -   0   transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584    AUGUSTUS    single  665 1690    0.93    -   0   transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584    AUGUSTUS    CDS 668 1690    0.93    -   0   transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584    AUGUSTUS    start_codon 1688    1690    .   -   0   transcript_id "g412.t3"; gene_id "g412";
# coding sequence = [atgccacgcccacatgataagtgtctctgctatgtcggcggcgacacccgaatccttgtcgttgatcggcattcctctc
# tcaaagacctttgttcacgtctgtcttgtaccctcctccatggaaggcccttcaacctcaagtaccagctacccaatgaagatctcgacaatctgata
# tcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaaaacctgcacgtttgaggctttttct
# attcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttcgtggatgctcttaacaactctggga
# ttctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtgattctagcaacaatttggaggctcag
# gctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcacctatggtggagaacagttcctcata
# cggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagtaggctgcagcaagagcagaggcctg
# ggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtactttgtctgctcctatgccatcaattcct
# actacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacgagagattagatcagggagcacctgc
# tggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggtggaggctttggagctggtggcggtt
# ttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [MPRPHDKCLCYVGGDTRILVVDRHSSLKDLCSRLSCTLLHGRPFNLKYQLPNEDLDNLISVSTDEDLDNMIEEHDRIT
# AAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSATVGCLVNLDGVLASDSSNNLEAQAAESLADNTKQDKNLPDVH
# SMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTFGANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIVAGDNMNR
# VISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGGGAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
#      C:   1 
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 5
#      C:   2 (250025_250025,637779_637779)
#      P:   3 
# end gene g412
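To pull such a gene entry out of a large augustus.hints.gff without searching by hand, something like the following sketch may help. It relies only on the `# start gene ...` and `# end gene ...` comment lines visible in the excerpt above; `gene_block` is a hypothetical helper name.

```python
def gene_block(lines, gene_id):
    """Return the lines from '# start gene <id>' through '# end gene <id>'.

    Returns an empty list if the start marker is not found.
    """
    start = "# start gene %s" % gene_id
    end = "# end gene %s" % gene_id
    block = []
    inside = False
    for line in lines:
        text = line.strip()
        if text == start:
            inside = True
        if inside:
            block.append(line)
            if text == end:
                break
    return block


demo = [
    "# start gene g410\n",
    "# end gene g410\n",
    "# start gene g411\n",
    "CWNJ01000583\tAUGUSTUS\tgene\t1\t504\t0.56\t+\t.\tg411\n",
    "# end gene g411\n",
    "# start gene g412\n",
]
print(len(gene_block(demo, "g411")))  # prints 3
```

Because the marker lines are compared in full, asking for g41 does not accidentally match g410 or g411.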

Gaius-Augustus / GALBA, issue #38