Open wangjie07070910 opened 11 months ago
Could you provide your augustus.hints.gff? (Send me a link or the file via email to katharina.hoff at uni-greifswald.de.) I will then look into it.
I hope this problem is solved by commit https://github.com/Gaius-Augustus/GALBA/commit/d8aaf4b93738da1a9afaa1035dfbd97bd18e9227
Hi,
I have the same problem with the latest galba.sif file.
less errors/gtf2gff.augustus.hints.gtf.stderr
In transcript g706.t1 two UTR/CDS features are overlapping. Not allowed by definition. at /opt/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 128068.
Is there a workaround?
Thank you in advance
Best regards
Kristian
Hi all,
I also ran into the same issue with 2 of the genomes that I tried to annotate using GALBA:
In transcript g411.t1 two UTR/CDS features are overlapping. Not allowed by definition. at /opt/Augustus/scripts/gtf2gff.pl line 182,
For 4 other genomes it runs just fine with exactly the same settings and protein input. I'm also wondering whether there is a workaround, e.g. removing the offending transcript from the augustus.hints.gff file.
Best Regards,
Joel
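A minimal sketch of the workaround asked about above, assuming the AUGUSTUS output wraps each gene in `# start gene <id>` / `# end gene <id>` comment lines (as in the excerpt posted later in this thread). The function name `drop_genes` is hypothetical, and whether GALBA will pick up an edited augustus.hints.gff when rerun is not confirmed anywhere in this thread.

```python
import re

START_RE = re.compile(r"# start gene (\S+)")
END_RE = re.compile(r"# end gene (\S+)")


def drop_genes(lines, gene_ids):
    """Yield all lines except the '# start gene X' ... '# end gene X'
    blocks whose gene ID X is listed in gene_ids."""
    skipping = False
    for line in lines:
        start = START_RE.match(line)
        if start and start.group(1) in gene_ids:
            skipping = True  # entering a block that should be removed
        if not skipping:
            yield line
        end = END_RE.match(line)
        if skipping and end and end.group(1) in gene_ids:
            skipping = False  # the block's closing line is dropped as well


# Hypothetical usage (file names are assumptions; work on a copy):
# with open("augustus.hints.gff") as src, open("augustus.hints.filtered.gff", "w") as dst:
#     dst.writelines(drop_genes(src, {"g411"}))
```

Removing the whole gene block rather than single feature lines keeps the gene and transcript lines consistent with the remaining features.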
Did you pull the singularity image within the last 3 months?
Hi Katharina,
Thanks for your quick reply. I double-checked and got this information on the build:
$ singularity inspect --labels galba.sif
org.label-schema.build-arch: amd64
org.label-schema.build-date: Wednesday_15_May_2024_13:19:23_CEST
org.label-schema.schema-version: 1.0
org.label-schema.usage.singularity.deffile.bootstrap: docker
org.label-schema.usage.singularity.deffile.from: katharinahoff/galba-notebook:latest
org.label-schema.usage.singularity.version: 3.8.3
Ok, then it’s an open problem. Thank you for clarifying!
Dear Katharina,
Thanks for looking into it. In case it helps, I located the offending gene in the augustus.hints.gff file and copied the information for the 2 adjacent genes as well.
# start gene g410
CWNJ01000582 AUGUSTUS gene 76 988 0.44 + . g410
CWNJ01000582 AUGUSTUS transcript 76 988 0.44 + . g410.t1
CWNJ01000582 AUGUSTUS start_codon 76 78 . + 0 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS initial 76 389 0.48 + 0 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS internal 674 829 0.8 + 1 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS terminal 937 988 0.81 + 1 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS intron 390 673 0.85 + . transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS intron 830 936 0.8 + . transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS CDS 76 389 0.48 + 0 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS CDS 674 829 0.8 + 1 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS CDS 937 985 0.81 + 1 transcript_id "g410.t1"; gene_id "g410";
CWNJ01000582 AUGUSTUS stop_codon 986 988 . + 0 transcript_id "g410.t1"; gene_id "g410";
# coding sequence = [atgggttgttcatttgcagatggaatatacatgatggaagttgaccgcattctaagacctggtggttattgggtgcttt
# cgggtcctcctattggttggaaggttcattacaaagcctggcagcgatctaaggaggaccttcaggaagaacagaataagattgaagagactgctaag
# ctcctttgctgggagaaggtctctgagaagaatgaaattgccatttggcaaaagagggtagactctgtttcatgtcgtcgtagacaaatagattccag
# tgtaaaattctgcaaatcaagggatgttgatgatgtctggtataagaaaatggaggcctgcattactcctggtcctaaaggttctggtcataatctga
# aaccttttccagagaggctatatgcaatccctcctagaattgctagtggctctgctcctggagtttctgtggagacataccaggatgacaacaagaac
# tattcaatctcccaagttatgggtcatgaatgttgtgccaactattgctga]
# protein sequence = [MGCSFADGIYMMEVDRILRPGGYWVLSGPPIGWKVHYKAWQRSKEDLQEEQNKIEETAKLLCWEKVSEKNEIAIWQKR
# VDSVSCRRRQIDSSVKFCKSRDVDDVWYKKMEACITPGPKGSGHNLKPFPERLYAIPPRIASGSAPGVSVETYQDDNKNYSISQVMGHECCANYC]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 0
# CDS exons: 0/3
# CDS introns: 0/2
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 1
# RM: 1
# end gene g410
# start gene g411
CWNJ01000583 AUGUSTUS gene 1 504 0.56 + . g411
CWNJ01000583 AUGUSTUS transcript 1 504 0.56 + . g411.t1
CWNJ01000583 AUGUSTUS terminal 1 504 0.56 + 0 transcript_id "g411.t1"; gene_id "g411";
CWNJ01000583 AUGUSTUS CDS 1 501 0.56 + 0 transcript_id "g411.t1"; gene_id "g411";
CWNJ01000583 AUGUSTUS stop_codon 502 504 . + 0 transcript_id "g411.t1"; gene_id "g411";
# coding sequence = [acaagtgaagctgtgaatgcatactattcagctgctttgatgggtatgtcatatggtgacagagaccttgttgcaattg
# gatcaacactgttagcattggaaatgaaagcagcacaaacatggtggcatgtgaaagatggggacagtaacatgtatggaaaagacttcacaaaggaa
# aacagaatagtgggaatcctgtgggctaacaagagagatagtgcactatggtgggcctcagctgagtgcagagagtgtaggcttagcattcagctatt
# gcctttgttgcctatttctgaagaactattttctaatgtggagtatgtgaagaagcttgtggaatggacagagcctgctactgaagaaggatggaagg
# gatttttgtatgcattggaagggatttatgataaagaggatgctttggagaagatcagaaagttgacagaatttgatgatggaaactcattcacaaat
# ctcttgtggtggattcatagcagagggggttga]
# protein sequence = [TSEAVNAYYSAALMGMSYGDRDLVAIGSTLLALEMKAAQTWWHVKDGDSNMYGKDFTKENRIVGILWANKRDSALWWA
# SAECRECRLSIQLLPLLPISEELFSNVEYVKKLVEWTEPATEEGWKGFLYALEGIYDKEDALEKIRKLTEFDDGNSFTNLLWWIHSRGG]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 0
# CDS exons: 0/1
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 1
# RM: 1
# end gene g411
# start gene g412
CWNJ01000584 AUGUSTUS gene 665 1711 2.92 - . g412
CWNJ01000584 AUGUSTUS transcript 665 1711 1 - . g412.t1
CWNJ01000584 AUGUSTUS stop_codon 665 667 . - 0 transcript_id "g412.t1"; gene_id "g412";
CWNJ01000584 AUGUSTUS terminal 665 1711 1 - 0 transcript_id "g412.t1"; gene_id "g412";
CWNJ01000584 AUGUSTUS CDS 668 1711 1 - 0 transcript_id "g412.t1"; gene_id "g412";
# coding sequence = [tgcagctatggcggccacataatgccacgcccacatgataagtgtctctgctatgtcggcggcgacacccgaatccttg
# tcgttgatcggcattcctctctcaaagacctttgttcacgtctgtcttgtaccctcctccatggaaggcccttcaacctcaagtaccagctacccaat
# gaagatctcgacaatctgatatcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaaaacc
# tgcacgtttgaggctttttctattcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttcgtgg
# atgctcttaacaactctgggattctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtgattct
# agcaacaatttggaggctcaggctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcacctat
# ggtggagaacagttcctcatacggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagtaggc
# tgcagcaagagcagaggcctgggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtactttgtct
# gctcctatgccatcaattcctactacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacgagag
# attagatcagggagcacctgctggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggtggag
# gctttggagctggtggcggttttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [CSYGGHIMPRPHDKCLCYVGGDTRILVVDRHSSLKDLCSRLSCTLLHGRPFNLKYQLPNEDLDNLISVSTDEDLDNMI
# EEHDRITAAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSATVGCLVNLDGVLASDSSNNLEAQAAESLADNTKQD
# KNLPDVHSMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTFGANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIV
# AGDNMNRVISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGGGAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
# C: 1
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 1
# C: 1 (250025_250025)
# incompatible hint groups: 5
# C: 1 (637779_637779)
# P: 3
# RM: 1
CWNJ01000584 AUGUSTUS transcript 665 1519 0.99 - . g412.t2
CWNJ01000584 AUGUSTUS stop_codon 665 667 . - 0 transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584 AUGUSTUS single 665 1519 0.99 - 0 transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584 AUGUSTUS CDS 668 1519 0.99 - 0 transcript_id "g412.t2"; gene_id "g412";
CWNJ01000584 AUGUSTUS start_codon 1517 1519 . - 0 transcript_id "g412.t2"; gene_id "g412";
# coding sequence = [ctgatatcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaa
# aacctgcacgtttgaggctttttctattcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttc
# gtggatgctcttaacaactctgggattctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtga
# ttctagcaacaatttggaggctcaggctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcac
# ctatggtggagaacagttcctcatacggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagt
# aggctgcagcaagagcagaggcctgggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtacttt
# gtctgctcctatgccatcaattcctactacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacg
# agagattagatcagggagcacctgctggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggt
# ggaggctttggagctggtggcggttttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [LISVSTDEDLDNMIEEHDRITAAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSAT
# VGCLVNLDGVLASDSSNNLEAQAAESLADNTKQDKNLPDVHSMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTF
# GANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIVAGDNMNRVISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGG
# GAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
# C: 1
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 4
# C: 2 (250025_250025,637779_637779)
# P: 2
CWNJ01000584 AUGUSTUS transcript 665 1690 0.93 - . g412.t3
CWNJ01000584 AUGUSTUS stop_codon 665 667 . - 0 transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584 AUGUSTUS single 665 1690 0.93 - 0 transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584 AUGUSTUS CDS 668 1690 0.93 - 0 transcript_id "g412.t3"; gene_id "g412";
CWNJ01000584 AUGUSTUS start_codon 1688 1690 . - 0 transcript_id "g412.t3"; gene_id "g412";
# coding sequence = [atgccacgcccacatgataagtgtctctgctatgtcggcggcgacacccgaatccttgtcgttgatcggcattcctctc
# tcaaagacctttgttcacgtctgtcttgtaccctcctccatggaaggcccttcaacctcaagtaccagctacccaatgaagatctcgacaatctgata
# tcagtttccaccgatgaagaccttgacaacatgattgaggagcatgatcgcatcactgcagctcatcctttaaaacctgcacgtttgaggctttttct
# attcttcgataagccagagactgcagtttcaatgggttctcttttggatgattcaaagtctgaaacttggttcgtggatgctcttaacaactctggga
# ttctcccaagggttgtttcagattctgccacagtgggttgtttggtgaaccttgatggagttcttgctagtgattctagcaacaatttggaggctcag
# gctgctgagtctctggctgataacactaaacaagataagaatttgcctgatgtgcattcaatgccaaactcacctatggtggagaacagttcctcata
# cggatcatcttcttcaaatccttcgatggccaatctgcctccaatgcggggtcgcgtcgacgagaatggtagtaggctgcagcaagagcagaggcctg
# ggatggaagagcagtttgctcaaatgacctttggtgcgaatgtgatgaaacaagatgatgggtatggtactttgtctgctcctatgccatcaattcct
# actacagttgtgacaatggcatcaccagcaattgttgctggtgataacatgaatcgggttatctcggatgacgagagattagatcagggagcacctgc
# tggatatagaatgccgcctttgccattgctgcctgtgcaaccaaggactattagtggtggttttggcggaggtggaggctttggagctggtggcggtt
# ttagtgctggcagtggcgccggatttggtggtggagctggatatggagctggcggtggccagtga]
# protein sequence = [MPRPHDKCLCYVGGDTRILVVDRHSSLKDLCSRLSCTLLHGRPFNLKYQLPNEDLDNLISVSTDEDLDNMIEEHDRIT
# AAHPLKPARLRLFLFFDKPETAVSMGSLLDDSKSETWFVDALNNSGILPRVVSDSATVGCLVNLDGVLASDSSNNLEAQAAESLADNTKQDKNLPDVH
# SMPNSPMVENSSSYGSSSSNPSMANLPPMRGRVDENGSRLQQEQRPGMEEQFAQMTFGANVMKQDDGYGTLSAPMPSIPTTVVTMASPAIVAGDNMNR
# VISDDERLDQGAPAGYRMPPLPLLPVQPRTISGGFGGGGGFGAGGGFSAGSGAGFGGGAGYGAGGGQ]
# Evidence for and against this transcript:
# % of transcript supported by hints (any source): 100
# CDS exons: 1/1
# C: 1
# CDS introns: 0/0
# 5'UTR exons and introns: 0/0
# 3'UTR exons and introns: 0/0
# hint groups fully obeyed: 0
# incompatible hint groups: 5
# C: 2 (250025_250025,637779_637779)
# P: 3
# end gene g412
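As a diagnostic aid, here is a rough sketch that groups CDS lines by `transcript_id` and reports any pairs whose coordinates overlap. It does not reproduce the check inside gtf2gff.pl, which is not shown in this thread. It assumes the real file is tab-separated with nine columns, and `overlapping_cds` is a hypothetical name. Because it groups by transcript ID across the whole file, it would also flag an ID that occurs more than once with overlapping coordinates. The excerpt above shows only a single CDS line for g411.t1.

```python
import re
from collections import defaultdict

TX_RE = re.compile(r'transcript_id "([^"]+)"')


def overlapping_cds(lines):
    """Return {transcript_id: [((s1, e1), (s2, e2)), ...]} listing
    pairs of CDS features of the same transcript that overlap."""
    cds = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue  # skip AUGUSTUS comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        m = TX_RE.search(cols[8])
        if m:
            cds[m.group(1)].append((int(cols[3]), int(cols[4])))

    hits = {}
    for tx, spans in cds.items():
        spans.sort()
        widest = spans[0]  # span with the largest end coordinate so far
        bad = []
        for cur in spans[1:]:
            if cur[0] <= widest[1]:  # starts before the previous span ends
                bad.append((widest, cur))
            if cur[1] > widest[1]:
                widest = cur
        if bad:
            hits[tx] = bad
    return hits
```

If this reports nothing for the transcript named in the error message, the overlap that gtf2gff.pl complains about presumably involves other feature types or other logic.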
Hello, I am trying this program. My command is as follows:
galba.pl --genome=${genome_file} --prot_seq=${protein_file} --threads 40
My error is as follows:
ERROR in file ~/software/GALBA/scripts/galba.pl at line 5340
Failed to execute: cat augustus.hints.gff | perl -ne 'if(m/\tAUGUSTUS\t/) {print $_;}' | perl ~/software/Augustus/scripts/gtf2gff.pl --printExon --out=augustus.hints.tmp.gtf 2> errors/gtf2gff.augustus.hints.gtf.stderr
The file gtf2gff.augustus.hints.gtf.stderr shows:
In transcript g86.t1 two UTR/CDS features are overlapping. Not allowed by definition. at ~/software/Augustus/scripts/gtf2gff.pl line 182, <STDIN> line 759036.
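To see which transcripts are named in the stderr file, something along these lines could be used. It is only a sketch: the message wording is taken from the errors quoted in this thread, the helper names are hypothetical, and deriving the gene ID assumes transcript IDs of the form `<gene>.t<N>` as in the AUGUSTUS excerpt above.

```python
import re

MSG_RE = re.compile(r"In transcript (\S+) two UTR/CDS features are overlapping")


def offending_transcripts(stderr_lines):
    """Return the transcript IDs named in gtf2gff.pl overlap messages,
    in order of appearance and without duplicates."""
    seen = []
    for line in stderr_lines:
        m = MSG_RE.search(line)
        if m and m.group(1) not in seen:
            seen.append(m.group(1))
    return seen


def gene_id(transcript_id):
    """Strip a trailing '.t<N>' suffix, e.g. 'g86.t1' -> 'g86'."""
    return re.sub(r"\.t\d+$", "", transcript_id)


# Hypothetical usage (path as reported in this thread):
# with open("errors/gtf2gff.augustus.hints.gtf.stderr") as fh:
#     for tx in offending_transcripts(fh):
#         print(tx, gene_id(tx))
```

The gene ID can then be used to find the corresponding `# start gene` block in augustus.hints.gff.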