Closed sum732 closed 1 year ago
The gffread program has an option to convert to GTF.
https://github.com/gpertea/gffread
sum732 @.***> writes:
Hello,
Some tools, such as Pigeon/SQANTI3 for bulk RNA Iso-Seq, require a GTF file.
Can you please provide a GTF version of chm13.draft_v2.0.gene_annotation.gff3? Original link: https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/annotation/chm13.draft_v2.0.gene_annotation.gff3
I checked other GTF versions, such as the one from https://projects.ensembl.org/hprc/ and the UCSC Table Browser. None of them has the same level of detail that is present in the original GFF3. For example, I cannot find the following entry in any of the GTF files from other sources:
chr1 CAT gene 97934895 97937928 . + . source_gene_common_name=MSTRG.282;source_gene=None;gene_biotype=StringTie;gene_id=CHM13_G0002360;gene_name=MSTRG.282;transcript_modes=exRef;ID=CHM13_G0002360;Name=MSTRG.282;source_transcript=N/A;alternative_source_transcripts=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;paralogy=N/A;unfiltered_paralogy=N/A;alignment_id=N/A;frameshift=N/A;exon_anotation_support=N/A;intron_annotation_support=N/A;transcript_class=N/A;valid_start=N/A;valid_stop=N/A;proper_orf=N/A;extra_paralog=False
I tried to convert the GFF3 to GTF using agat and sorted the result, but Pigeon does not accept it. I tried a few other options, but none of them work. It would be great to have a GTF version of the file chm13.draft_v2.0.gene_annotation.gff3.
Many thanks, SM
-- Reply to this email directly or view it on GitHub: https://github.com/marbl/CHM13/issues/73 You are receiving this because you are subscribed to this thread.
Hi @diekhans, thanks for replying.
Indeed, that is one of the options, but most of the details are ignored. Example here
Also, the original GFF3 contains the following entries; please notice the START and END of the first three and the next two:
chr1 Liftoff transcript 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;Parent=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-1;ID=LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1 Liftoff exon 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=exon:LOFF_T0000224:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1 Liftoff three_prime_UTR 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=three_prime_UTR:LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr16 Liftoff transcript 13397099 13397303 . - . gene_name=LINC02851;source_gene=ENSG00000229611.2;gene_biotype=lncRNA;transcript_biotype=lncRNA;source_transcript=ENST00000664463.1;Name=LINC02851;source_gene_common_name=LINC02851;extra_paralog=False;gene_id=LOFF_G0001003;Parent=LOFF_G0001003;transcript_id=LOFF_T0001232;transcript_name=LINC02851-1;ID=LOFF_T0001232;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr16 Liftoff exon 13397099 13397303 . - . gene_name=LINC02851;source_gene=ENSG00000229611.2;gene_biotype=lncRNA;transcript_biotype=lncRNA;source_transcript=ENST00000664463.1;Name=LINC02851;source_gene_common_name=LINC02851;extra_paralog=False;gene_id=LOFF_G0001003;transcript_id=LOFF_T0001232;transcript_name=LINC02851-0;Parent=LOFF_T0001232;ID=exon:LOFF_T0001232:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
Should they be merged? If so, what biotype should be given? What else might be there that needs further attention? Hence the request to the original authors to generate a GTF version as well.
Best Regards SM
Any software we would use would have the same issues with conversion to GTF, and we don't have the resources to write another program.
I would suggest pulling the attributes you are interested in from the GFF3 and merging them back in later. The UCSC utility gff3ToGenePred can extract the attributes into an easy-to-parse format:
gff3ToGenePred -attrsOut=some.attrs some.gff3 /dev/null
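The "pull the attributes out, merge them back in later" idea can also be sketched directly in Python. This is only an illustration, not the method anyone in this thread used: the function names are hypothetical, the input is assumed to be tab-separated GFF3/GTF, and the converted GTF is assumed to carry a transcript_id that matches the GFF3 transcript ID.

```python
import re
from urllib.parse import unquote


def parse_gff3_attrs(column9):
    """Parse a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def collect_attrs(gff3_lines, wanted, feature_type="transcript"):
    """Map each feature ID to the subset of its attributes named in `wanted`."""
    table = {}
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_gff3_attrs(cols[8])
        if "ID" in attrs:
            table[attrs["ID"]] = {k: attrs[k] for k in wanted if k in attrs}
    return table


def add_attrs_to_gtf_line(gtf_line, table):
    """Append saved attributes to one GTF line, matched on transcript_id."""
    cols = gtf_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return gtf_line.rstrip("\n")  # comments and malformed lines pass through
    match = re.search(r'transcript_id "([^"]+)"', cols[8])
    extras = table.get(match.group(1), {}) if match else {}
    present = set(re.findall(r'(\S+) "', cols[8]))  # keys already on the line
    merged = cols[8].rstrip()
    if merged and not merged.endswith(";"):
        merged += ";"
    for key, value in extras.items():
        if key not in present:
            merged += f' {key} "{value}";'
    cols[8] = merged
    return "\t".join(cols)
```

A typical use would be to build the table once from the GFF3 with collect_attrs, then pass every line of the converted GTF through add_attrs_to_gtf_line. Whether Pigeon or SQANTI3 accept the extra attributes is not something this sketch checks.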
I don't understand the issue with your example. They are on different chromosomes and transcripts of different genes. Can you explain in more detail?
Note that future releases of the CHM13 gene annotations will be provided by Ensembl as part of the HPRC grant. If you need more information from them, please make a request to the Ensembl help desk.
There is an issue with the Ensembl HPRC GFF3 where some attributes get encoded in the description. I am talking with them about addressing this.
It also appears that the gffread --table option can output the attributes in an easy-to-parse format.
I don't understand the issue with your example. They are on different chromosomes and transcripts of different genes. Can you explain in more detail? …
Hi @diekhans, thanks for replying. The examples above are two different things with the same issue. Let's take the first. They have the same start and end: should these be collapsed? And if so, what should the biotype be?
chr1 Liftoff transcript 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;Parent=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-1;ID=LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1 Liftoff exon 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=exon:LOFF_T0000224:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1 Liftoff three_prime_UTR 146568094 146569221 . - . gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=three_prime_UTR:LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
This is a correct representation of a bad gene annotation. It is a fragment of an exon of NBPF14 that was mapped to a different location due to a segmental duplication.
The type field is not the biotype; it is the type of the feature. We have a transcript, which consists of a single exon, and that exon is entirely 3' UTR.
All of this is a truncated part of a gene annotation.
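To make the distinction between feature type (column 3) and biotype (an attribute in column 9) concrete, here is a small Python sketch that groups child features under their parent transcript. It is an illustration only: the records are abbreviated copies of the three chr1 lines quoted in this thread, rewritten as tab-separated text, and the helper names are hypothetical. It also assumes each feature has a single Parent.

```python
from collections import defaultdict

# Abbreviated copies of the three chr1 records from this thread
# (most attributes dropped, columns tab-separated).
RECORDS = [
    "chr1\tLiftoff\ttranscript\t146568094\t146569221\t.\t-\t.\t"
    "ID=LOFF_T0000224;Parent=LOFF_G0000157;transcript_biotype=protein_coding",
    "chr1\tLiftoff\texon\t146568094\t146569221\t.\t-\t.\t"
    "ID=exon:LOFF_T0000224:0;Parent=LOFF_T0000224;transcript_biotype=protein_coding",
    "chr1\tLiftoff\tthree_prime_UTR\t146568094\t146569221\t.\t-\t.\t"
    "ID=three_prime_UTR:LOFF_T0000224;Parent=LOFF_T0000224;transcript_biotype=protein_coding",
]


def parse_attrs(column9):
    """Turn 'key=value;key=value' into a dict."""
    return dict(f.split("=", 1) for f in column9.strip().split(";") if "=" in f)


def summarize(gff3_lines):
    """Group child features (exon, UTR, ...) under their parent transcript."""
    transcripts = {}
    children = defaultdict(list)
    for line in gff3_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attrs(cols[8])
        if ftype == "transcript":
            transcripts[attrs["ID"]] = {
                "biotype": attrs.get("transcript_biotype"),  # from column 9
                "span": (start, end),
            }
        elif "Parent" in attrs:
            # Column 3 is the feature type, not a biotype.
            children[attrs["Parent"]].append((ftype, start, end))
    for tid, info in transcripts.items():
        info["children"] = children.get(tid, [])
    return transcripts


for tid, info in summarize(RECORDS).items():
    print(tid, info["biotype"], info["span"], [c[0] for c in info["children"]])
```

Read this way, the three lines are one transcript with one exon and one UTR sharing the same coordinates, so there is nothing to collapse and the biotype stays whatever transcript_biotype says.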
The UCSC browser also has gene annotations from RefSeq on CHM13.
My recent experience from looking at gene annotations on CHM13 from multiple sources is that they all have problems, especially with recently duplicated genes.
I see, thanks!