marbl / CHM13

The complete sequence of a human genome

GTF format of the original GFF #73

Closed sum732 closed 1 year ago

sum732 commented 1 year ago

Hello,

Some tools, such as Pigeon and SQANTI3 for bulk RNA Iso-Seq, require a GTF file.

Could you please provide a GTF version of chm13.draft_v2.0.gene_annotation.gff3? Original link: https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/annotation/chm13.draft_v2.0.gene_annotation.gff3

I checked GTF files from other sources, such as https://projects.ensembl.org/hprc/ and the UCSC Table Browser. None of them has the same level of detail as the original GFF3. For example, I cannot find the following entry in any of the GTF files from other sources: chr1 CAT gene 97934895 97937928 . + . source_gene_common_name=MSTRG.282;source_gene=None;gene_biotype=StringTie;gene_id=CHM13_G0002360;gene_name=MSTRG.282;transcript_modes=exRef;ID=CHM13_G0002360;Name=MSTRG.282;source_transcript=N/A;alternative_source_transcripts=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;paralogy=N/A;unfiltered_paralogy=N/A;alignment_id=N/A;frameshift=N/A;exon_anotation_support=N/A;intron_annotation_support=N/A;transcript_class=N/A;valid_start=N/A;valid_stop=N/A;proper_orf=N/A;extra_paralog=False

I tried converting the GFF3 to GTF using AGAT and sorting the result, but Pigeon does not accept it. I tried a few other options, but none of them work. It would be great to have a GTF version of chm13.draft_v2.0.gene_annotation.gff3.

Many Thanks SM

diekhans commented 1 year ago

The gffread program has an option to convert to GTF.

https://github.com/gpertea/gffread
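For orientation, the main difference between the two formats is the ninth column: GFF3 stores key=value pairs, while GTF stores key "value"; pairs and expects gene_id and transcript_id on each record. The Python sketch below shows only that column rewrite. It is not a description of what gffread does internally, and the helper names parse_gff3_attrs and to_gtf_attrs are hypothetical, chosen for illustration.

```python
from urllib.parse import unquote


def parse_gff3_attrs(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field:
            continue  # tolerate a trailing semicolon
        key, _, value = field.partition("=")
        # GFF3 percent-encodes reserved characters in values
        attrs[key.strip()] = unquote(value.strip())
    return attrs


def to_gtf_attrs(attrs):
    """Render attributes GTF-style, placing gene_id and transcript_id first."""
    lead = [k for k in ("gene_id", "transcript_id") if k in attrs]
    rest = [k for k in attrs if k not in lead]
    return " ".join(f'{k} "{attrs[k]}";' for k in lead + rest)


# Shortened attribute string based on the NBPF14 records in this thread
example = "gene_name=NBPF14;gene_biotype=protein_coding;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224"
print(to_gtf_attrs(parse_gff3_attrs(example)))
```

Every GFF3 attribute is carried over here, which is the level of detail requested above. Whether a downstream tool such as Pigeon accepts the extra keys is a separate question that this sketch does not answer.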

sum732 commented 1 year ago

Hi @diekhans, Thanks for replying.

Indeed, that is one of the options, but most of the details are ignored. Example here

Also, the original GFF3 contains the following entries. Please note the START and END of the first three records and of the next two:

chr1    Liftoff transcript      146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;Parent=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-1;ID=LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1    Liftoff exon    146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=exon:LOFF_T0000224:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1    Liftoff three_prime_UTR 146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=three_prime_UTR:LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A

chr16   Liftoff transcript      13397099        13397303        .       -       .       gene_name=LINC02851;source_gene=ENSG00000229611.2;gene_biotype=lncRNA;transcript_biotype=lncRNA;source_transcript=ENST00000664463.1;Name=LINC02851;source_gene_common_name=LINC02851;extra_paralog=False;gene_id=LOFF_G0001003;Parent=LOFF_G0001003;transcript_id=LOFF_T0001232;transcript_name=LINC02851-1;ID=LOFF_T0001232;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr16   Liftoff exon    13397099        13397303        .       -       .       gene_name=LINC02851;source_gene=ENSG00000229611.2;gene_biotype=lncRNA;transcript_biotype=lncRNA;source_transcript=ENST00000664463.1;Name=LINC02851;source_gene_common_name=LINC02851;extra_paralog=False;gene_id=LOFF_G0001003;transcript_id=LOFF_T0001232;transcript_name=LINC02851-0;Parent=LOFF_T0001232;ID=exon:LOFF_T0001232:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A

Should they be merged? If so, what biotype should they be given? There may be other cases that need further attention, hence the request to the original authors to generate a GTF version as well.

Best Regards SM

diekhans commented 1 year ago

Any software we would use would have the same issues with conversion to GTF, and we don't have the resources to write another program.

I would suggest pulling the attributes you are interested in from the GFF3 and merging them back in later. The UCSC utility gff3ToGenePred can extract the attributes into an easy-to-parse format: gff3ToGenePred -attrsOut=some.attrs some.gff3 /dev/null
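A rough illustration of the extract-and-merge idea, written in plain Python rather than with gff3ToGenePred (the layout of its -attrsOut file is not reproduced here). The sketch assumes that transcript_id is the join key, that the GTF came from some converter that kept transcript_id, and that the attributes listed in WANTED are the ones of interest. The function names are hypothetical.

```python
# Attributes to carry over; adjust to whatever the downstream analysis needs.
WANTED = ("transcript_class", "transcript_modes", "source_transcript")


def collect_attrs(gff3_lines, wanted=WANTED):
    """Map transcript_id -> selected attributes from GFF3 transcript records."""
    table = {}
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "transcript":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        tid = attrs.get("transcript_id")
        if tid:
            table[tid] = {k: attrs[k] for k in wanted if k in attrs}
    return table


def merge_into_gtf(gtf_lines, table):
    """Append the saved attributes to GTF records with a matching transcript_id."""
    marker = 'transcript_id "'
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9:
            start = cols[8].find(marker)
            if start != -1:
                begin = start + len(marker)
                tid = cols[8][begin:cols[8].index('"', begin)]
                extra = " ".join(f'{k} "{v}";' for k, v in table.get(tid, {}).items())
                if extra:
                    cols[8] = cols[8].rstrip() + " " + extra
        # Comment lines and malformed records pass through unchanged
        yield "\t".join(cols)
```

Typical use would be to call collect_attrs on the open GFF3 file, then write each line produced by merge_into_gtf to a new GTF file.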

I don't understand the issue with your example. They are on different chromosomes and transcripts of different genes. Can you explain in more detail?

Note that future releases of the CHM13 gene annotations will be provided by Ensembl as part of the HPRC grant. If you need more information from them, please make a request to the Ensembl help desk.

There is an issue with the Ensembl HPRC GFF3 where some attributes get encoded in the description. I am talking with them about addressing this.

diekhans commented 1 year ago

It also appears that the gffread --table option can output the attributes in an easy-to-parse format.

sum732 commented 1 year ago

I don't understand the issue with your example. They are on different chromosomes and transcripts of different genes. Can you explain in more detail?

Hi @diekhans, thanks for replying. The examples above are two different cases with the same issue. Let's take the first: the records have the same start and end. Should they be collapsed, and if so, what should the biotype be?

chr1    Liftoff transcript      146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;Parent=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-1;ID=LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1    Liftoff exon    146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=exon:LOFF_T0000224:0;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A
chr1    Liftoff three_prime_UTR 146568094       146569221       .       -       .       gene_name=NBPF14;source_gene=ENSG00000270629.6;gene_biotype=protein_coding;transcript_biotype=protein_coding;source_transcript=ENST00000619423.4;Name=NBPF14;source_gene_common_name=NBPF14;extra_paralog=False;gene_id=LOFF_G0000157;transcript_id=LOFF_T0000224;transcript_name=NBPF14-0;Parent=LOFF_T0000224;ID=three_prime_UTR:LOFF_T0000224;alignment_id=N/A;alternative_source_transcripts=N/A;paralogy=N/A;unfiltered_paralogy=N/A;collapsed_gene_ids=N/A;collapsed_gene_names=N/A;frameshift=N/A;exon_annotation_support=N/A;intron_annotation_support=N/A;transcript_class=ortholog;transcript_modes=Liftoff;valid_start=N/A;valid_stop=N/A;proper_orf=N/A

diekhans commented 1 year ago

This is a correct representation of a bad gene annotation. It is a fragment of an exon of NBPF14 that was mapped to a different location due to a segmental duplication.

The type field is not the biotype; it is the type of the feature. We have a transcript, which consists of one exon, and the exon is all 3' UTR.

All of this is a truncated part of a gene annotation.
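To make the distinction concrete, here is a small Python sketch over the three chr1 records quoted above, reduced to the fields that matter. The tuple layout and the summarize function are assumptions for illustration. Column 3 (the feature type) differs per record, while the gene_biotype attribute is the same on all of them, and the exon lies entirely inside the three_prime_UTR interval.

```python
# (seqid, source, type, start, end, strand, transcript_id, gene_biotype)
RECORDS = [
    ("chr1", "Liftoff", "transcript", 146568094, 146569221, "-", "LOFF_T0000224", "protein_coding"),
    ("chr1", "Liftoff", "exon", 146568094, 146569221, "-", "LOFF_T0000224", "protein_coding"),
    ("chr1", "Liftoff", "three_prime_UTR", 146568094, 146569221, "-", "LOFF_T0000224", "protein_coding"),
]


def summarize(records):
    """Return feature types, distinct biotypes, and whether every exon is fully 3' UTR."""
    types = [r[2] for r in records]
    biotypes = {r[7] for r in records}
    exons = [(r[3], r[4]) for r in records if r[2] == "exon"]
    utrs = [(r[3], r[4]) for r in records if r[2] == "three_prime_UTR"]
    # An exon counts as "all UTR" if a single UTR interval contains it
    all_utr = all(any(u[0] <= e[0] and e[1] <= u[1] for u in utrs) for e in exons)
    return types, biotypes, all_utr


types, biotypes, all_utr = summarize(RECORDS)
print(types)     # feature types, one per record
print(biotypes)  # a single biotype shared by all records
print(all_utr)   # True when the exon is entirely 3' UTR
```

The records describe one transcript at three levels of detail, so there is nothing to collapse: they are a parent feature and its children, not duplicates.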

The UCSC browser also has gene annotations from RefSeq on CHM13.

My recent experience from looking at gene annotations on CHM13 from multiple sources is that they all have problems, especially with recently duplicated genes.

sum732 commented 1 year ago

I see, thanks!