ncbi / TPMCalculator

TPMCalculator quantifies mRNA abundance directly from the alignments by parsing BAM files

Key ID for gene name was not found on GTF line #78

Closed Liyong-Zhang closed 2 years ago

Liyong-Zhang commented 2 years ago

Hello,

I am using tpmcalculator (version 0.0.4) in a conda env with a .gff3 annotation file. After running it a few times with different -k parameters (first the default, then ID, then Parent), I got the same error message each time: "Key gene_id/ID/Parent for gene name was not found on GTF line. Error processing GTF line at Chromosome level:"

The parameters are as follows:

tpmcalculator \
  -g data/annotation/Cs_genes_v2_annot.gff3 \
  -d $input1_dir \
  -b Aligned.sortedByCoord.out.bam \
  -p \
  -k "Parent"

The first few lines of my .gff3 annotation file are:

Chr1 AAFC_NRC gene 1 6504 . - . ID=Csa01g001000;Name=Csa01g001000;Note=methyl-CPG-binding domain 9

Chr1 AAFC_NRC gene 1 6504 . - . ID=Csa01g001000;Name=Csa01g001000

Chr1 AAFC_NRC mRNA 1 6504 . - . ID=Csa01g001000.1;Name=Csa01g001000.1;Parent=Csa01g001000;Note=methyl-CPG-binding domain 9

Chr1 AAFC_NRC five_prime_UTR 6380 6504 . - . ID=Csa01g001000.1.utr5p1;Parent=Csa01g001000.1

Chr1 AAFC_NRC exon 5865 6504 . - . ID=Csa01g001000.1.exon1;Parent=Csa01g001000.1

Does TPMCalculator not work with .gff3 annotation files, or did I get some setting wrong when running the program?

Thank you in advance.

r78v10a07 commented 2 years ago

Hi, TPMCalculator cannot read GFF files. You need to convert the GFF to GTF. Have a look at this thread on how to convert GFF to GTF: https://www.biostars.org/p/45791/
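To illustrate what such a conversion involves, here is a minimal Python sketch. It is not taken from the linked thread or from TPMCalculator; the function name gff3_to_gtf_lines is hypothetical, and it assumes a simple tab-separated GFF3 file in which each gene record precedes its mRNA records and each mRNA precedes its child features, as in the excerpt above. A dedicated converter is likely a safer choice for real annotation files.

```python
def gff3_to_gtf_lines(gff3_lines):
    """Convert simple gene -> mRNA -> child GFF3 records to GTF-style lines.

    Assumption: parents appear before their children, and column 9 uses
    GFF3 'key=value;key=value' attributes.
    """
    transcript_to_gene = {}
    out = []
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        feature_type = cols[2]
        if feature_type == "gene":
            gene_id, transcript_id = attrs["ID"], None
        elif feature_type in ("mRNA", "transcript"):
            gene_id, transcript_id = attrs["Parent"], attrs["ID"]
            transcript_to_gene[transcript_id] = gene_id
        else:
            # exon, CDS, UTR, ...: Parent is the transcript ID
            transcript_id = attrs.get("Parent")
            gene_id = transcript_to_gene.get(transcript_id)
            if gene_id is None:
                continue  # parent transcript not seen; skipped in this sketch
        gtf_attrs = f'gene_id "{gene_id}";'
        if transcript_id is not None:
            gtf_attrs += f' transcript_id "{transcript_id}";'
        out.append("\t".join(cols[:8] + [gtf_attrs]))
    return out


if __name__ == "__main__":
    example = [
        "Chr1\tAAFC_NRC\tgene\t1\t6504\t.\t-\t.\tID=Csa01g001000;Name=Csa01g001000",
        "Chr1\tAAFC_NRC\tmRNA\t1\t6504\t.\t-\t.\tID=Csa01g001000.1;Parent=Csa01g001000",
        "Chr1\tAAFC_NRC\texon\t5865\t6504\t.\t-\t.\tID=Csa01g001000.1.exon1;Parent=Csa01g001000.1",
    ]
    for gtf_line in gff3_to_gtf_lines(example):
        print(gtf_line)
```

The key point is that GTF expects every feature line to carry a gene_id (and, below the gene level, a transcript_id) as space-separated, double-quoted attributes, whereas GFF3 only links child features to their parents through Parent=.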

Liyong-Zhang commented 2 years ago

Hello,

Sounds good. I will convert the file to GTF first before running tpmcalculator. Thanks.

sagarutturkar commented 1 year ago

I have a similar issue. I downloaded a GFF file from here and then converted it to GTF format using the AGAT tool:

agat_convert_sp_gff2gtf.pl --gff C_auris_B11221_features.gff -o C_auris_B11221.gtf

A snapshot of the GTF file is attached. I get the following error:

Reading GTF file ...
Key gene for gene name was not found on GTF line.
Error processing GTF line at Chromosome level:

PGLS01000002_C_auris_B11221     CGD     exon    1330    2913    .       +       .       ID "CJI97_001076-T-E1"; Parent "CJI97_001076-T"; gene_id CJI97_001076; transcript_id "CJI97_001076-T"

See attached: test.gtf.txt

Do you have any suggestions to make this work?

Thanks
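One way to narrow down an error like this is to inspect the attribute column of the reported line. In the line quoted above, the error message names the key "gene" while the line carries gene_id, and the gene_id value is not double-quoted although the other values are. Whether either of these is what TPMCalculator objects to is not confirmed in this thread. The Python sketch below only reports what a line contains; the function names parse_gtf_attributes and check_gtf_line are hypothetical and nothing here reflects TPMCalculator's own parser.

```python
def parse_gtf_attributes(attr_column):
    """Return {key: (value, was_quoted)} for a GTF attribute column.

    Assumption: values do not contain ';' characters, so a plain split works.
    """
    attrs = {}
    for field in attr_column.split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, raw = field.partition(" ")
        raw = raw.strip()
        quoted = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
        attrs[key] = (raw.strip('"'), quoted)
    return attrs


def check_gtf_line(line, key="gene_id"):
    """Describe whether `key` is present and double-quoted on one GTF line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return f"expected 9 tab-separated columns, found {len(cols)}"
    attrs = parse_gtf_attributes(cols[8])
    if key not in attrs:
        return f"key {key!r} missing; keys present: {sorted(attrs)}"
    value, quoted = attrs[key]
    if not quoted:
        return f"key {key!r} present but value {value!r} is not double-quoted"
    return "ok"


if __name__ == "__main__":
    # The line from the error report, with tabs restored between columns.
    reported = (
        "PGLS01000002_C_auris_B11221\tCGD\texon\t1330\t2913\t.\t+\t.\t"
        'ID "CJI97_001076-T-E1"; Parent "CJI97_001076-T"; '
        'gene_id CJI97_001076; transcript_id "CJI97_001076-T"'
    )
    for key in ("gene", "gene_id", "transcript_id"):
        print(key, "->", check_gtf_line(reported, key))
```

Running a check like this over the whole file, with the same key that is passed to -k, shows whether every line has that key in the expected form before the file is handed to the tool.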