Open biozzq opened 2 years ago
RNA-SeQC requires a GTF in the format specified at https://www.gencodegenes.org/pages/data_format.html, with a gene > transcript > exon hierarchy in the feature type
column (additional features such as CDS are also supported). Your GTF is missing gene features; it has only transcript and exon features.
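A quick way to confirm which feature types a GTF actually contains is to tally the third column. The following is only a minimal sketch: `count_feature_types` is a hypothetical helper, not part of RNA-SeQC, and the sample records are invented for illustration.

```python
from collections import Counter


def count_feature_types(lines):
    """Count the values in the third (feature type) column of GTF lines."""
    counts = Counter()
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns.
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


# Invented sample records resembling a GTF with no gene features.
sample = [
    'chr1\tsrc\ttranscript\t100\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t300\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
counts = count_feature_types(sample)
missing = [f for f in ("gene", "transcript", "exon") if counts[f] == 0]
print(missing)
```

For a real file, the same function can be fed an open file handle, e.g. `count_feature_types(open("out.gtf"))`.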
Is there a tool to convert a GTF to the required format? I am also having issues with that. Thank you in advance.
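One possible workaround, sketched here under several assumptions, is to synthesize the missing gene records from the existing ones. The hypothetical helper `add_gene_records` assumes that every record carries a `gene_id` attribute, that the input has no gene records yet, and that a gene spans from the smallest start to the largest end of its records. It is not a tool from the RNA-SeQC project, and it does not add any other attributes that `collapse_annotation.py` may expect.

```python
import re


def add_gene_records(lines):
    """Return GTF lines with one synthesized 'gene' record per gene_id.

    Assumption: the input contains no gene records; records are grouped
    by gene_id in order of first appearance.
    """
    gene_id_re = re.compile(r'gene_id "([^"]+)"')
    spans = {}    # gene_id -> [chrom, source, start, end, strand]
    members = {}  # gene_id -> original records belonging to that gene
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        match = gene_id_re.search(fields[8])
        if match is None:
            continue
        gid = match.group(1)
        start, end = int(fields[3]), int(fields[4])
        if gid not in spans:
            spans[gid] = [fields[0], fields[1], start, end, fields[6]]
            members[gid] = []
        else:
            # Widen the gene span to cover this record.
            spans[gid][2] = min(spans[gid][2], start)
            spans[gid][3] = max(spans[gid][3], end)
        members[gid].append(line)

    out = []
    for gid, (chrom, source, start, end, strand) in spans.items():
        attrs = f'gene_id "{gid}";'
        out.append("\t".join(
            [chrom, source, "gene", str(start), str(end), ".", strand, ".", attrs]
        ))
        out.extend(members[gid])
    return out


# Invented sample records for illustration.
sample_records = [
    'chr1\tsrc\ttranscript\t100\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t300\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
]
result = add_gene_records(sample_records)
print(result[0])
```

Grouping records by gene keeps each synthesized gene line directly above its transcripts and exons; whether the output then satisfies all other requirements of the collapse script would still need to be checked.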
Dear all,
To prepare the GTF file used in rnaseqc, I first converted the GFF file to a GTF file using the following command: `gffread-0.12.7.Linux_x86_64/gffread -T -o out.gtf input.gff`. However, it gave me an error when running `collapse_annotation.py out.gtf collapse.gtf`.
Based on the above error message, I added gene_biotype and transcript_biotype information to the end of each line: `perl -e 'while(<>){chomp; print $_," gene_biotype \"protein_coding\"; transcript_biotype \"protein_coding\";\n"}' out.gtf >processed.gtf`
Finally, when running `collapse_annotation.py processed.gtf collapse.gtf`, another error occurred. I attached the processed.gtf here. How should this be handled? processed.zip
Thank you in advance. Best wishes, Zheng zhuqing