I noticed some abnormalities in the results of taco_refcomp (assembly.refcomp.gtf).
If I understand correctly, the category column should define what the called transcript is, right? If so, I notice several transcripts that were categorised as "lncrna" despite having a cpat_coding_prob of >0.999.
As I understand it, ref_gene_type should be the gene_type or gene_biotype of the annotated transcript/gene. For example, in one of the transcripts above, the category_relative_detail is intronic_same_strand inside a protein_coding gene. However, the ref_gene_type for that transcript is defined as "lincRNA".
I saw this on a few transcripts (not only one), and I have also checked the GTF file I used for the comparison. Am I missing something here?
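In case it helps to reproduce this, here is a minimal sketch of how the affected records could be listed. It assumes that category, cpat_coding_prob and transcript_id appear in the ninth GTF column as quoted `key "value";` pairs; the helper names `parse_attributes` and `suspicious_records` are hypothetical and not part of TACO.

```python
import re

# Assumption: attributes are written as key "value"; pairs in column 9.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn a GTF attribute column into a dict of key -> value strings."""
    return dict(ATTR_RE.findall(field))


def suspicious_records(lines, prob_cutoff=0.999):
    """Yield attributes of records labelled lncrna with a high coding probability.

    Each transcript_id is reported once, even if several rows carry it.
    """
    seen = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        attrs = parse_attributes(cols[8])
        try:
            prob = float(attrs.get("cpat_coding_prob", "nan"))
        except ValueError:
            continue
        # A missing probability becomes NaN, which never exceeds the cutoff.
        if attrs.get("category") == "lncrna" and prob > prob_cutoff:
            tid = attrs.get("transcript_id")
            if tid in seen:
                continue
            seen.add(tid)
            yield attrs


def report(path):
    """Print the flagged records of a refcomp GTF, e.g. assembly.refcomp.gtf."""
    with open(path) as handle:
        for attrs in suspicious_records(handle):
            print(attrs.get("transcript_id"),
                  attrs.get("ref_gene_type"),
                  attrs.get("category_relative_detail"))
```

Calling `report("assembly.refcomp.gtf")` would print the transcript ID, ref_gene_type and category_relative_detail of each flagged record, which makes it easy to compare them against the reference GTF.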