gpertea / gffcompare

classify, merge, tracking and annotation of GFF files by comparing to a reference annotation GFF
MIT License

GFFCompare ".tracking" file not containing all transcript IDs of all the provided annotation files #85

Open RobAlbn opened 7 months ago

RobAlbn commented 7 months ago

Hi! I am running GFFCompare on a list of GTF annotation files, without a reference annotation. My goal is to extract information from the ".tracking" output file. As explained in the documentation, each column after the 4th in this file corresponds to one of the provided annotation files.

The issue is the following. In each of these columns, some transcript IDs from the corresponding annotation file are missing. To understand why, I focused on one column and its annotation file, and split that annotation into two new files. The first is a "reduced annotation": the starting annotation with the records of the transcripts missing from the ".tracking" file removed. The second contains only the records of the transcripts missing from the ".tracking" file.
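For anyone wanting to reproduce this check, here is a minimal Python sketch of how the missing IDs could be found. It rests on assumptions that are not stated in this post: that each per-file cell in ".tracking" looks like `qN:gene_id|transcript_id|...`, that "-" marks an absent transcript, that several transcripts from the same file may be comma-separated within one cell, and that IDs contain neither "|" nor ",". The function names are hypothetical.

```python
import re

# Matches the transcript_id attribute in column 9 of a GTF record.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(gtf_lines):
    """Return the set of transcript IDs found in GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def tracking_ids(tracking_lines, sample_index):
    """Return transcript IDs listed in one per-file column of a .tracking file.

    sample_index is 0-based over the input files, so index 0 is column 5.
    Assumed cell layout: qN:gene_id|transcript_id|... (comma-separated if
    several transcripts of the same file share a row), or "-" if absent.
    """
    ids = set()
    for line in tracking_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= 4 + sample_index:
            continue
        cell = fields[4 + sample_index]
        if cell == "-":
            continue
        cell = re.sub(r"^q\d+:", "", cell)  # drop the "qN:" prefix
        for entry in cell.split(","):
            parts = entry.split("|")
            if len(parts) >= 2:
                ids.add(parts[1])
    return ids


def missing_ids(gtf_lines, tracking_lines, sample_index):
    """Transcript IDs present in the GTF but absent from its tracking column."""
    return gtf_transcript_ids(gtf_lines) - tracking_ids(tracking_lines, sample_index)
```

With both files opened as text, `missing_ids(gtf_file, tracking_file, 0)` would give the IDs to move into the second annotation file for the first input.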

I then compared the annotation file of missing transcripts against the reduced annotation with GFFCompare, specifying the reduced annotation as the reference with -r, and inspected the class codes in the ".tmap" output file. Most of the transcripts missing from the ".tracking" file have class code "=", so they are duplicates of transcripts already in the annotation, and it makes sense that they are removed from the ".tracking" file. However, some of the missing transcripts have class codes "c" and "k". These "c" and "k" transcripts are all monoexonic.
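The class-code inspection can be sketched in Python as below. This assumes the ".tmap" file is tab-separated with a header line that includes columns named `class_code` and `num_exons`; the function name is hypothetical.

```python
from collections import Counter


def tally_class_codes(tmap_lines):
    """Count query transcripts per (class code, 'mono' or 'multi') pair.

    Assumes a tab-separated .tmap file whose first line is a header
    containing 'class_code' and 'num_exons' columns.
    """
    counts = Counter()
    column = None
    for line in tmap_lines:
        fields = line.rstrip("\n").split("\t")
        if column is None:
            # First line: map header names to column positions.
            column = {name: i for i, name in enumerate(fields)}
            continue
        if len(fields) < len(column):
            continue  # skip blank or truncated lines
        code = fields[column["class_code"]]
        exons = int(fields[column["num_exons"]])
        counts[(code, "mono" if exons == 1 else "multi")] += 1
    return counts
```

Printing the resulting counter shows at a glance whether the non-"=" codes are confined to single-exon transcripts.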

What could be the reason that these monoexonic "c" and "k" transcripts are not present in the ".tracking" file? To my understanding, the ".tracking" file should contain all transcripts from all annotation files (except for duplicates, of course). Why is this apparently not the case? Thank you very much for your help and support.