gpertea / gffcompare

classify, merge, tracking and annotation of GFF files by comparing to a reference annotation GFF
MIT License

How to keep the CDS info for each transcript? #73

Open zmz1988 opened 2 years ago

zmz1988 commented 2 years ago

Hi, I'm using gffcompare to generate a combined transcripts file from multiple sources. I'm wondering whether we could keep the CDS entries from each source in the combined.gtf file? It's probably not difficult, since gffcompare already grabs the exon info for each transcript from its source. Do you know whether we could do that?

Thanks!

XK4959 commented 1 year ago

I also encountered this problem. Did you solve it?

gpertea commented 11 months ago

There is a complication here that I can see: when multiple sources do not agree on the CDS for otherwise structurally compatible transcripts, which CDS should be used in the output? I think the resolution here is to add an option to prevent merging of transcripts with an identical intron chain unless their CDS is also identical.

This may have some implications in other parts of the code where transcript merging relies on intron chain identity. I think one solution is to treat this CDS-matching option like an alternative to the --strict-match option.
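
To make the proposed rule concrete, here is a minimal Python sketch of the idea. It is not gffcompare's actual code (gffcompare is written in C++), and the `cds_match` flag, the dictionary layout, and the function names are all hypothetical, chosen only for illustration. It assumes each transcript carries 1-based inclusive exon coordinates and an optional CDS span, and it ignores details such as single-exon transcripts and fuzzy terminal exon boundaries.

```python
from collections import defaultdict

def intron_chain(exons):
    """Return the intron chain as a tuple of (start, end) pairs (1-based, inclusive)."""
    exons = sorted(exons)
    return tuple(
        (exons[i][1] + 1, exons[i + 1][0] - 1) for i in range(len(exons) - 1)
    )

def merge_key(tx, cds_match=False):
    """Build the key that decides which transcripts get merged.

    cds_match is a hypothetical flag, not an existing gffcompare option.
    """
    key = (tx["chrom"], tx["strand"], intron_chain(tx["exons"]))
    if cds_match:
        # Require the CDS span to agree as well; None means "no CDS annotated".
        key += (tx.get("cds"),)
    return key

def group_transcripts(transcripts, cds_match=False):
    """Group transcript IDs that would be merged under the chosen rule."""
    groups = defaultdict(list)
    for tx in transcripts:
        groups[merge_key(tx, cds_match)].append(tx["id"])
    return list(groups.values())

# Made-up example: three sources share one intron chain, but srcC disagrees on the CDS.
transcripts = [
    {"id": "srcA.1", "chrom": "chr1", "strand": "+",
     "exons": [(100, 200), (300, 400)], "cds": (150, 350)},
    {"id": "srcB.1", "chrom": "chr1", "strand": "+",
     "exons": [(90, 200), (300, 420)], "cds": (150, 350)},
    {"id": "srcC.1", "chrom": "chr1", "strand": "+",
     "exons": [(100, 200), (300, 400)], "cds": (180, 350)},
]

print(group_transcripts(transcripts))
# [['srcA.1', 'srcB.1', 'srcC.1']]
print(group_transcripts(transcripts, cds_match=True))
# [['srcA.1', 'srcB.1'], ['srcC.1']]
```

With intron-chain matching alone, all three transcripts collapse into one group and the output would have to pick a single CDS. With the hypothetical CDS-matching rule, srcC.1 stays separate, so each merged transcript has one unambiguous CDS to report.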