gpertea / gffcompare

classify, merge, tracking and annotation of GFF files by comparing to a reference annotation GFF
MIT License
"Warning: merging adjacent/overlapping segments" when using GFFCompare #86

Open RobAlbn opened 7 months ago

Hi gpertea! I am running GFFCompare on a list of GTF files. For some transcripts in one specific annotation file, GFFCompare prints this warning: "Warning: merging adjacent/overlapping segments (distance=4)". (Side note: the distance value ranges from 1 to 4 across these warnings.)

I looked for this warning message on the web, and I found the closed issue "merging overlapping/adjacent feature" in the gffread repository (link here: https://github.com/gpertea/gffread/issues/30).

Based on your reply to the aforementioned issue, I guess the reason why I get these warnings is the presence of very short introns in some transcripts of the annotation file: in your reply you wrote "gffread was written mostly for eukaryotic annotation and it always had this builtin assumption that introns should be at least 4-5 nucleotides long (many other programs assume at least 10 and I don't think there is biological evidence of introns shorter than that in eukaryotes)".

In the end, this means that exons separated by very short introns are merged together. Is there a way to disable this merging in GFFCompare, so that transcripts are left exactly as they are in the annotation files, without merging "close" exons? Thank you very much for any help or suggestion on this.
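
In case it helps with reproducing this, here is a rough Python sketch for listing the exon pairs that are likely to trigger the warning. It is not part of GFFCompare. The function name `short_introns` and the `max_gap` parameter are made up for illustration, and I am assuming the reported distance is about the number of bases between two consecutive exons (the exact definition used internally may differ by one).

```python
import re
from collections import defaultdict

def short_introns(gtf_lines, max_gap=4):
    """List consecutive exon pairs whose gap is at most max_gap bases.

    gap = next_start - prev_end - 1, so 0 means adjacent exons and a
    negative value means overlapping exons.
    Returns tuples of (transcript_id, prev_exon_end, next_exon_start, gap).
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Only exon records with a full 9-column layout are considered.
        if len(cols) < 9 or cols[2] != "exon":
            continue
        m = re.search(r'transcript_id "([^"]+)"', cols[8])
        if m:
            exons[m.group(1)].append((int(cols[3]), int(cols[4])))

    hits = []
    for tid, segs in exons.items():
        segs.sort()  # order exons by start coordinate
        for (s1, e1), (s2, e2) in zip(segs, segs[1:]):
            gap = s2 - e1 - 1
            if gap <= max_gap:
                hits.append((tid, e1, s2, gap))
    return hits

# Example use on a file (path is a placeholder):
# with open("annotation.gtf") as fh:
#     for tid, prev_end, next_start, gap in short_introns(fh):
#         print(tid, prev_end, next_start, gap, sep="\t")
```

This only groups exons by `transcript_id` and assumes each transcript sits on a single chromosome and strand, which should be enough to spot the affected transcripts.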