jorvis / biocode

Bioinformatics code libraries and scripts
MIT License

report_gc_content_by_feature_type.pl definition of telomere #19

Closed: ktretina closed this issue 10 years ago

ktretina commented 10 years ago

It looks to me like report_gc_content_by_feature_type.pl defines a telomere as the region between the terminal exon on a contig and the end of the contig. Perhaps the definition varies by field, but I don't think this matches the usual biological definition of a telomere, which is something like "a region of repetitive sequence at the end of a chromatid." The two problems that come to mind are:

1) Not all contigs contain the ends of chromosomes (i.e. chromosomes may not be fully sequenced, or may be broken into several contigs).
2) It includes the region between the repetitive sequence and the first annotated gene, which I think is usually considered part of the sub-telomeric region.

Maybe there was some application-specific reason for this addition. Please correct me if I am wrong on this, but I know a little about how you like to be very precise with your terminology, so I thought I should bring it up to be looked at further.
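
To make the distinction concrete, here is a minimal Python sketch. It is not the code from report_gc_content_by_feature_type.pl; the function names, the 1-based inclusive coordinates, and the toy contig are all assumptions made for illustration. The first function computes the "terminal exon to contig end" regions as described above, and the second checks whether a sequence actually contains tandem copies of a repeat unit. TTAGGG is the vertebrate telomeric repeat and is only used as an example here, since the unit differs between organisms.

```python
def terminal_regions(contig_length, exons):
    """Return (start, end) spans between each contig end and the nearest exon.

    Hypothetical helper: coordinates are 1-based and inclusive, and
    `exons` is a list of (start, end) tuples on a single contig.
    """
    if not exons:
        return []
    first_start = min(start for start, _ in exons)
    last_end = max(end for _, end in exons)
    regions = []
    if first_start > 1:
        regions.append((1, first_start - 1))
    if last_end < contig_length:
        regions.append((last_end + 1, contig_length))
    return regions


def has_tandem_repeat(seq, unit="TTAGGG", min_copies=3):
    """Return True if `seq` contains at least `min_copies` back-to-back copies of `unit`.

    Assumption: a simple exact-match test; real telomeric repeats are
    often degenerate and would need a more tolerant search.
    """
    return (unit.upper() * min_copies) in seq.upper()


# Toy contig: 10 bp of flanking sequence, an 18 bp "exon",
# 7 bp of non-repetitive sequence, then four TTAGGG copies.
contig = "ACGTACGTAC" + "ATGAAACCCGGGTTTTAA" + "GATTACA" + "TTAGGG" * 4
exons = [(11, 28)]

regions = terminal_regions(len(contig), exons)
for start, end in regions:
    region_seq = contig[start - 1:end]
    print(start, end, has_tandem_repeat(region_seq))
```

In this toy case both terminal regions would be labelled "telomere" under the script's definition, but only the right-hand one contains the repeat, and even there the 7 bp between the exon and the repeat would not normally count as telomeric.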

jorvis commented 10 years ago

Yes, this was added at the specific request of a PI for whom the implementation made sense, but I agree that it doesn't make sense in general. I've removed it from the output and documentation for now.