logsdon-lab / CenStats

Centromere statistics toolkit
MIT License

Fix false calls for partial in rare cases #2

Open koisland opened 6 months ago

koisland commented 6 months ago

censtats status will miss partial calls when the contig edge is entirely HSAT, as happens in partial centromeres like those of chr13 and chr21.

koisland commented 3 days ago

It will also make false calls when the edges of a centromeric contig do not contain a full 500 kbp of the centromeric transition region.

koisland commented 3 days ago

One idea is to change the heuristic for determining partial contigs. Instead of using a flat percentage of alpha-satellite on the edges, we should calculate entropy across the contig. Given the structure of the centromere, the entropy profile should look like a parabola: high entropy at the edges, where there is monomeric alpha-satellite and other repeats, and uniformity (low entropy) in the center.
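
A minimal sketch of the entropy idea, not the CenStats implementation. It assumes repeat annotations are available as (start, end, repeat_type) tuples with half-open coordinates, and it computes Shannon entropy of the repeat-type composition in fixed windows. The function name `windowed_entropy`, the 100 kbp window, and the repeat labels in the toy data are all hypothetical choices for illustration.

```python
import math
from collections import defaultdict


def windowed_entropy(annotations, contig_len, window=100_000):
    """Shannon entropy (bits) of repeat-type composition per window.

    annotations: iterable of (start, end, repeat_type), half-open coordinates.
    Hypothetical helper; the window size is an arbitrary assumption.
    """
    entropies = []
    for w_start in range(0, contig_len, window):
        w_end = min(w_start + window, contig_len)
        # Count annotated base pairs per repeat type inside this window.
        bp = defaultdict(int)
        for start, end, rtype in annotations:
            overlap = min(end, w_end) - max(start, w_start)
            if overlap > 0:
                bp[rtype] += overlap
        total = sum(bp.values())
        h = 0.0
        for n in bp.values():
            p = n / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


# Toy contig with mixed repeats on both edges and a uniform core.
complete = [
    (0, 50_000, "L1"),
    (50_000, 350_000, "ALR/Alpha"),
    (350_000, 400_000, "SINE"),
]
# Toy contig whose left edge is entirely HSAT (the case a flat
# alpha-satellite percentage on the edges would miss).
partial = [
    (0, 100_000, "HSAT"),
    (100_000, 350_000, "ALR/Alpha"),
    (350_000, 400_000, "SINE"),
]

print(windowed_entropy(complete, 400_000))  # [1.0, 0.0, 0.0, 1.0]
print(windowed_entropy(partial, 400_000))   # [0.0, 0.0, 0.0, 1.0]
```

In the toy data, the complete contig shows high entropy in both edge windows and zero in the core, while the contig with a pure HSAT edge has a low-entropy left edge. How to turn such a profile into a partial call (thresholds, window size, curve fitting) is left open here.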