ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Mis-identification in CNS #1305

Open bilibilij opened 8 months ago

bilibilij commented 8 months ago

[image]

About two years ago, I built a family-level alignment of 20 species with Cactus and derived CNS coordinates from it using PhastCons and PhyloP. Recently, however, we found that some alignment blocks had been misidentified as CNS. As shown in the picture, in the MAF file extracted with Hal2Maf the reference species contributes a 656 bp segment, while the other species contribute only 2 bp at the tail end of the block. Rather than a region conserved across species, this is probably specific to the reference species. Is something wrong with the default parameters I used for Hal2Maf?
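A pattern like the one described above (a long reference row with only a few aligned bases in the other species) can be screened for directly in the MAF before interpreting conservation scores. The following is a minimal sketch, not part of Cactus, Hal2Maf, or PHAST: the function names, the `min_cov` and `min_species` thresholds, and the toy MAF text are all hypothetical, and it assumes the reference is the first `s` row of each block.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, str]  # (source name, aligned text including '-' gaps)

def read_maf_blocks(lines: Iterable[str]) -> Iterator[List[Row]]:
    """Yield each MAF alignment block as a list of (source, text) rows."""
    block: List[Row] = []
    for raw in lines:
        line = raw.strip()
        if line == "a" or line.startswith("a "):
            # An 'a' line opens a new block; emit the previous one if any.
            if block:
                yield block
            block = []
        elif line.startswith("s "):
            # s <src> <start> <size> <strand> <srcSize> <text>
            fields = line.split()
            block.append((fields[1], fields[6]))
    if block:
        yield block

def coverage_by_source(block: List[Row]) -> Dict[str, float]:
    """Fraction of reference bases aligned to a non-gap base, per other row.

    Assumption: the first row of the block is the reference.
    """
    ref_text = block[0][1]
    ref_cols = [i for i, c in enumerate(ref_text) if c != "-"]
    cov: Dict[str, float] = {}
    for src, text in block[1:]:
        aligned = sum(1 for i in ref_cols if text[i] != "-")
        frac = aligned / len(ref_cols) if ref_cols else 0.0
        # If a source appears more than once, keep its best coverage.
        cov[src] = max(cov.get(src, 0.0), frac)
    return cov

def is_reference_dominated(block: List[Row],
                           min_cov: float = 0.5,
                           min_species: int = 2) -> bool:
    """True if fewer than min_species rows cover >= min_cov of the reference."""
    cov = coverage_by_source(block)
    supported = sum(1 for v in cov.values() if v >= min_cov)
    return supported < min_species

# Toy data (made up): block 1 mimics the reported case, block 2 is well aligned.
EXAMPLE_MAF = """##maf version=1
a score=0
s ref.chr1 100 10 + 1000 ACGTACGTAC
s spA.chr2 50 2 + 900 --------AC
s spB.chr3 70 2 + 800 --------AC

a score=0
s ref.chr1 110 4 + 1000 ACGT
s spA.chr2 52 4 + 900 ACGT
s spB.chr3 72 4 + 800 ACCT
"""

blocks = list(read_maf_blocks(EXAMPLE_MAF.splitlines()))
flags = [is_reference_dominated(b) for b in blocks]
print(flags)
```

Blocks flagged this way could be excluded or inspected by hand before CNS coordinates are taken from them; the thresholds are arbitrary and would need tuning for a 20-species alignment.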

bilibilij commented 8 months ago

PhastCons and PhyloP reported this region as highly conserved.

glennhickey commented 8 months ago

I'm not sure I understand. If PhyloP is identifying this region of your MAF as being conserved, wouldn't this be an issue with PhyloP (rather than the MAF)?