ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Question about ancestral genome #1324

Open Marh32 opened 6 months ago

Marh32 commented 6 months ago

Hi,

When using hal2fasta to output the ancestral genome, I notice that the resulting ancestral FASTA file contains many lowercase letters. Does this indicate repeat regions? And should I perform repeat masking again on this output? Thanks in advance for your help.

Best regards, Hao
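
To put a number on how much of the `hal2fasta` output is lowercase, one option is to count lowercase bases per sequence. The following is a minimal sketch in plain Python, not part of Cactus or HAL; the function name `softmasked_fraction` and the sequence names in the example are made up for illustration.

```python
def softmasked_fraction(fasta_lines):
    """Return {sequence_name: fraction of lowercase (softmasked) bases}.

    `fasta_lines` is any iterable of FASTA text lines, e.g. an open file.
    """
    counts = {}  # name -> [lowercase_count, total_count]
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = (line[1:].split() or [""])[0]
            counts[name] = [0, 0]
        elif name is not None:
            counts[name][0] += sum(1 for c in line if c.islower())
            counts[name][1] += len(line)
    return {
        n: (low / total if total else 0.0)
        for n, (low, total) in counts.items()
    }


# Hypothetical toy input standing in for an ancestral FASTA file.
example = [">Anc0_chr1", "ACGTacgt", "NNNNacgt", ">Anc0_chr2", "ACGT"]
print(softmasked_fraction(example))
```

On a real file, the same function can be called as `softmasked_fraction(open(path))`, where `path` points at the FASTA written by `hal2fasta`.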

glennhickey commented 6 months ago

I think it works like this: an ancestor base is softmasked if the majority of the descendant bases from which it was reconstructed were softmasked.
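
The rule described above can be illustrated with a toy sketch. This is not Cactus's actual code: the function name `ancestor_case` is hypothetical, gaps are simply skipped, and treating a tie as unmasked is an assumption made only for this example.

```python
def ancestor_case(ancestor_base, descendant_bases):
    """Return the ancestor base in lowercase if a strict majority of the
    aligned descendant bases are lowercase (softmasked), else uppercase.

    Non-letter characters such as '-' are treated as gaps and ignored.
    Ties are treated as unmasked (an assumption for this sketch).
    """
    aligned = [b for b in descendant_bases if b.isalpha()]
    masked = sum(1 for b in aligned if b.islower())
    if aligned and masked * 2 > len(aligned):
        return ancestor_base.lower()
    return ancestor_base.upper()


print(ancestor_case("A", ["a", "a", "G"]))  # 2 of 3 descendants masked
print(ancestor_case("A", ["a", "A", "-"]))  # 1 of 2 aligned descendants masked
```

Under a rule like this, the case of an ancestral base is inherited from the masking of the input genomes rather than computed from the ancestral sequence itself.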

Marh32 commented 6 months ago

Thank you for your reply. I re-masked the ancestral genome using RepeatMasker and obtained significantly different results. In this case, which result should I consider accurate?

Best regards, Hao