hyanwong / giglib

MIT License

Modelling the short arms of the acrocentric human chromosomes #12

Open hyanwong opened 11 months ago

hyanwong commented 11 months ago

I gather from the T2T papers that the short arms of the acrocentric chromosomes (e.g. 13 & 14) exchange material e.g. through gene conversion (and maybe even recombination). This would be a neat test-case for the GIG encoding, especially if #11 is implemented.

hyanwong commented 5 months ago

I think once we solve #103, this becomes almost trivial to implement. The main issue is going to be finding MRCAs that are older than the most recent coalescent point.

More specifically, imagine a meiosis between two human genomes, U and V, where we are only simulating chromosomes 13 & 14. We might want to look for common regions between the short arm of 13 on U and the short arms of both 13 and 14 on V. If we use the find_mrca_regions function and specify both chromosomes at once, we will probably find just one match (either 13 or 14, depending on where the most recent crossover happened to occur: see https://github.com/hyanwong/GeneticInheritanceGraphLibrary/issues/111). I suspect that this will not give the expected dynamics.

A workaround would be to try to spot MRCA regions separately between u13 vs v13 and u13 vs v14. For completeness and symmetry, we would therefore also need to do u14/v13 and u14/v14, and figure out how to ensure that there are only 2 matching pairs. This seems a bit of a hack, but is probably biologically OK, if we imagine that chromosome pairing during meiosis uses e.g. random proximity first, then matching once a "reasonably similar" chromosome has been found. It does, however, require us to explicitly specify that 13 and 14 have similar regions, rather than letting this emerge from the simulation setup.
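
A rough sketch of the "only 2 matching pairs" step, in plain Python. This is not giglib code: `pair_chromosomes` and the `shared_length` callable are hypothetical names, and `shared_length(u, v)` stands in for whatever per-pair MRCA search we end up with (e.g. the total length of shared regions found for that one pair). The toy numbers are made up. It scores all four combinations separately, then keeps the one-to-one assignment with the largest total, which is one possible rule among several.

```python
from itertools import permutations


def pair_chromosomes(u_chroms, v_chroms, shared_length):
    """Return one-to-one (u, v) pairs maximising the total shared length.

    `shared_length(u, v)` is a hypothetical callable standing in for a
    per-pair MRCA-region search; it is not part of giglib.
    """
    if len(u_chroms) != len(v_chroms):
        raise ValueError("expected the same number of chromosomes in U and V")
    # Score every combination separately: u13/v13, u13/v14, u14/v13, u14/v14
    scores = {(u, v): shared_length(u, v) for u in u_chroms for v in v_chroms}
    # Each permutation of V gives every U chromosome a distinct partner,
    # so picking the best permutation guarantees exactly one pair per chromosome
    best = max(
        permutations(v_chroms),
        key=lambda perm: sum(scores[u, v] for u, v in zip(u_chroms, perm)),
    )
    return list(zip(u_chroms, best))


# Made-up shared lengths: homologues share most, 13 vs 14 share a little
toy = {
    ("u13", "v13"): 900,
    ("u13", "v14"): 120,
    ("u14", "v13"): 100,
    ("u14", "v14"): 950,
}
pairs = pair_chromosomes(["u13", "u14"], ["v13", "v14"], lambda u, v: toy[u, v])
print(pairs)
```

Enumerating permutations is fine for two chromosomes but grows factorially, so anything beyond a handful of acrocentrics would need a proper assignment algorithm. It also still hard-codes which chromosomes are candidates for pairing, so it does not remove the drawback of having to specify that up front.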