Closed by JosephLalli 1 year ago
Hi @JosephLalli,
I have never looked into this, so it would be good to check by simulation, but my intuition is that averaging population-specific maps should work much better than running pyrho on a combined dataset. The reason is that pyrho assumes the samples are unrelated individuals from a panmictic population, and a combined dataset violates that assumption more strongly than samples from a single "population" do. In particular, mixing individuals from different populations introduces additional "ancestry" LD, which I suspect would cause pyrho to underestimate the recombination rate.
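As a rough illustration of the averaging option, here is a minimal sketch rather than anything pyrho provides. It assumes each population-specific map is a piecewise-constant rate map given as interval breakpoints (bp) and per-bp rates; the function names `cumulative_map` and `average_maps` and the toy numbers are hypothetical. It averages cumulative genetic distance on a shared grid, which is equivalent to a length-weighted average of the rates within each grid interval.

```python
import numpy as np

def cumulative_map(breakpoints, rates):
    """Cumulative genetic distance at each breakpoint of a piecewise-constant map."""
    breakpoints = np.asarray(breakpoints, dtype=float)
    rates = np.asarray(rates, dtype=float)
    # Distance covered in each interval is rate * interval length.
    return np.concatenate([[0.0], np.cumsum(rates * np.diff(breakpoints))])

def average_maps(maps, grid, weights=None):
    """Average several maps on a common grid; returns one rate per grid interval.

    maps    : list of (breakpoints, rates) pairs, one per population
    grid    : shared breakpoints (bp) for the averaged map
    weights : optional per-population weights (default: equal weights)
    """
    grid = np.asarray(grid, dtype=float)
    # Interpolate each population's cumulative distance onto the shared grid.
    # Note: np.interp holds values flat outside a map's range, which
    # implicitly treats the rate there as zero.
    cum = np.array([
        np.interp(grid, np.asarray(bp, dtype=float), cumulative_map(bp, r))
        for bp, r in maps
    ])
    mean_cum = np.average(cum, axis=0, weights=weights)
    # Convert the averaged cumulative distance back to per-bp rates.
    return np.diff(mean_cum) / np.diff(grid)

# Toy maps for two hypothetical populations with different breakpoints.
pop_a = ([0, 100, 200], [1e-8, 3e-8])
pop_b = ([0, 50, 200], [2e-8, 2e-8])
grid = [0, 100, 200]
avg = average_maps([pop_a, pop_b], grid)
```

Whether to weight populations equally or by sample size is a separate judgment call; the `weights` argument is only there to make that choice explicit.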
Hope this helps, Jeff
Thank you for the quick and thorough reply! It is a tricky problem, since as far as I can tell you're really looking for the best option from a series of bad ones. I appreciate your insight.
Hello @jeffspence,
Thanks again for this wonderful tool. I have two questions, posted as separate GitHub issues.
Say I am interested in creating a 'universal' genomic map (think of the 'hg38' maps linked from the FAQs of plink, Beagle, SHAPEIT 2/4/5, etc.).
Would you recommend averaging the values of population-specific maps? Or running pyrho on one large dataset containing samples from different populations?
Thanks, Joe