popgenmethods / pyrho

Fast inference of fine-scale recombination rates based on fused-LASSO
MIT License

Mixed population genomic maps #28

Closed JosephLalli closed 1 year ago

JosephLalli commented 1 year ago

Hello @jeffspence,

Thanks again for this wonderful tool. I have two questions, posted as separate GitHub issues.

Say I am interested in creating a 'universal' genomic map (think of the 'hg38' maps linked from the FAQs of plink, Beagle, SHAPEIT 2/4/5, etc.).

Would you recommend averaging the values of population-specific maps, or running pyrho on one large dataset containing samples from different populations?

Thanks, Joe

jeffspence commented 1 year ago

Hi @JosephLalli,

I have never looked into this, so it might be good to check by simulation, but my intuition is that averaging population-specific maps should be much better than running pyrho on a combined dataset. The reason is that pyrho assumes the samples are unrelated individuals from a panmictic population, and a combined dataset violates that assumption more severely than samples from a single "population" do. In particular, mixing individuals from different populations introduces additional "ancestry" LD, which I suspect would cause pyrho to underestimate the recombination rate.
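
For anyone who wants to try the averaging route, here is a minimal sketch of one way it could be done. It is not part of pyrho. It assumes each population-specific map has already been loaded as a sorted list of non-overlapping, half-open `(start, end, rate)` intervals with a constant per-base rate inside each interval. The helper names `rate_lookup` and `average_maps`, the optional `weights` argument, and the toy numbers are all hypothetical. Using a plain (optionally weighted) arithmetic mean of the rates is itself an assumption, and regions missing from a map are simply left out of the average for that map.

```python
from bisect import bisect_right


def rate_lookup(intervals):
    """Return a function pos -> rate (or None if pos is not covered)."""
    starts = [start for start, _, _ in intervals]

    def lookup(pos):
        i = bisect_right(starts, pos) - 1
        if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
            return intervals[i][2]
        return None  # position falls outside this map

    return lookup


def average_maps(maps, weights=None):
    """Average piecewise-constant maps on the union of their breakpoints.

    maps    : list of maps, each a sorted list of (start, end, rate) tuples
    weights : optional per-map weights (hypothetical; equal by default)
    """
    if weights is None:
        weights = [1.0] * len(maps)
    lookups = [rate_lookup(m) for m in maps]

    # Every breakpoint of every map, so each resulting segment lies inside
    # at most one interval of each input map.
    breakpoints = sorted({p for m in maps for s, e, _ in m for p in (s, e)})

    averaged = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        total, weight_sum = 0.0, 0.0
        for lookup, w in zip(lookups, weights):
            rate = lookup(left)
            if rate is not None:
                total += w * rate
                weight_sum += w
        if weight_sum > 0:  # skip segments covered by no map
            averaged.append((left, right, total / weight_sum))
    return averaged


# Toy example with made-up rates for two populations.
pop_a = [(0, 100, 1e-8), (100, 200, 3e-8)]
pop_b = [(0, 150, 2e-8), (150, 200, 4e-8)]
combined = average_maps([pop_a, pop_b])
for start, end, rate in combined:
    print(start, end, rate)
```

The averaged map is defined on the union of the input breakpoints, so no population's fine-scale structure is lost by resampling onto a coarser grid. Whether populations should be weighted equally, by sample size, or in some other way is an open choice that the thread does not settle.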

Hope this helps, Jeff

JosephLalli commented 1 year ago

Thank you for the quick and thorough reply! It is a tricky problem, since as far as I can tell you're really looking for the best option from a series of bad ones. I appreciate your insight.