Open m-huertasp opened 1 year ago
Hello,
Well, probably the simplest thing to do would be to use the hg38 map here: https://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/. You may need to slightly reformat it for use with selscan. I believe this map is a liftOver of a map inferred on an earlier build.
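The reformatting step could look something like the sketch below. It rests on two assumptions that should be checked against the actual files: that the downloaded map has four whitespace-separated columns (chromosome, ID, genetic position in cM, physical position in bp), and that selscan expects one map row per variant site in the order chromosome, locus ID, genetic position, physical position. The function names are hypothetical, and clamping variants that fall outside the map to the nearest map endpoint is a choice made for this sketch, not something selscan prescribes.

```python
from bisect import bisect_right


def parse_plink_map(lines):
    """Read (bp, cM) pairs from map lines; assumes columns: chrom, id, cM, bp."""
    map_bp, map_cm = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        map_cm.append(float(fields[2]))
        map_bp.append(int(fields[3]))
    return map_bp, map_cm


def interpolate_cm(map_bp, map_cm, pos):
    """Linearly interpolate the genetic position (cM) at physical position pos.

    map_bp must be strictly increasing. Positions outside the map are
    clamped to the first or last map value (an assumption of this sketch).
    """
    if pos <= map_bp[0]:
        return map_cm[0]
    if pos >= map_bp[-1]:
        return map_cm[-1]
    i = bisect_right(map_bp, pos)  # map_bp[i - 1] <= pos < map_bp[i]
    left_bp, right_bp = map_bp[i - 1], map_bp[i]
    frac = (pos - left_bp) / (right_bp - left_bp)
    return map_cm[i - 1] + frac * (map_cm[i] - map_cm[i - 1])


def selscan_map_lines(chrom, variants, map_bp, map_cm):
    """Build one 'chrom id cM bp' line per variant; variants is [(locus_id, bp), ...]."""
    return [
        f"{chrom} {locus_id} {interpolate_cm(map_bp, map_cm, bp):.6f} {bp}"
        for locus_id, bp in variants
    ]
```

In practice `variants` would come from the same VCF or haplotype file passed to selscan, so that the map has exactly one row per site in the same order.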
-Zachary
Hello @szpiech!
Thank you very much for your suggestion!
I used the hg38 map as you suggested, but we encountered an irregularity when analysing our iHS values.
We are using data from the Genotype-Tissue Expression (GTEx) project but the 1000 Genomes Project recombination map. When comparing our iHS values with those published in Pybus M et al., Nucleic Acids Res. 2014, we observed no correlation at all (although we analysed the same populations).
We are not sure whether this lack of correlation is due to using the recombination map from the 1000 Genomes Project rather than one from GTEx, as some GTEx positions are not covered by the recombination map and vice versa. Another possibility is that the values do not correlate because of the change in how iHS is computed (iHH1 and iHH0 are swapped in selscan, whereas Pybus et al. used the original formula), but I found the differences in iHS too large to be explained by this change alone.
I would be extremely grateful for any assistance you could provide.
Sincerely, Marta.
Well, I'm not sure precisely why this happened. I'm assuming you normalized the scores in frequency bins, as described in Voight et al 2006 and as is implemented in the norm program. If you haven't, this would almost surely be the problem.
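For reference, the idea behind frequency-binned normalization can be sketched as follows. This is a minimal illustration, not the implementation in the norm program: the function name, the equal-width binning, the default of 100 bins, and the use of the population standard deviation are all assumptions made here. Each unstandardized score is centred and scaled using the mean and standard deviation of the scores whose allele frequency falls in the same bin.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_in_bins(freqs, scores, n_bins=100):
    """Standardize scores to mean 0 and sd 1 within equal-width frequency bins.

    freqs and scores are parallel lists; frequencies are in [0, 1].
    A bin with zero variance (for example a single site) yields NaN.
    """
    members_by_bin = defaultdict(list)
    for idx, freq in enumerate(freqs):
        # A frequency of exactly 1.0 is placed in the last bin.
        bin_index = min(int(freq * n_bins), n_bins - 1)
        members_by_bin[bin_index].append(idx)

    normalized = [float("nan")] * len(scores)
    for members in members_by_bin.values():
        values = [scores[i] for i in members]
        mu = mean(values)
        sd = pstdev(values)
        if sd > 0:
            for i in members:
                normalized[i] = (scores[i] - mu) / sd
    return normalized
```

Comparing raw scores from one study with normalized scores from another would mix two different scales, which is one reason to confirm that both sets were normalized before correlating them.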
I doubt the difference in genetic maps is the full cause, although I suppose it would contribute to it. You might multiply your scores by -1 just to check, but you would expect to see a strong negative correlation if this were the only issue.
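A quick way to run that check is sketched below, assuming both score sets are available as dictionaries keyed by position and that the positions refer to the same coordinates (same chromosome and same assembly). The function names are hypothetical. Since negating one set only flips the sign of the correlation, it is enough to compute Pearson's r on the shared sites: a value near -1 would point to the sign convention, while a value near 0 suggests some other cause.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if undefined."""
    n = len(x)
    if n < 2:
        return float("nan")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def compare_scores(ours, published):
    """Correlate two {position: score} dicts on the positions they share.

    Returns (number_of_shared_sites, pearson_r).
    """
    shared = sorted(set(ours) & set(published))
    r = pearson([ours[p] for p in shared], [published[p] for p in shared])
    return len(shared), r
```

The number of shared sites is returned alongside r because a small overlap between the two data sets would itself make the correlation unreliable.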
-Zachary
Hi @szpiech!
I have a question regarding the use of a physical map instead of a genetic map when computing iHS. I am using GTEx data to analyze some regions that could be subject to positive selection in different tissues. GTEx data are built on the GRCh38 assembly, and I have not found any genetic maps based on this assembly. The only information that I could find was from the 1000 Genomes Project, which is based on GRCh37 (or at least its genetic map is based on that assembly). I was wondering which is the best option:
Thank you very much for this great tool. Looking forward to hearing from you!
Marta.