szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

Which statistical test is recommended to scan for positively selected loci (XP-nSL or XP-nHH) between populations of two sub-species? #82

Open farhan-lab opened 2 years ago

farhan-lab commented 2 years ago

Hi Szpiech,

I would like to search for putative positively selected loci between populations of two sub-species (let's say sub-species "A" and "B"). Which statistic in selscan (XP-nSL or XP-nHH) would you recommend for identifying genomic signatures of positive selection in my target sub-species "A", using "B" as the reference?

Also, I have SNP data from multiple populations (wild samples from different geographic regions) for both "A" and "B". Is it reasonable to combine the data from all populations within each sub-species and perform the test in one run?

For example, I merged the data for all populations and all SNP loci into a single VCF file (pop1, pop2, pop3, pop4 for sub-species A and pop5, pop6, pop7, pop8 for sub-species B). Should I run selscan on population pairs ("pop1 versus pop5", and so on), or can I perform a single run of all "A" populations versus all "B" populations?

Many thanks in advance, Best regards, Farhan
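
To make the two designs in the question concrete, here is a minimal Python sketch that enumerates them. The population labels are the placeholders from the example above. Reading "pop1 versus pop5 and so on" as every A-by-B pair is an assumption, and the function names are hypothetical.

```python
from itertools import product

# Placeholder population labels taken from the example in the question.
SUBSPECIES_A = ["pop1", "pop2", "pop3", "pop4"]
SUBSPECIES_B = ["pop5", "pop6", "pop7", "pop8"]


def pooled_design(target, reference):
    """One comparison: all target populations versus all reference populations."""
    return [(tuple(target), tuple(reference))]


def pairwise_design(target, reference):
    """One comparison per (target population, reference population) pair."""
    return [((t,), (r,)) for t, r in product(target, reference)]


pooled = pooled_design(SUBSPECIES_A, SUBSPECIES_B)
pairwise = pairwise_design(SUBSPECIES_A, SUBSPECIES_B)
print(len(pooled), "pooled run;", len(pairwise), "pairwise runs")
```

With four populations per sub-species, the pooled design is a single run, while the all-pairs design needs sixteen runs whose results then have to be reconciled.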

szpiech commented 2 years ago

Hello,

I would probably use XP-nSL, so you don’t have to worry about a recombination map. As for whether you should pool your populations, that depends on how much population structure there is among them, and ultimately it’s a judgement call you’ll have to make.
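
As an illustration of what an XP-nSL run might look like, the sketch below assembles a selscan command line in Python. The file names and the helper `xpnsl_command` are hypothetical. The flags (`--xpnsl`, `--vcf`, `--vcf-ref`, `--out`, `--threads`) are written as I understand them from the selscan manual, so please check them against your installed version, including whether it still expects a `--map` file for this statistic. The sketch also assumes the data have already been split into one phased VCF per chromosome for each group.

```python
import shlex


def xpnsl_command(target_vcf, reference_vcf, out_prefix, threads=1):
    """Build a selscan XP-nSL command as an argument list (nothing is executed)."""
    return [
        "selscan", "--xpnsl",
        "--vcf", target_vcf,         # target group, e.g. sub-species A
        "--vcf-ref", reference_vcf,  # reference group, e.g. sub-species B
        "--out", out_prefix,         # prefix for the output file
        "--threads", str(threads),
    ]


# Hypothetical per-chromosome files for the pooled "A versus B" design.
cmd = xpnsl_command("A.chr1.vcf", "B.chr1.vcf", "A_vs_B.chr1", threads=4)
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. The same helper covers a pairwise design by passing per-population VCFs instead of pooled ones.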

Hope this helps,

Zachary

(In reply to farhan-lab's message of Tue, 3 May 2022, 12:29 AM.)


farhan-lab commented 2 years ago

Many thanks for your expert opinion!