szpiech / lassip

LASSI-Plus: A program to calculate haplotype frequency spectrum statistics
GNU General Public License v3.0

Minor allele filtering #8

Open tbusschau opened 1 week ago

tbusschau commented 1 week ago

Hi Zachary,

What is your recommendation on minor allele frequency (MAF) filtering before running saltiLassi? I'm thinking of filtering out variants with MAF < 0.05, as I do for the other haplotype-based statistics I'm running, but I'm not sure how this will affect the results.

Best, Theo

szpiech commented 1 week ago

Hi Theo,

Well, this isn’t something I have systematically tested. When we wrote the paper and evaluated the method, we only used whole-genome sequencing (or simulated) data with no MAF filter.

It’s probably worth exploring, but I can’t say for sure what the results would be if you applied a MAF filter. I would guess, though, that the window size in number of SNPs would probably decrease due to the lower SNP density along the genome.

Zachary

tbusschau commented 3 days ago

Hi Zachary,

Thank you for your response. I can try the filter to see if there is a significant difference in the results. Wouldn't removing low-frequency variants skew the haplotype frequency spectrum too? I imagine there will be fewer distinct haplotypes and a higher frequency of the more common haplotypes.

I'm also wondering: since we're looking at haplotypes and not considering core SNPs, is it possible for balancing selection to be detected as a soft sweep? The haplotype frequency spectrum may be distorted in a similar way under both scenarios. It does seem that way for my data at least.

Best, Theo

szpiech commented 3 days ago

Hi,

So, yes, it would influence the number and frequency of haplotypes. It would do this genome-wide, though, so in principle I’d expect the method to still function, although I’ve never tested it.

Yes regarding balancing selection. Since formally we are looking for regions of the genome with a distorted haplotype frequency spectrum (HFS) relative to background, balancing selection could look like a very soft sweep. This is likely what’s happening with the signal at the human MHC locus in the PLOS Genetics paper.

-Zachary
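
To make the MAF question concrete, here is a minimal sketch of how dropping low-frequency sites can merge haplotypes within a window and shift the HFS toward fewer, more common classes. It is not lassip code: the helper names `maf_filter` and `haplotype_frequency_spectrum`, the toy haplotypes, and the 0.2 threshold are all made up for illustration, and the effect on real data has not been tested in this thread.

```python
from collections import Counter


def maf_filter(haplotypes, min_maf):
    """Drop sites whose minor allele frequency is below min_maf.

    haplotypes: list of equal-length strings of '0'/'1' (one per sampled haplotype).
    Hypothetical helper, not part of lassip.
    """
    n = len(haplotypes)
    keep = []
    for j in range(len(haplotypes[0])):
        ones = sum(h[j] == "1" for h in haplotypes)
        maf = min(ones, n - ones) / n
        if maf >= min_maf:
            keep.append(j)
    return ["".join(h[j] for j in keep) for h in haplotypes]


def haplotype_frequency_spectrum(haplotypes):
    """Return the frequencies of distinct haplotypes, most common first."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sorted((c / n for c in counts.values()), reverse=True)


# Toy window: 10 haplotypes, 4 sites; the last two sites are singletons.
toy = ["0000"] * 4 + ["0001"] + ["1100"] * 4 + ["1110"]

before = haplotype_frequency_spectrum(toy)
after = haplotype_frequency_spectrum(maf_filter(toy, min_maf=0.2))

print(before)  # four haplotype classes
print(after)   # two haplotype classes after the singleton sites are removed
```

In this toy window the two singleton sites are the only thing separating the rare haplotypes from the common ones, so filtering collapses four classes into two. Because a filter like this applies to every window, the background spectrum would shift as well, which is consistent with the expectation above that the method should still function in principle.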
