isaacovercast / easySFS

Effective selection of population size projection for construction of the site frequency spectrum. Convert VCF to dadi/fastsimcoal style SFS for demographic analysis
124 stars 23 forks

Minor allele freq for >2 pops #8

Open isaacovercast opened 6 years ago

isaacovercast commented 6 years ago

The minor allele should be determined based on all populations combined for SFS constructed from more than 2 populations:

https://groups.google.com/forum/#!searchin/fastsimcoal/sfs$20multiple$20populations%7Csort:date/fastsimcoal/zWO_ERhHjOg/cnPXsDCXjRAJ

EDIT: This is only a problem for unfolded SFS; folded SFS doesn't care about the minor allele.
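To make the distinction concrete, here is a minimal sketch of choosing the minor allele from all populations combined versus separately within each population. It is an illustration only, not easySFS code: the function names, the input layout (per-population alternate-allele counts and chromosome counts for a single biallelic site), and the tie-breaking rule are all assumptions.

```python
from typing import List, Tuple


def pooled_minor_counts(alt_counts: List[int], n_chroms: List[int]) -> Tuple[List[int], bool]:
    """Per-population minor-allele counts, with the minor allele chosen
    from the pooled counts across all populations (hypothetical helper).

    alt_counts[i] is the number of alternate-allele copies in population i,
    n_chroms[i] is the number of sampled chromosomes in population i.
    Returns the counts and a flag saying whether the alternate allele is minor.
    """
    if len(alt_counts) != len(n_chroms):
        raise ValueError("alt_counts and n_chroms must have the same length")
    total_alt = sum(alt_counts)
    total = sum(n_chroms)
    # Assumption: on an exact 50/50 tie the alternate allele is treated as minor.
    alt_is_minor = 2 * total_alt <= total
    if alt_is_minor:
        return list(alt_counts), True
    # Otherwise the reference allele is minor, so flip every population.
    return [n - a for a, n in zip(alt_counts, n_chroms)], False


def per_population_minor_counts(alt_counts: List[int], n_chroms: List[int]) -> List[int]:
    """Minor-allele counts when each population picks its own minor allele."""
    return [min(a, n - a) for a, n in zip(alt_counts, n_chroms)]


# One site, three populations of 10 chromosomes each.
alt = [1, 9, 8]
n = [10, 10, 10]
print(pooled_minor_counts(alt, n))          # ([9, 1, 2], False)
print(per_population_minor_counts(alt, n))  # [1, 1, 2]
```

In this example the alternate allele is rare in the first population but common overall, so the two rules place the site in different cells of the joint spectrum.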

cbrock2 commented 4 years ago

Hi Isaac,

Just to be clear, is this how you are currently calculating the joint MAF file for >2 populations? I am running fastsimcoal2 with 4 populations and would prefer to use the joint MAFs given the estimation issues with zero-laden MSFS files. Thanks so much for the great program.

Best,

Chad

isaacovercast commented 4 years ago

The minor allele for SFS with > 2 populations is determined by the frequency of alleles only within the sampled populations, not the global allele frequencies. In practice this shouldn't be a huge issue.

cbrock2 commented 4 years ago

Thanks, Isaac.

I have been seeing some large deviations in results between the multiple joint SFS and the MSFS for more than two populations (for two pops the results are very similar), so I thought maybe this was the culprit.

Thanks for the clarification and advice.

Best,

Chad

qinshengyuan commented 3 years ago

Hi Isaac,

When I run easySFS, all of the observations are zero, for example:

1 observation
d0_0 d0_1 d0_2 d0_3 d0_4 d0_5 d0_6
0 0 0 0 0 0 0

1 observation
d0_0 d0_1 d0_2 d0_3
d1_0 0 0 0 0
d1_1 0 0 0 0
d1_2 0 0 0 0
d1_3 0 0 0 0
d1_4 0 0 0 0
d1_5 0 0 0 0
d1_6 0 0 0 0

What's wrong with my data?

Best,

Qin

isaacovercast commented 3 years ago

@qinshengyuan Good question! Since you don't show me your data, I have no idea what is wrong with it. What are the results when you run the --proj function?
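As a quick sanity check before digging into the input data, one can confirm that an observation file really contains no sites by summing its entries. The sketch below is not part of easySFS; `sfs_total` is a hypothetical helper, and it assumes only the layout pasted earlier in this thread (a first "1 observation" line, then label tokens such as `d0_3` or `d1_2` mixed with numeric counts).

```python
def sfs_total(text: str) -> float:
    """Sum every numeric entry in an observation-file string (hypothetical helper).

    The first line ("1 observation") is skipped, and any token that does not
    parse as a number (for example the d0_3 / d1_2 labels) is ignored.
    """
    total = 0.0
    lines = text.strip().splitlines()
    for line in lines[1:]:
        for token in line.split():
            try:
                total += float(token)
            except ValueError:
                # Non-numeric token: a row or column label.
                continue
    return total


one_dim = """1 observation
d0_0 d0_1 d0_2 d0_3 d0_4 d0_5 d0_6
0 0 0 0 0 0 0
"""

two_dim = """1 observation
d0_0 d0_1 d0_2 d0_3
d1_0 0 0 0 0
d1_1 0 3 0 0
d1_2 0 0 2 0
"""

print(sfs_total(one_dim))  # 0.0
print(sfs_total(two_dim))  # 5.0
```

A total of zero means no sites made it into the spectrum at all, which points toward the input or the chosen projection rather than the output formatting.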