popsim-consortium / stdpopsim

A library of standard population genetic models
GNU General Public License v3.0
122 stars 87 forks source link

move PhoSin QC where it should go #1561

Closed petrelharp closed 4 months ago

petrelharp commented 4 months ago

There was a whoopsie in #1560 I didn't notice - the code was in qc/PhoSin.py not stdpopsim/qc/PhoSin.py. Fixing this.

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.83%. Comparing base (1213dc1) to head (c1d44ca).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1561 +/- ## ======================================= Coverage 99.83% 99.83% ======================================= Files 131 131 Lines 4334 4340 +6 Branches 595 595 ======================================= + Hits 4327 4333 +6 Misses 3 3 Partials 4 4 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

petrelharp commented 4 months ago

Looks like the parameters for the gamma disagree:

>           assert np.allclose(mt1[i].distribution_args, mt2[i].distribution_args)
E           AssertionError: assert False
E            +  where False = <function allclose at 0x7fb9680175e0>([-0.00971, 0.131], [-0.0257, 0.131])
E            +    where <function allclose at 0x7fb9680175e0> = np.allclose
E            +    and   [-0.00971, 0.131] = MutationType(dominance_coeff=None, distribution_type='g', distribution_args=[-0.00971, 0.131], convert_to_substitution=True, dominance_coeff_list=[0.0, 0.01, 0.1, 0.4], dominance_coeff_breaks=[-0.1, -0.01, -0.001]).distribution_args
E            +    and   [-0.0257, 0.131] = MutationType(dominance_coeff=None, distribution_type='g', distribution_args=[-0.0257, 0.131], convert_to_substitution=True, dominance_coeff_list=[0.0, 0.01, 0.1, 0.4], dominance_coeff_breaks=[-0.1, -0.01, -0.001]).distribution_args

This issue was flagged by @igronau in #1548:

The gamma shape was taken directly from the reported estimate.
The gamma mean was taken from the value reported in simulations. 
This was derived from the inferred shape and scale + the estimated Na, 
but since the estimated Na is not reported, I couldn't confirm the reported mean.

@ckyriazis, can you confirm/clarify?

ckyriazis commented 4 months ago

ah oops sorry about the directory issue! yes, this is what I was expecting - the gamma mean in the original PR was incorrect, -0.0257 is the correct value.

petrelharp commented 4 months ago

Hm, I do see that in the paper:

Selection coefficients (s) for deleterious mutations were drawn from a gamma
distribution estimated using the genomic data (see Inference of the distribution of selection
coefficients) with mean = -0.0257 and shape parameter = 0.1314. 

Thanks! I will fix this and merge; if @igronau disagrees, we can discuss and fix it up separately.

petrelharp commented 4 months ago

Ah, it looks like it was a case of just grabbing the wrong number - the number previously in the code was from the lower mutation rate:

For sensitivity analyses
varying the mutation rate, we used the parameters inferred under the respective mutation rate
(mean = -0.00971 and shape = 0.1316 for μ = 2.2x10-9 and mean = -0.0476 and shape = 0.1315
for μ = 1.08x10-8; fig. S13).