dportik / dadi_pipeline

An accessible and flexible tool for fitting demographic models with dadi using custom or published models (available here), conducting goodness of fit tests, and plotting.
GNU Lesser General Public License v3.0
62 stars 30 forks source link

Fixing theta #11

Closed ldutoit closed 4 years ago

ldutoit commented 5 years ago

Dear Daniel,

I am trying to set up theta explicitly as explained by Ryan Gutenkunst here.

I defined an explicit function that has theta :

def growthfixedtheta ( params , ns , pts ):
    nu ,T, theta = params
    xx = dadi.Numerics.default_grid (pts)
    phi = dadi.PhiManip.phi_1D (xx)
    nu_func = lambda t: numpy.exp ( numpy.log (nu) * t/T)
    phi = dadi.Integration.one_pop (phi , xx , T, nu_func,theta0= theta  )
    fs = dadi.Spectrum.from_phi(phi, ns, (xx,))
    return fs

And then I modified optimise functions at 3 different lines.

line 118:

 theta =params_opt[2] #My fixed theta, instead of dadi.Inference.optimal_sfs_scaling(sim_model, fs)

at line 107 no multinom:

ll = dadi.Inference.ll(sim_model, fs) 

and at line 242, multinom =False:

params_opt = dadi.Inference.optimize_log_fmin(params_perturbed, fs, func_exec, pts, lower_bound=lower_bound, upper_bound=upper_bound, verbose=1, multinom=False,maxiter=maxiters_list[r], output_file = "{}.log.txt".format(model_name))

I am a tiny bit more confused because when I try to run it, the theta that I fix is always very slightly different from the theta I get back.

Could you tell me wether that seems correct or wether I missed something important? That would be so helpful! I understand if that is not the purpose of your code and you don't really have time to look into it tho.

If that seems appropriate I am happy to take the time to make it into options of your repos if you want it to be of course.

Kind Regards

Ludo

dportik commented 5 years ago

Hi Ludo, Somehow this managed to slip through the cracks! Sorry about that.

It may be that you want label your parameter as theta1, rather than just theta. There may be a default name that should be avoided. It also looks like Ryan included this in his model:

phi = dadi.PhiManip.phi_1D(xx, theta=2*theta1)

According to his annotation:

"Here theta is set to 2*theta1 because we define theta1 to be the scaled mutation rate corresponding to the reference population size, In this case, that is the size of population 1."

Whereas in your model, this was simply:

phi = dadi.PhiManip.phi_1D(xx)

Maybe this could be causing the observed differences?

The rest of the steps seem to match his recommendations.