lh3 / psmc

Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC) model

Atomic time interval parameters #35

Open plubbe opened 3 years ago

plubbe commented 3 years ago

Hi team -

I have been trying out different values for the -N, -t, and -p flags (based on published materials), but I'm having a hard time interpreting the differences between the outputs, which are significant. For example, here are two outputs produced from the exact same data, with the only difference being the time interval flags:

[Screenshot: two PSMC plots shown side by side, left and right]

Left: psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o ${resultsdir}${base}.psmc ${resultsdir}${base}.psmcfa

Right: psmc -N30 -t5 -r5 -p "4+30*2+4+6+10" -o ${resultsdir}${base}.psmc ${resultsdir}${base}.psmcfa
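For anyone comparing the two -p strings, here is a minimal sketch of how they expand. It assumes the reading of the pattern given in the PSMC paper, where "4+25*2+4+6" is described as 64 atomic time intervals and 28 free interval parameters: each "+"-separated term is either "k" (one parameter spanning k atomic intervals) or "n*k" (n parameters spanning k atomic intervals each). The function name `parse_pattern` is made up for illustration and is not part of psmc.

```python
def parse_pattern(pattern):
    """Expand a -p style pattern into a list of group widths.

    Assumed reading: 'k' is one free parameter spanning k atomic intervals,
    'n*k' is n free parameters spanning k atomic intervals each.
    """
    groups = []
    for term in pattern.split("+"):
        if "*" in term:
            repeat, width = (int(x) for x in term.split("*"))
        else:
            repeat, width = 1, int(term)
        groups.extend([width] * repeat)
    return groups


# The two patterns from the commands above.
for label, pattern in [("left", "4+25*2+4+6"), ("right", "4+30*2+4+6+10")]:
    groups = parse_pattern(pattern)
    print(label, "atomic intervals:", sum(groups), "free parameters:", len(groups))
```

Under that reading the left pattern gives 64 atomic intervals and 28 free parameters, and the right one gives 84 and 34. As far as I can tell from the usage text, -N is the maximum number of iterations rather than a time interval setting, so only -t and -p change the discretisation between the two runs.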

I find it really difficult to interpret these graphs, or to know which is more appropriate. What metrics does one use to choose the parameters? The manual says to choose them such that:

after 20 rounds of iterations, at least ~10 recombinations are inferred to occur in the intervals each parameter spans

but to be honest - I'm not sure how I'm meant to tell! Any help or thoughts are appreciated :)

EDIT: formatting

plubbe commented 3 years ago

I've thought about this for a little while now. I have a feeling it is the -t flag which is causing the great difference in the graphs. I think the portion of the left-side graph past 10^6 is just truncated compared to the right-side graph, and it is essentially only showing the rise in Ne between 10^5 and 10^6, and not the drop before 1 MYA. Is my intuition right? And if so, how should I be choosing the value for -t? I am quite interested in any portion of the graph that can be calculated beyond 10^6 - but how accurate will it be beyond that time? I don't want to overinflate the value, either.
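To get a feel for where -t puts the oldest interval in real time, here is a rough sketch. It rests on two assumptions taken from my reading of the PSMC paper and README: interval boundaries are spaced as t_i = 0.1 * (exp(i/n * log(1 + 10 * Tmax)) - 1) in units of 2*N0 generations, and scaled time converts to generations via N0 = theta0 / (4 * mu * s) with bin size s = 100. The values of theta0, mu, and generation time below are placeholders, not results from my data; the real theta0 would come from the run's own output. The helper names are hypothetical.

```python
import math


def interval_boundaries(n_intervals, t_max, alpha=0.1):
    """Boundaries t_0..t_n in units of 2*N0 generations (assumed spacing)."""
    beta = math.log(1.0 + t_max / alpha) / n_intervals
    return [alpha * (math.exp(beta * i) - 1.0) for i in range(n_intervals + 1)]


def to_years(t_scaled, theta0, mu, gen_years, bin_size=100):
    """Convert scaled time to years, assuming N0 = theta0 / (4 * mu * bin_size)."""
    n0 = theta0 / (4.0 * mu * bin_size)
    return 2.0 * n0 * t_scaled * gen_years


# Placeholder inputs (assumptions, not values from any real run).
theta0 = 0.05      # per-bin scaled mutation rate, hypothetical
mu = 2.5e-8        # per-site, per-generation mutation rate, hypothetical
gen_years = 25     # generation time in years, hypothetical

for label, n, t_max in [("left (-t15)", 64, 15.0), ("right (-t5)", 84, 5.0)]:
    bounds = interval_boundaries(n, t_max)
    oldest = to_years(bounds[-1], theta0, mu, gen_years)
    print(label, "last finite boundary ~", round(oldest), "years")
```

With theta0 held fixed, the last finite boundary scales linearly with -t, while the number of atomic intervals from -p controls how finely the range below it is divided. Since theta0 is itself estimated in each run, the actual numbers will differ from this sketch.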

linslsf commented 2 years ago

I am interested in learning how to best choose these (-t and -p) parameters too.

CaprimulgusG commented 1 year ago

Hi @plubbe, I don't suppose you got anywhere with parameter selection? I too have been trawling through the literature, but parameter selection still appears to be something of a dark art!