uqrmaie1 / admixtools

https://uqrmaie1.github.io/admixtools

Protocol Advice: Establishing a threshold for the number of allowed admixture events #56

Open santiago1234 opened 8 months ago

santiago1234 commented 8 months ago

First off, I just want to say a huge thanks for creating such an awesome tool. I thoroughly enjoyed the ADMIXTOOLS paper published in eLife and found it very informative.

My query pertains specifically to the protocol detailed in the paper, particularly the first point on Initial Scanning and Complexity Class Determination. The paper mentions: "The smallest number of admixture events that yields models with a (negative) LL score or an f-statistic residual lower than a certain threshold should be further investigated by running additional iterations of findGraphs." I'm interested in understanding how one should determine this threshold. Could you offer any advice or guidance on setting an effective threshold for the log-likelihood (LL) score to ascertain the minimum number of admixture events?

As an example, in the attached plot I ran find_graphs with 8 populations, varying the number of admixture events from 0 to 6; the y-axis shows the score of each graph.

[plot: qpgraph score of each explored graph vs. number of admixture events (0–6)]
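For reference, the scan was produced with roughly the following loop. This is a minimal sketch: it assumes the f2 statistics were precomputed with extract_f2() (the directory path is a placeholder), uses default find_graphs() settings, and assumes the results tibble has a score column, which matches the version I used.

```r
library(admixtools)
library(tidyverse)

# f2 statistics precomputed once for the 8 populations
# ("my_f2_dir" is a placeholder path)
f2_blocks = f2_from_precomp("my_f2_dir")

# One find_graphs() run per complexity class; in practice you would
# run many iterations per class and parallelize this
results = map_dfr(0:6, function(k) {
  find_graphs(f2_blocks, numadmix = k) %>%
    mutate(n_admix = k)
})

# Score of every explored graph by number of admixture events,
# which is roughly the plot attached above
ggplot(results, aes(factor(n_admix), score)) +
  geom_jitter(width = 0.2, alpha = 0.5) +
  labs(x = "number of admixture events", y = "qpgraph score")
```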

I appreciate any insights you could provide.

uqrmaie1 commented 7 months ago

I'm glad that you find it useful!

I'm interested in understanding how one should determine this threshold.

That's a good question, and unfortunately I don't have a great answer to it, but I can say a few things that are hopefully helpful:

To get the most out of these methods, it is necessary to develop a lot of experience by applying them to many different kinds of data, and to approach the model fitting process with a very critical mindset. My concern about any static protocol with fixed thresholds is that it might give people the false impression that following the protocol can substitute for looking at the data from 100 different angles with a very critical lens. At the same time, I see it as a big problem that the model fitting process is not a lot more objective, transparent, and accessible.

To get back to your question, instead of using the qpgraph LL-score to decide how many admixture events to include, you could use the more interpretable worst residual z-score.
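For a single fitted graph, that number can be obtained roughly as follows. This is a sketch, not the definitive interface: it assumes that qpgraph() called with return_fstats = TRUE returns the fitted f-statistics with their residual z-scores in $f3, which is how I remember it working, so check ?qpgraph for your version.

```r
library(admixtools)

# Refit a candidate graph and inspect the residuals
# (best_graph stands for the winning graph from a find_graphs() run)
fit = qpgraph(f2_blocks, best_graph, return_fstats = TRUE)

fit$score            # the (negative) log-likelihood score
max(abs(fit$f3$z))   # worst residual z-score across all f3-statistics
```

The usual rule of thumb in the qpGraph literature is that a worst residual below |z| = 3 indicates an acceptable fit, although with many populations, and therefore many f-statistics, some allowance for multiple testing is reasonable.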

theo-atkinson commented 5 months ago

Hi there,

To add to the advice requested above, how should the worst residual scores at each complexity class be compared? More specifically, should the mean scores be compared across complexity classes, or the scores of the best fitting graphs only?

My intuition suggests that the scores of the best-fitting graphs (lowest worst residuals) at each complexity class are more interpretable, as they demonstrate whether any graph topology can fit the data to a certain level.

In addition, how should a complexity class be interpreted if a large number of graphs equally achieve the best fit (and therefore share the same topology)? Would this suggest a complexity class that fits the data well, or perhaps that this complexity class is constraining potentially better-fitting topologies?

Any advice would be appreciated. Thanks for the amazing tool!

uqrmaie1 commented 5 months ago

should the mean scores be compared across complexity classes, or the scores of the best fitting graphs only

It makes more sense to look at the scores of the best-fitting graphs only. The majority of the returned models are just random stepping stones along the way to topologies with a better fit.
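Assuming the runs for the different complexity classes are stacked in a single tibble with the number of admixture events recorded in a column (as in the sketch further up in this thread), the comparison could look like this:

```r
library(tidyverse)

# Keep only the best-scoring graph per complexity class instead of
# averaging over all explored graphs
best_per_class = results %>%
  group_by(n_admix) %>%
  slice_min(score, n = 1, with_ties = FALSE) %>%
  ungroup()

best_per_class %>% select(n_admix, score)
```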

how should a complexity class be interpreted if a large number of graphs equally achieve the best fit?

If a large number of different graphs (with different topologies) achieve similar fits, that is an indication that there is not enough data to disambiguate between these models. In that case it can help to fit simpler models (with fewer admixture events) until you get to a point where the best-fitting models all share a similar topology.

But it sounds like you have something else in mind since you ask about a large number of graphs with the same topology. Are you running multiple iterations of find_graphs(), and you get the same best graph (with the same topology) in each iteration? If so, that would make it more likely that this graph is a good model, in particular if the next-best graphs with a different topology have a significantly worse fit.
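Whether the next-best topology fits significantly worse can be tested by refitting both graphs on bootstrap resamples of the SNP blocks. If I remember the interface correctly, it goes roughly like this (qpgraph_resample_multi() and compare_fits() are described in the tutorial; treat the exact arguments here as a sketch):

```r
library(admixtools)

# Fit both candidate topologies on the same bootstrap resamples
# (best_graph and runnerup_graph are placeholders for the two graphs)
fits = qpgraph_resample_multi(f2_blocks,
                              list(best_graph, runnerup_graph),
                              nboot = 100)

# A small p-value suggests the runner-up topology really fits worse
compare_fits(fits[[1]]$score_test, fits[[2]]$score_test)
```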

If the best graphs all have a similar score or worst residual, but the scores or worst residuals indicate that the fit is poor, then you could increase the number of admixture events.

I haven't seen examples with more than 3 or 4 admixture events where there is a clearly best fitting model, and no substantially different model can be found that has a similar score or worst residual, so I think it's a good idea to stick to fewer admixture events. However, it's possible, and probably common, that for the populations that you study, any model with a small number of admixture events is oversimplifying, which may result in a poor fit.

As you increase the number of admixture events, there isn't always a point where you go from no well-fitting model to a single well-fitting model. Instead, you may get multiple, very different models that all fit well. This suggests that there isn't enough data to fit an accurate admixture graph to these populations.