Suggestion: Change to p_emp calculation in compare_fits()

Hi Robert, I've been working on some admixture graph inference using find_graphs() following the suggested protocol in your eLife paper. In step 2, while comparing some of the best-fitting graphs at a fixed n_admix using the bootstrap test in compare_fits(), I was curious to see how the bootstrap p-value was calculated, and saw your note about truncating the p-value to avoid p_emp==0.

My suggestion is to change the way that p_emp is evaluated to be consistent with Phipson & Smyth 2010, which is effectively to adjust R/qpgraph.R:695-696 to be:

frac1 = (sum(scorediff > 0) + 1)/(length(scorediff) + 1)
frac2 = (sum(scorediff < 0) + 1)/(length(scorediff) + 1)

As the paper says, these are exact p-values for Monte Carlo tests like the bootstrap that will never evaluate to 0. (Well, probably not exact due to the same argument as in section 5 of the paper, but at least closer to the appropriate size than the unbiased estimator that can evaluate to 0.)

Thanks for all your hard work on ADMIXTOOLS2! It's incredibly useful, and (no offense intended) so much easier to work with than ADMIXTOOLS 1.

Best regards, Patrick Reilly

uqrmaie1 / admixtools

Suggestion: Change to p_emp calculation in compare_fits() #75