GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
In https://github.com/UW-GAC/analysis_pipeline assoc_plots, instead of hardcoding the threshold for p-value to 38.5, use .Machine$double.xmin to calculate the threshold that can be used. The current code is:
if ("stat" %in% names(assoc)) {
extreme <- abs(assoc$stat) > 38.5
if (any(extreme)) assoc$pval[extreme] <- 5e-324
}
5e-324 was probably chosen based on this information from ?.Machine:
Note that on most platforms smaller positive values than
‘.Machine$double.xmin’ can occur. On a typical R platform the
smallest positive double is about ‘5e-324’.
but it's most reliable to use the value in .Machine. The code would then change to:
if ("stat" %in% names(assoc)) {
extreme <- assoc$stat^2 > qchisq(.Machine$double.xmin, df=1, lower.tail=F)
if (any(extreme)) assoc$pval[extreme] <- .Machine$double.xmin
}
A p-value of 5e-324 and eg 2.225074e-308 (from .Machine$double.xmin on my machine) are basically the same, anyway.
Instead of making this change in analysis_pipeline, better to make it in GENESIS. Any function calling pchisq should then filter extremely low p-values using the above code.
In https://github.com/UW-GAC/analysis_pipeline assoc_plots, instead of hardcoding the threshold for p-value to 38.5, use
.Machine$double.xmin
to calculate the threshold that can be used. The current code is:5e-324 was probably chosen based on this information from ?.Machine:
but it's most reliable to use the value in .Machine. The code would then change to:
A p-value of 5e-324 and eg 2.225074e-308 (from .Machine$double.xmin on my machine) are basically the same, anyway.