Support for higher precision p-values, new build & test frameworks

welchr commented 2 years ago

Purpose

This PR adds support for higher-precision p-values during stepwise conditional analysis. Internally, p-values are stored and sorted on the log scale. Output p-values are written on the original scale at arbitrary precision, or on the log scale (natural log) with --print-log-pvals.
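
To illustrate why storing and sorting on the log scale helps, here is a minimal Python sketch. It is not apex's actual code (which is C++), and the function name `format_log_pvalue`, the variant IDs, and the log p-values are all hypothetical. The idea is only that a natural-log p-value can be ranked and printed as a mantissa and exponent without ever calling exp(), which would underflow to 0 in double precision.

```python
import math

def format_log_pvalue(log_p: float, digits: int = 4) -> str:
    """Render exp(log_p) as 'm.mmmme-XXX' without computing exp(log_p)."""
    log10_p = log_p / math.log(10.0)          # change of base: ln -> log10
    exponent = math.floor(log10_p)            # power of ten
    mantissa = round(10.0 ** (log10_p - exponent), digits)  # in [1, 10]
    if mantissa >= 10.0:                      # rounding can push 9.99995 to 10.0
        mantissa /= 10.0
        exponent += 1
    return f"{mantissa:.{digits}f}e{exponent:d}"

# Hypothetical natural-log p-values for three variants.
log_pvals = {"var_a": -2000.0, "var_b": -1500.0, "var_c": -2500.0}

# On the original scale all three underflow to exactly 0.0 and cannot be ranked.
naive = {name: math.exp(lp) for name, lp in log_pvals.items()}

# On the log scale the ordering is preserved: most significant first.
ranked = sorted(log_pvals, key=log_pvals.get)

for name in ranked:
    print(name, format_log_pvalue(log_pvals[name]))
```

Because the logarithm is monotonic, sorting on log p-values gives the same order as sorting on the p-values themselves would, had they been representable.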

This PR also includes an earlier branch (from maybe a year ago) that adopted CMake and cget for running builds. That made it easier to incorporate GoogleTest for test cases and Rmath for the various log-scale statistical functions. The build system changes also make it possible to compile on macOS and to develop/debug with CLion.

A new --legacy-vcov option was added to support the previous vcov format, so that only one code branch needs to be maintained. I couldn't think of a good way to detect the old format by examining the file, so the analyst must know which format they are using.

Finally, a Dockerfile was added for building container images. These are useful for running test cases and compiling the distributable static Linux binary in a clean environment, and for remote development with CLion or VS Code. The images could also potentially be published to GitHub's container registry, so that people can use them without having to run the distributed binary, but that is not included in this PR.

New binary is attached here:

apex-0.2-alpha-124-4e979d86c38-x86_64-unknown-linux-gnu.tar.gz

Not all of these changes need to be kept in this PR of course. If there's anything I should change or pare out for a separate PR, let me know.

Changes

Backward incompatible changes

None

New features

- New --print-log-pvals option in apex meta for writing p-values in log scale (natural log)
- New --legacy-vcov option for supporting older vcov files generated prior to Nov 15th, 2020
- Added build support for CMake/cget
- Added GoogleTest framework
  - Test cases created for the parts of the program modified by the new features and bug fixes
- Added support for building Docker images
  - bin/docker_build_image.sh builds the Docker image
  - bin/make_linux.sh uses the Docker image to create the static Linux binary & tarball
- Support for building and compiling on macOS

Bug fixes

- Fixed p-values of 0 (underflow due to double precision) in output for:
  - apex meta --stepwise --use-marginal-pval
  - apex meta --stepwise --het (with or without marginal p-value mode)
  - apex meta --meta
  - apex cis (for OLS only)
- Fixed a rare edge case where covariance sign flips were sometimes missed in meta --stepwise analysis (the --stepwise --het analysis did not have this issue)
- Fixed a rare crash when retrieving the list of chromosome/contig names from the genotype file (9c3eee76cd34fd80cfe5ec02d3529aace5cb54c0)
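
As a rough illustration of the underflow behind the zero p-values listed above, the following Python sketch computes the natural log of a two-sided normal p-value directly, assuming a z-score test statistic. It is not the code in this PR, which relies on Rmath for log-scale functions. The asymptotic erfc series used here is a swapped-in stand-in, and the function names are made up for the example.

```python
import math

def log_erfc_asymptotic(a: float) -> float:
    """log(erfc(a)) for large a > 0, from the standard asymptotic series
    erfc(a) ~ exp(-a^2) / (a*sqrt(pi)) * (1 - 1/(2a^2) + 3/(2a^2)^2 - 15/(2a^2)^3)."""
    inv = 1.0 / (2.0 * a * a)
    series = 1.0 - inv + 3.0 * inv**2 - 15.0 * inv**3
    return -a * a - math.log(a * math.sqrt(math.pi)) + math.log(series)

def log_two_sided_pvalue(z: float) -> float:
    """Natural log of P(|Z| >= |z|) for a standard normal Z."""
    a = abs(z) / math.sqrt(2.0)
    p = math.erfc(a)                 # two-sided p-value on the original scale
    if p > 1e-300:                   # still comfortably representable
        return math.log(p)
    return log_erfc_asymptotic(a)    # p underflowed (or nearly): stay in log space

z = 60.0                                         # hypothetical, very large z-score
naive_p = math.erfc(abs(z) / math.sqrt(2.0))     # underflows to 0.0
log_p = log_two_sided_pvalue(z)                  # roughly -1804.3

print(naive_p, log_p)
```

The original-scale value prints as 0.0, while the log-scale value remains finite and can still be compared with other results.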

welchr commented 2 years ago

Marking as draft for now - going to try and additionally patch conditional_analysis_het(...).

welchr commented 1 year ago

PR moved to https://github.com/lin-lab/apex/pull/1.
