omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License
96 stars 22 forks source link

New flag to **not** sort fine-mapping results by PIP #93

Closed jdblischak closed 2 years ago

jdblischak commented 2 years ago

This PR addresses Issue #88 to make it easier to merge PolyFun results with SuSiE R object exported via --susie-outfile (https://github.com/omerwe/polyfun/issues/31#issuecomment-672239786). The new flag is called --no-sort-pip.

As I was implementing this new flag, I also added an automated test for the FINEMAP integration and re-organized the command-line flags.

Testing

Using the new flag --no-sort-pip, the PolyFun results are not sorted, and thus the PIP and CREDIBLE_SET columns are in the same order as the SuSiE object. This facilitates assigning SNP names to the SuSiE object.

python finemapper.py \
  --geno example_data/chr1 \
  --sumstats example_data/chr1.finemap_sumstats.txt.gz \
  --n 383290 \
  --chr 1 \
  --start 46000001 \
  --end 49000001 \
  --method susie \
  --max-num-causal 5 \
  --threads 1 \
  --verbose \
  --susie-outfile test.rds \
  --no-sort-pip \
  --out test.gz
library("susieR")

s <- readRDS("test.rds")
susie_get_pip(s)
##  [1] 1.00000000 0.06923779 0.01884478 0.25673413 0.38342421 0.97377487
##  [7] 0.06712863 0.99786140 0.12521417 0.01795261 0.16487797 0.12168893
## [13] 0.07373468 0.39374242 0.04849721
s$sets$cs
## $L1
## [1] 1
##
## $L2
## [1] 8
##
## $L3
## [1] 6

pf <- read.delim("test.gz")
pf$PIP
##  [1] 1.0000000 0.0692378 0.0188448 0.2567340 0.3834240 0.9737750 0.0671286
##  [8] 0.9978610 0.1252140 0.0179526 0.1648780 0.1216890 0.0737347 0.3937420
## [15] 0.0484972
pf$CREDIBLE_SET
##  [1] 1 0 0 0 0 3 0 2 0 0 0 0 0 0 0
which(pf$CREDIBLE_SET == 1)
## [1] 1
which(pf$CREDIBLE_SET == 2)
## [1] 8
which(pf$CREDIBLE_SET == 3)
## [1] 6

cor(s$pip, pf$PIP)
## [1] 1
all.equal(s$pip, pf$PIP)
## [1] "Mean relative difference: 4.471441e-07"

The default behavior is unchanged. The results will be sorted by PIP. However, a warning is logged if a user specifies --susie-outfile but not --no-sort-pip.

python finemapper.py \
  --geno example_data/chr1 \
  --sumstats example_data/chr1.finemap_sumstats.txt.gz \
  --n 383290 \
  --chr 1 \
  --start 46000001 \
  --end 49000001 \
  --method susie \
  --max-num-causal 5 \
  --threads 1 \
  --verbose \
  --susie-outfile test.rds \
  --out test-default.gz
## [WARNING]  --susie-outfile was set but not --no-sort-pip. This will make it difficult to assign SNP names to the SuSiE R object

zcat test-default.gz | cut -f10,14 | head -n 4
## PIP  CREDIBLE_SET
## 1.00000e+00  1
## 9.97861e-01  2
## 9.73775e-01  3
omerwe commented 2 years ago

This is truly awesome, thanks @jdblischak!