omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

Scaled prior variance should be no greater than 1 when standardize = TRUE #186

Closed Y-Isaac closed 5 months ago

Y-Isaac commented 5 months ago

Hi,

I apologize for asking multiple times. When I use HESS to estimate the variance of causal effect sizes, most regions run normally, but some fail with the following error: "Error in init_setup(0, p, L, scaled_prior_variance, residual_variance, : Scaled prior variance should be no greater than 1 when standardize = TRUE". The same region runs normally with the built-in estimator in SuSiE. Below is my complete log file, which I hope will be helpful to you:

[INFO] Loading sumstats file...
[INFO] Loaded sumstats for 175455 SNPs in 1.46 seconds
[INFO] cffi mode is CFFI_MODE.ANY
[DEBUG] Looking for R home with: R RHOME
[INFO] R home found: /public/software/anaconda3/envs/polyfun/lib/R
[DEBUG] Looking for LD_LIBRARY_PATH with: /public/software/anaconda3/envs/polyfun/lib/R/bin/Rscript -e cat(Sys.getenv("LD_LIBRARY_PATH"))
[INFO] R library path: /opt/gridview//pbs/dispatcher/lib::/usr/local/lib64:/usr/local/lib
[INFO] LD_LIBRARY_PATH: /opt/gridview//pbs/dispatcher/lib::/usr/local/lib64:/usr/local/lib
[DEBUG] cffi mode is InterfaceType.API
[INFO] Default options to initialize R: rpy2, --quiet, --no-save
[INFO] R is already initialized. No need to initialize.
[INFO] Computing LD from plink fileset chr22 chromosome 22 region 23294808-26294808
Mapping files: 100%|██████████| 3/3 [00:02<00:00, 1.35it/s]
[INFO] Found 15152 SNPs in target region. Computing LD in 4 chunks...
100%|██████████| 4/4 [01:44<00:00, 26.07s/it]
[INFO] Done in 108.31 seconds
[INFO] Flipping the effect-sign of 7975 SNPs that are flipped compared to the LD panel
[INFO] Excluding SNPs with heritability less than 5.0000e-05 from the HESS estimation
[INFO] Average local SNP heritability estimated by modified HESS over 100 iterations: 2.1743e+01
[WARNING] The HESS estimator is unconstrained, and the estimate is an order of magnitude greater than the expected max of 1. Use with caution
[INFO] HESS estimated causal effect size variance: 2.1743e+00
[INFO] Starting functionally-informed SuSiE fine-mapping for chromosome 22 BP 23294808-26294808 (14765 SNPs)
[INFO] Using susieR::susie_suff_stat()
[WARNING] R[write to console]: For large R or large XtX, consider installing the Rfast package for better performance.

[WARNING] R[write to console]: Error in init_setup(0, p, L, scaled_prior_variance, residual_variance, : Scaled prior variance should be no greater than 1 when standardize = TRUE

Traceback (most recent call last):
  File "/public/home/P202306/software/polyfun/finemapper.py", line 1275, in <module>
    df_finemap = finemap_obj.finemap(locus_start=args.start, locus_end=args.end, num_causal_snps=args.max_num_causal,
  File "/public/home/P202306/software/polyfun/finemapper.py", line 835, in finemap
    susie_obj = self.susieR.susie_suff_stat(
  File "/public/software/anaconda3/envs/polyfun/lib/python3.8/site-packages/rpy2/robjects/functions.py", line 208, in __call__
    return (super(SignatureTranslatedFunction, self)
  File "/public/software/anaconda3/envs/polyfun/lib/python3.8/site-packages/rpy2/robjects/functions.py", line 131, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
  File "/public/software/anaconda3/envs/polyfun/lib/python3.8/site-packages/rpy2/rinterface_lib/conversion.py", line 45, in _
    cdata = function(*args, **kwargs)
  File "/public/software/anaconda3/envs/polyfun/lib/python3.8/site-packages/rpy2/rinterface.py", line 817, in __call__
    raise embedded.RRuntimeError(_rinterface._geterrmessage())
rpy2.rinterface_lib.embedded.RRuntimeError: Error in init_setup(0, p, L, scaled_prior_variance, residual_variance, : Scaled prior variance should be no greater than 1 when standardize = TRUE

omerwe commented 5 months ago

@Y-Isaac thanks for flagging this. As the warning message says, "The HESS estimator is unconstrained, and the estimate is an order of magnitude greater than the expected max of 1". The HESS estimator is a reasonable estimator in most cases, but like every estimator it has its pitfalls.

I can't fix this, but I updated the code to emit a clearer error message in this case. Sorry I can't provide more help; hopefully this will help future users.
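For readers who want to catch this condition before the call into R, here is a minimal sketch of such a guard. It is not PolyFun's actual code: the function name `check_scaled_prior_variance` and the wording of the message are hypothetical, and the only rule it encodes is the one quoted in the susieR error above (the scaled prior variance must not exceed 1 when `standardize = TRUE`).

```python
def check_scaled_prior_variance(prior_var: float, standardize: bool = True) -> float:
    """Return prior_var unchanged if it is usable, else raise a descriptive error.

    Hypothetical helper: mirrors the constraint reported by susieR's
    init_setup(), so the failure surfaces in Python with more context.
    """
    if standardize and prior_var > 1.0:
        raise ValueError(
            f"Estimated causal effect size variance {prior_var:.4e} is greater "
            "than 1, which SuSiE rejects when standardize = TRUE. The HESS "
            "estimator is unconstrained and may be unreliable for this locus."
        )
    return prior_var


# The value from the log above (2.1743e+00) would be rejected here.
try:
    check_scaled_prior_variance(2.1743)
except ValueError as err:
    print(err)
```

Raising early keeps the message next to the estimate that caused it, rather than inside an rpy2 traceback.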

Y-Isaac commented 5 months ago

@omerwe Hi,

I see the update. I am considering using the built-in estimator in SuSiE for the loci where HESS fails, and HESS everywhere else. Do you think it is reasonable for two types of causal effect size variance estimators to coexist in one analysis? It would be a pity to simply give up on these loci, so this seems like a workable workaround.

omerwe commented 5 months ago

@Y-Isaac yes, you can definitely do that. Overall, the prior variance usually doesn't make a big difference in well-behaved loci, and in my opinion it's better to get results from all loci than to give up on some of them, as long as you remember that some of the loci require extra scrutiny.
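The per-locus fallback discussed above could be sketched roughly as follows. This is an illustration under stated assumptions, not part of PolyFun: `PriorChoice` and `choose_prior` are hypothetical names, the second locus and its value are made up for the example, and how you actually tell the fine-mapper to use SuSiE's own estimator depends on the options of the tool you run.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PriorChoice:
    estimator: str              # "hess" or "susie"
    prior_var: Optional[float]  # None means: let SuSiE estimate it
    needs_scrutiny: bool        # flag loci where HESS was not usable


def choose_prior(hess_var: Optional[float], max_var: float = 1.0) -> PriorChoice:
    """Use the HESS estimate when it is in (0, max_var], else fall back to SuSiE."""
    if hess_var is not None and 0.0 < hess_var <= max_var:
        return PriorChoice("hess", hess_var, needs_scrutiny=False)
    return PriorChoice("susie", None, needs_scrutiny=True)


# First entry uses the HESS value from the log above; the second is hypothetical.
hess_estimates: Dict[str, float] = {
    "chr22:23294808-26294808": 2.1743,
    "hypothetical_locus_b": 0.0012,
}
choices = {locus: choose_prior(v) for locus, v in hess_estimates.items()}

for locus, choice in choices.items():
    print(locus, choice.estimator, choice.needs_scrutiny)
```

Recording which estimator was used for each locus makes it easy to revisit the fallback loci later, in line with the advice that they need extra scrutiny.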

Y-Isaac commented 5 months ago

@omerwe You've helped me a lot! Thank you sincerely, and have a nice day!