precimed / mixer

Causal Mixture Model for GWAS summary statistics
GNU General Public License v3.0

RuntimeError: runtime_error: arg <= 0 #71

Open 1999yangyi opened 1 year ago

1999yangyi commented 1 year ago

Does anyone know why this error occurs and how to fix it?

Traceback (most recent call last):
  File "/home/lilab/software/mixer-master/precimed/mixer.py", line 23, in <module>
    args.func(args)
  File "/home/lilab/software/mixer-master/precimed/mixer/cli.py", line 647, in execute_fit1_or_test1_parser
    results = init_results_struct(libbgmg, args)
  File "/home/lilab/software/mixer-master/precimed/mixer/cli.py", line 639, in init_results_struct
    results['options']['sum_weights'] = float(np.sum(libbgmg.weights))
  File "/home/lilab/software/mixer-master/precimed/mixer/libbgmg.py", line 168, in weights
    return self._get_vec_impl(self.cdll.bgmg_retrieve_weights, np.float32, self.num_tag, trait=None)
  File "/home/lilab/software/mixer-master/precimed/mixer/libbgmg.py", line 410, in _get_vec_impl
    self._check_error(func(*args))
  File "/home/lilab/software/mixer-master/precimed/mixer/libbgmg.py", line 418, in _check_error
    raise RuntimeError(self.get_last_error())
RuntimeError: runtime_error: arg <= 0

kylechengtn commented 11 months ago

Hi, I have been getting exactly the same error message. Have you had any luck troubleshooting this? Your experience and help would be much appreciated!

alicebraun commented 5 months ago

Has anyone fixed this successfully yet? I'm having the same issue.

ricanney commented 3 months ago

RuntimeError arg <=0

I encountered this issue when running the fit1 process, with the script bombing quite early (line 500ish).

20240414 16:27:53.839157     set_option(diag=0); 
20240414 16:27:53.839217     diag: num_snp_=24814321
20240414 16:27:53.839238     diag: num_tag_=0
20240414 16:27:53.839262     diag: LdMatrixCsr 23 chunks in total. Logging futher info for non-empty chunks only.
20240414 16:27:53.839289     diag: zvec1_.size()=0
20240414 16:27:53.839315     diag: zvec1_=[], nnz=0
20240414 16:27:53.839337     diag: nvec1_.size()=0
20240414 16:27:53.839358     diag: nvec1_=[], nnz=0
20240414 16:27:53.839379     diag: causalbetavec1_.size()=0
20240414 16:27:53.839401     diag: causalbetavec1_=[], nnz=0
20240414 16:27:53.839422     diag: zvec2_.size()=0
20240414 16:27:53.839444     diag: zvec2_=[], nnz=0
20240414 16:27:53.839465     diag: nvec2_.size()=0
20240414 16:27:53.839486     diag: nvec2_=[], nnz=0
20240414 16:27:53.839507     diag: causalbetavec2_.size()=0
20240414 16:27:53.839529     diag: causalbetavec2_=[], nnz=0
20240414 16:27:53.839549     diag: weights_.size()=0
20240414 16:27:53.839571     diag: weights_=[], nnz=0
20240414 16:27:53.839592     diag: mafvec_.size()=24814321
20240414 16:27:53.866042     diag: mafvec_=[1, 0.999006, 1, 0.893638, 0.999006, ...], nnz=24814321
20240414 16:27:53.866077     diag: options.k_max_=20000
20240414 16:27:53.866097     diag: options.use_complete_tag_indices_=0
20240414 16:27:53.866117     diag: options.disable_snp_to_tag_map_=0
20240414 16:27:53.866136     diag: options.max_causals_=100000
20240414 16:27:53.866155     diag: options.num_components_=1
20240414 16:27:53.866174     diag: options.r2_min_=0
20240414 16:27:53.866196     diag: options.z1max_=1e+10
20240414 16:27:53.866216     diag: options.z2max_=1e+10
20240414 16:27:53.866236     diag: options.cost_calculator_=0 (Sampling)
20240414 16:27:53.866259     diag: options.aux_option_=1 (Ezvec2)
20240414 16:27:53.866281     diag: options.cache_tag_r2sum_=no
20240414 16:27:53.866300     diag: options.seed_=123
20240414 16:27:53.866319     diag: options.cubature_abs_error_=0
20240414 16:27:53.866339     diag: options.cubature_rel_error_=1e-05
20240414 16:27:53.866359     diag: options.cubature_max_evals_=1000
20240414 16:27:53.866378     diag: options.calc_k_pdf_=0
20240414 16:27:53.866398     diag: options.ld_format_version_=-1
20240414 16:27:53.866417     diag: options.retrieve_ld_sum_type_=0
20240414 16:27:53.866436     diag: Estimated memory usage (total): 0 bytes
20240414 16:27:53.866597     retrieve_mafvec()
20240414 16:27:53.931135     retrieve_mafvec()
**20240414 16:27:54.108714   runtime_error:  arg <= 0**

There may be different causes of this error, but I found a solution by harmonising my reference data with the sumstats I was using. My initial (errored) approach was to create the reference .ld and .snps files from b37-1000g-eur binaries that contained ~20M+ SNPs. My aim was to create a reusable set of reference files for use across multiple datasets. My "FAILED" analysis came when applying these to sumstats which contained only ~4M+ SNPs.

My thought was that the "runtime_error: arg <= 0" error was due to the script using reference data and SNP sets with no overlapping SNPs between the .snps and .sumstats files. In this scenario the reference provided 15M+ SNPs in the .ld and .snps reference data that were not present in my sumstats. If I understand the process correctly, the thinning process randomly (via a seed) selects SNPs that capture the variation in an LD block, so it is possible to end up with blocks containing 0 SNPs. This was probably less of an issue when using a w_hm3 dataset of ~1.3M SNPs.
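A quick way to test this hypothesis before launching fit1 is to count how many reference SNP IDs actually appear in the sumstats. The sketch below is only an illustration: the helper names are mine, and it assumes a .snps file with one SNP ID per line and a tab-delimited sumstats file with a header column named SNP. Adjust both assumptions to match your files.

```python
import csv
import io


def read_snp_ids(handle):
    """Read one SNP ID per line, skipping blank lines (assumed .snps layout)."""
    return {line.strip() for line in handle if line.strip()}


def read_sumstats_ids(handle, snp_column="SNP"):
    """Read SNP IDs from a tab-delimited sumstats file (assumed column name)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[snp_column] for row in reader if row.get(snp_column)}


def overlap_report(reference_ids, sumstats_ids):
    """Summarise how many reference SNPs are also present in the sumstats."""
    shared = reference_ids & sumstats_ids
    covered = len(shared) / len(reference_ids) if reference_ids else 0.0
    return {
        "n_reference": len(reference_ids),
        "n_sumstats": len(sumstats_ids),
        "n_shared": len(shared),
        "frac_reference_covered": covered,
    }


# Tiny in-memory stand-ins for real files; use open(path) in practice.
snps_file = io.StringIO("rs1\nrs2\nrs3\nrs4\n")
sumstats_file = io.StringIO(
    "SNP\tA1\tA2\tZ\tN\n"
    "rs2\tA\tG\t1.2\t1000\n"
    "rs9\tC\tT\t-0.3\t1000\n"
)

report = overlap_report(read_snp_ids(snps_file), read_sumstats_ids(sumstats_file))
print(report)
```

If n_shared is very small relative to n_reference, a selected subset of reference SNPs could plausibly contain none of the sumstats SNPs, which would be consistent with the num_tag_=0 line in the log above.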

My local pipeline initially included removal of ambiguous SNPs (gt = W|S, i.e. A/T or C/G), harmonising rsids, and removal of regions of high LD (b37 coordinates from https://genome.sph.umich.edu/wiki/Regions_of_high_linkage_disequilibrium_(LD)). This was applied to the sumstats but not to the binaries used to create the reference files.
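For anyone wanting to reproduce these filters, here is a rough sketch of the two variant-level checks. It is not my actual pipeline: the function names and the dict layout of a variant are made up for illustration, and the single region in HIGH_LD_REGIONS is a placeholder, so take the real b37 coordinates from the list linked above.

```python
# Strand-ambiguous allele pairs: A/T (IUPAC code W) and C/G (IUPAC code S).
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}

# Placeholder (chromosome, start, end) entry; replace with the published list.
HIGH_LD_REGIONS = [
    (6, 25_000_000, 35_000_000),
]


def is_strand_ambiguous(a1, a2):
    """True when the two alleles form an A/T or C/G pair."""
    return frozenset((a1.upper(), a2.upper())) in AMBIGUOUS_PAIRS


def in_high_ld_region(chrom, pos, regions=HIGH_LD_REGIONS):
    """True when the position falls inside any listed region (inclusive)."""
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def keep_variant(variant, regions=HIGH_LD_REGIONS):
    """Keep a variant only if it passes both filters."""
    if is_strand_ambiguous(variant["A1"], variant["A2"]):
        return False
    return not in_high_ld_region(variant["CHR"], variant["BP"], regions)


variants = [
    {"SNP": "rs1", "CHR": 1, "BP": 1_000_000, "A1": "A", "A2": "G"},
    {"SNP": "rs2", "CHR": 1, "BP": 2_000_000, "A1": "A", "A2": "T"},
    {"SNP": "rs3", "CHR": 6, "BP": 30_000_000, "A1": "C", "A2": "T"},
]
kept = [v["SNP"] for v in variants if keep_variant(v)]
print(kept)
```

The point of writing the filter as one function is that the same keep_variant can be run over both the sumstats and the variant list of the reference binaries, so the two cannot drift apart.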

I re-wrote my local pipeline to select overlapping SNPs and to apply the processing to both the sumstats and the binaries before creating the .ld / .snps files. This means that any markers in the *.snps replicates will also be in the sumstats file, removing the 15M+ non-overlapping SNPs (and hopefully leaving SNPs present in each block).
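The harmonisation step can be sketched roughly as below. This is a simplified illustration rather than my real code: harmonise and its arguments are hypothetical names, variants are plain dicts keyed by an assumed SNP field, and the keep argument stands in for whatever QC filters you apply.

```python
def harmonise(sumstats_rows, reference_rows, keep=lambda row: True, id_key="SNP"):
    """Apply the same filter to both inputs, then keep only shared SNP IDs."""
    sumstats_kept = {row[id_key]: row for row in sumstats_rows if keep(row)}
    reference_kept = {row[id_key]: row for row in reference_rows if keep(row)}
    shared_ids = sorted(sumstats_kept.keys() & reference_kept.keys())
    return (
        [sumstats_kept[snp] for snp in shared_ids],
        [reference_kept[snp] for snp in shared_ids],
    )


sumstats = [
    {"SNP": "rs1", "Z": 1.2},
    {"SNP": "rs2", "Z": -0.4},
    {"SNP": "rs5", "Z": 0.9},
]
reference = [
    {"SNP": "rs1"},
    {"SNP": "rs2"},
    {"SNP": "rs3"},
    {"SNP": "rs4"},
]

# Example filter: drop rs2, standing in for a QC rule applied to both inputs.
sumstats_out, reference_out = harmonise(
    sumstats, reference, keep=lambda row: row["SNP"] != "rs2"
)
shared = [row["SNP"] for row in reference_out]
print(shared)
```

The resulting ID list can then be used to subset the reference binaries (for example with PLINK's --extract) before the .ld / .snps files are generated, so every reference marker has a matching sumstats row.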

...and now the script works :)