large-scale-gxe-methods / GEM


Error in subprocessing #4

Closed. shinichinamba closed this issue 3 years ago

shinichinamba commented 3 years ago

Hello,

I am excited to use your software, but I am having difficulty getting it running. I got an error (segmentation fault) after GEM divided my BGEN file into blocks. I rebuilt GEM with the -g option, ran it in gdb, and got the following message. Could you let me know how to deal with this problem?

Any help would be appreciated.

Here Here


Welcome to GEM v1.1 (C) 2018-2020 Liang Hong, Han Chen, Duy Pham GNU General Public License v3


...

Continuous or Binary? Continuous Robust or Non-Robust Analysis? Robust

...


Starting GWAS...

Precalculations and fitting null model...
Execution time... 277 ms
Done.


Streaming SNPs for speeding up GWAS analysis in parallel.
Number of SNPs in each batch is: 20

Detected 40 available thread(s)...
Using 5 for multithreading...

Dividing BGEN file into 5 block(s)...
Execution time... 9s, 67 ms
Done.


Running multithreading...
[New Thread 0x7fffe7ba2700 (LWP 1538)]
[New Thread 0x7fffe73a1700 (LWP 1539)]
[New Thread 0x7fffe5250700 (LWP 1540)]
[New Thread 0x7fffe4a4f700 (LWP 1541)]
[New Thread 0x7fffdcd5f700 (LWP 1542)]
Thread 1 finished in 10m, 5s, 912 ms

Thread 3 "GEM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe73a1700 (LWP 1539)]
boost::detail::(anonymous namespace)::tls_destructor (data=0x6e2f70) at libs/thread/src/pthread/thread.cpp:88
88	(*current_node->func)();

Output files were empty except for 1 temporary file.

total 24M
-rwxrwxrwx 1 user user   0 Oct 26 16:26 chr10_10.txt_bin_0.tmp
-rwxrwxrwx 1 user user 12M Oct 26 16:36 chr10_10.txt_bin_1.tmp
-rwxrwxrwx 1 user user   0 Oct 26 16:26 chr10_10.txt_bin_2.tmp
-rwxrwxrwx 1 user user   0 Oct 26 16:26 chr10_10.txt_bin_3.tmp
-rwxrwxrwx 1 user user   0 Oct 26 16:26 chr10_10.txt_bin_4.tmp

When I used only 1 thread, I still got the same segmentation fault.

My environment is CentOS7, gcc 8.3.1, GNU make 4.2.1.
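The gdb message above shows only the innermost frame. A full backtrace of every thread is usually more informative for a crash like this. The following is a minimal sketch, not part of GEM: `backtrace_all_threads` is a hypothetical helper name, and it assumes gdb is on the PATH and the binary was built with -g.

```shell
#!/bin/sh
# Run a program under gdb in batch mode and print a backtrace of every
# thread once the program stops (for example on SIGSEGV).
# GDB may be overridden to point at another gdb binary; it defaults to "gdb".
backtrace_all_threads() {
    "${GDB:-gdb}" -batch \
        -ex run \
        -ex 'thread apply all bt' \
        --args "$@"
}

# Example call (replace the arguments with your usual GEM command line):
# backtrace_all_threads ./GEM --pheno-file example.pheno --bgen example.bgen
```

Everything after the function name is passed to the program unchanged, so the same command line that crashes can be reused as-is.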

duytpm16 commented 3 years ago

Hello,

Can you provide some details on the processor used and boost version? Also, did the example dataset run ok?

shinichinamba commented 3 years ago

Thank you for your reply.

My processor is an Intel Xeon Silver 4210 CPU @ 2.20GHz. I reproduced the error on another PC with an Intel Xeon CPU E5-2687w v4 @ 3.00GHz.

My boost version is 1.74.0.

Unfortunately, the example dataset did not run either; it also ended with a segmentation fault. I have attached the commands and outputs below.

(base) [conda@DESKTOP-IE1CMKO example]$ ../src/GEM --pheno-file example.pheno --bgen example.bgen --sampleid-name sampleid --pheno-name pheno2 --exposure-names cov1 --covar-names cov2 cov3 --pheno-type 1 --robust 1 --missing-value NaN
Here Here


Welcome to GEM v1.1 (C) 2018-2020 Liang Hong, Han Chen, Duy Pham GNU General Public License v3


The Phenotype File is: example.pheno
The Selected Phenotype is: pheno2
Continuous or Binary? Binary
Robust or Non-Robust Analysis? Robust

The Total Number of Selected Covariates is: 2
The Selected Covariates are: cov2 cov3
No Interaction Covariates Selected
The Total Number of Exposures is: 1
The Selected Exposures are: cov1

Logistic Convergence Threshold: 1e-07
Minor Allele Frequency Threshold: 0.001
Number of Threads: 20
Output File: gem.out


Before ID Matching and checking missing values...
Size of the phenotype vector is: 500 X 1
Size of the selected covariate matrix (including first column for interception values) is: 500 X 4
End of reading phenotype and covariate data.


General information of BGEN file.
Number of variants: 1000
Number of samples: 500
Genotype Block Compression Type: Zlib
Layout: 2
Sample Identifiers Present: True


After processes of sample IDMatching and checking missing values, the sample size changes from 500 to 250.

Sample IDMatching and checking missing values processes have been completed. New pheno and covariate data vectors with the same order of sample ID sequence of geno data are updated.


Starting GWAS...

Precalculations and fitting null model...
Logistic regression reaches convergence after 5 steps...
Execution time... 467 ms
Done.


Streaming SNPs for speeding up GWAS analysis in parallel.
Number of SNPs in each batch is: 1

Detected 40 available thread(s)...
Using 20 for multithreading...

Dividing BGEN file into 20 block(s)...
Execution time... 662 ms
Done.


Running multithreading...
Thread 0 finished in 533 ms
Segmentation fault

(base) [conda@DESKTOP-IE1CMKO example]$ ls
example.bgen   gem.out_bin_0.tmp   gem.out_bin_11.tmp  gem.out_bin_13.tmp  gem.out_bin_2.tmp  gem.out_bin_4.tmp  gem.out_bin_6.tmp  gem.out_bin_8.tmp  my_example.out
example.pheno  gem.out_bin_10.tmp  gem.out_bin_12.tmp  gem.out_bin_1.tmp   gem.out_bin_3.tmp  gem.out_bin_5.tmp  gem.out_bin_7.tmp  gem.out_bin_9.tmp

(base) [conda@DESKTOP-IE1CMKO example]$ ls -lh
total 376K
-rwxrwxr-x 1 conda conda 241K Oct 22 11:53 example.bgen
-rwxrwxr-x 1 conda conda  16K Oct 22 11:53 example.pheno
-rw-rw-r-- 1 conda conda 4.6K Oct 27 12:20 gem.out_bin_0.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_10.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_11.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_12.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_13.tmp
-rw-rw-r-- 1 conda conda 4.6K Oct 27 12:20 gem.out_bin_1.tmp
-rw-rw-r-- 1 conda conda 4.8K Oct 27 12:20 gem.out_bin_2.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_3.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_4.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_5.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_6.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_7.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_8.tmp
-rw-rw-r-- 1 conda conda    0 Oct 27 12:20 gem.out_bin_9.tmp
-rwxrwxr-x 1 conda conda  92K Oct 22 11:53 my_example.out

(base) [conda@DESKTOP-IE1CMKO example]$ wc -l *.tmp
  50 gem.out_bin_0.tmp
   0 gem.out_bin_10.tmp
   0 gem.out_bin_11.tmp
   0 gem.out_bin_12.tmp
   0 gem.out_bin_13.tmp
  50 gem.out_bin_1.tmp
  50 gem.out_bin_2.tmp
   0 gem.out_bin_3.tmp
   0 gem.out_bin_4.tmp
   0 gem.out_bin_5.tmp
   0 gem.out_bin_6.tmp
   0 gem.out_bin_7.tmp
   0 gem.out_bin_8.tmp
   0 gem.out_bin_9.tmp
 150 total

(base) [conda@DESKTOP-IE1CMKO example]$ head -n 3 *.tmp
==> gem.out_bin_0.tmp <==
SNP1 rs1 1 A C 250 0.366 -0.163631 0.0361341 0.0884386 0.177135 0.389344 0.833566 0.675318
SNP2 rs2 2 A C 250 0.4 0.108896 0.0308642 0.182689 0.135988 0.535359 0.620314 0.729922
SNP3 rs3 3 A C 250 0.372 -0.319095 0.0347185 -0.164599 0.161752 0.0867977 0.682347 0.212218

==> gem.out_bin_10.tmp <==

==> gem.out_bin_11.tmp <==

==> gem.out_bin_12.tmp <==

==> gem.out_bin_13.tmp <==

==> gem.out_bin_1.tmp <==
SNP51 rs51 51 A C 250 0.368 0.160221 0.0334939 -0.46305 0.153878 0.381323 0.237831 0.339623
SNP52 rs52 52 A C 250 0.344 -0.126619 0.0339306 0.23611 0.154734 0.491837 0.548349 0.65942
SNP53 rs53 53 A C 250 0.196 0.194466 0.0517125 -0.440971 0.232424 0.392465 0.36036 0.456592

==> gem.out_bin_2.tmp <==
SNP101 rs101 101 A C 250 0.396 -0.00532816 0.0333808 -0.418959 0.149563 0.976735 0.278664 0.555869
SNP102 rs102 102 A C 250 0.38 0.0898783 0.0319697 -0.15199 0.163362 0.615194 0.706884 0.821154
SNP103 rs103 103 A C 250 0.162 0.356751 0.0620883 0.653157 0.287526 0.152222 0.22319 0.17088

==> gem.out_bin_3.tmp <==

==> gem.out_bin_4.tmp <==

==> gem.out_bin_5.tmp <==

==> gem.out_bin_6.tmp <==

==> gem.out_bin_7.tmp <==

==> gem.out_bin_8.tmp <==

==> gem.out_bin_9.tmp <==
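
The ls, wc and head checks above can be condensed into one summary of which per-thread temporary files were actually written. This is only a sketch: `summarize_tmp_files` is a hypothetical helper, and the `*_bin_*.tmp` pattern is inferred from the file names shown in this thread rather than from GEM's documentation.

```shell
#!/bin/sh
# Print one tab-separated line per temporary file: name, line count, status.
# The directory to scan is the first argument (default: current directory).
summarize_tmp_files() {
    dir=${1:-.}
    for f in "$dir"/*_bin_*.tmp; do
        [ -e "$f" ] || continue              # no match: the glob stays literal
        lines=$(wc -l < "$f" | tr -d ' ')    # strip padding some wc builds add
        if [ -s "$f" ]; then status=written; else status=EMPTY; fi
        printf '%s\t%s\t%s\n' "$(basename "$f")" "$lines" "$status"
    done
}

# Example call from the directory that holds the output files:
# summarize_tmp_files .
```

Files marked EMPTY belong to blocks whose thread wrote nothing before the crash, which helps narrow down where the failure occurred.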

duytpm16 commented 3 years ago

Thank you very much for the information.

I can't seem to reproduce the error. I am testing on several processors, including an Intel Xeon CPU E5-2687w v4 @ 3.00GHz (with SUSE Linux), and running additional tests with the CentOS 7 Docker image.

Are you able to compile GEM from the dev branch and run it with a single thread? If this works, and everything in your makefile is exactly the same as on the master branch, then I think there is a bug with the boost_thread library.

git clone https://github.com/large-scale-gxe-methods/GEM
cd GEM/src/
git checkout dev
make

shinichinamba commented 3 years ago

Thank you for your prompt reply.

The dev branch version of GEM ran successfully with a single thread.

I investigated the cause and found that there was a second boost library in /usr/lib on my computer. I suspected that my manually installed boost library was conflicting with this one and causing the error. Indeed, after I reinstalled the boost library and specified it with the -L option in the makefile, GEM ran successfully with multiple threads.
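
A quick way to look for this kind of conflict is to list every libboost_thread shared object in the usual library directories and count how many distinct directories hold one. This is a minimal sketch with hypothetical helper names, and the directories in the example call are assumptions, not paths confirmed in this thread (apart from /usr/lib).

```shell
#!/bin/sh
# List every libboost_thread shared object found under the given directories.
find_boost_thread_libs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 2 -name 'libboost_thread*.so*' 2>/dev/null
    done
}

# Count the distinct directories that contain a libboost_thread library.
# A result above 1 suggests that more than one Boost install is present.
count_boost_dirs() {
    find_boost_thread_libs "$@" |
        while IFS= read -r lib; do dirname "$lib"; done |
        sort -u | wc -l | tr -d ' '
}

# Example call (adjust the directories to your system):
# count_boost_dirs /usr/lib /usr/lib64 /usr/local/lib "$HOME/boost/lib"
```

If more than one directory shows up, `ldd ../src/GEM | grep boost` shows which copy the built binary actually loads at run time. Pointing the link step at the intended copy with -L, as described above, and checking again with ldd is one way to confirm that the headers used at compile time and the library loaded at run time come from the same Boost version.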

I appreciate all your help, and thank you again.