gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License

Glmnet mex errors for large datasets #32

Open atuldeshpande opened 5 years ago

atuldeshpande commented 5 years ago

A significant percentage of our jobs end in a segmentation violation when glmnet is called with the params.family = 'poisson' option (example at the bottom). We may need to involve the glmnet maintainer at Stanford at some point.
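To put a number on "significant percentage", the job error logs can be scanned for the crash banner. The following is a minimal sketch, not part of SINGE: the function name `crash_fraction` is hypothetical, and the `logs` directory and `*.err` pattern are assumptions based on the log file name that appears in the dump in this issue.

```python
from pathlib import Path

# Banner text that MATLAB prints at the top of the crash dump.
CRASH_MARKER = "Segmentation violation detected"


def crash_fraction(log_dir, pattern="*.err"):
    """Return (crashed, total) over the log files matching pattern in log_dir."""
    logs = sorted(Path(log_dir).glob(pattern))
    crashed = sum(CRASH_MARKER in p.read_text(errors="replace") for p in logs)
    return crashed, len(logs)


if __name__ == "__main__":
    # "logs" is an assumed directory name; adjust it to the job layout.
    crashed, total = crash_fraction("logs")
    share = crashed / total if total else 0.0
    print(f"{crashed} of {total} jobs crashed ({share:.1%})")
```

Comparing this count between 'poisson' and 'gaussian' runs would show whether the crashes are tied to the family option.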

A temporary workaround would be to transform the count-based transcriptomic data and use the params.family = 'gaussian' option.
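The comment does not say which transform to use. As an illustration only, the sketch below applies log(1 + x), a common choice for count data; that choice, the function name `log_transform`, and the toy matrix are all assumptions rather than anything SINGE prescribes. In MATLAB the same step would be `log1p(X)` on the expression matrix before running with params.family = 'gaussian'.

```python
import math


def log_transform(counts):
    """Apply log(1 + x) to every entry of a genes-by-cells count matrix."""
    return [[math.log1p(value) for value in row] for row in counts]


# Made-up counts for two genes across three cells.
counts = [[0, 3, 12], [150, 0, 1]]
transformed = log_transform(counts)
```

Zeros stay at zero under this transform, so the sparsity of the data is preserved.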


   Segmentation violation detected at Tue Jul 02 17:21:13 2019 -0500

Configuration:
  Crash Decoding : Disabled - No sandbox or build area path
  Crash Mode : continue (default)
  Default Encoding : US-ASCII
  Deployed : true
  GNU C Library : 2.17 stable
  Graphics Driver : Unknown software
  MATLAB Architecture : glnxa64
  MATLAB Entitlement ID : Unknown
  MATLAB Root : /var/lib/condor/execute/slot1/dir_24929/v94
  MATLAB Version : 9.4.0.813654 (R2018a)
  OpenGL : software
  Operating System : Linux 5.0.8-1.el7.elrepo.x86_64 #1 SMP Wed Apr 17 10:11:44 EDT 2019 x86_64
  Process ID : 25220
  Processor ID : x86 Family 6 Model 23 Stepping 10, GenuineIntel
  Session Key : 8f018b8e-68d7-47a0-bf6d-26a0bf86030f
  Static TLS mitigation : Disabled: Unable to open display
  Window System : No active display

Fault Count: 1

Abnormal termination

Register State (from fault):
  RAX = 0000000000003b16  RBX = 0000150d175d84a0
  RCX = 0000150d175c9840  RDX = ffffffffc767d558
  RSP = 0000150d13ffbe00  RBP = 0000150d13ffbf60
  RSI = 0000150d17622240  RDI = 000000003fefffff

   R8 = 0000000000000006   R9 = 0000000000000000
  R10 = 0000150d17cf8830  R11 = 0000000000000005
  R12 = 000000003ff00000  R13 = 0000000000000012
  R14 = 0000000000000000  R15 = 0000150d17622250

  RIP = 0000150cde6ea912  EFL = 0000000000010202

   CS = 0033   FS = 0000   GS = 0000

Stack Trace (from fault):
[ 0] 0x0000150cde6ea912 /tmp/.mcrCache9.4/GLG_In0/glmnet_matlab/glmnetMex.mexa64+00252178
[ 1] 0x0000150cde6b566e /tmp/.mcrCache9.4/GLG_In0/glmnet_matlab/glmnetMex.mexa64+00034414 mexfunction_+00030329
[ 2] 0x0000150d2d2721ea bin/glnxa64/libmex.so+00414186
[ 3] 0x0000150d2d272447 bin/glnxa64/libmex.so+00414791
[ 4] 0x0000150d2d272f2b bin/glnxa64/libmex.so+00417579
[ 5] 0x0000150d2d25d30c bin/glnxa64/libmex.so+00328460
[ 6] 0x0000150d2edb52ad bin/glnxa64/libmwm_dispatcher.so+00979629 _ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2_+00000829
[ 7] 0x0000150d2edb5bae bin/glnxa64/libmwm_dispatcher.so+00981934 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+00000030
[ 8] 0x0000150d29ad7da1 bin/glnxa64/libmwm_lxe.so+12619169
[ 9] 0x0000150d29ad8982 bin/glnxa64/libmwm_lxe.so+12622210
[ 10] 0x0000150d29bc0e79 bin/glnxa64/libmwm_lxe.so+13573753
[ 11] 0x0000150d29b623e1 bin/glnxa64/libmwm_lxe.so+13186017
[ 12] 0x0000150d293685a8 bin/glnxa64/libmwm_lxe.so+04822440
[ 13] 0x0000150d2936acbc bin/glnxa64/libmwm_lxe.so+04832444
[ 14] 0x0000150d2936701d bin/glnxa64/libmwm_lxe.so+04816925
[ 15] 0x0000150d29360ba1 bin/glnxa64/libmwm_lxe.so+04791201
[ 16] 0x0000150d29360dd9 bin/glnxa64/libmwm_lxe.so+04791769
[ 17] 0x0000150d29366846 bin/glnxa64/libmwm_lxe.so+04814918
[ 18] 0x0000150d2936692f bin/glnxa64/libmwm_lxe.so+04815151
[ 19] 0x0000150d29495503 bin/glnxa64/libmwm_lxe.so+06055171
[ 20] 0x0000150d29498cf3 bin/glnxa64/libmwm_lxe.so+06069491
[ 21] 0x0000150d299a8f6d bin/glnxa64/libmwm_lxe.so+11378541
[ 22] 0x0000150d29ac57c4 bin/glnxa64/libmwm_lxe.so+12543940
[ 23] 0x0000150d29ac5d6b bin/glnxa64/libmwm_lxe.so+12545387
[ 24] 0x0000150d2edb52ad bin/glnxa64/libmwm_dispatcher.so+00979629 _ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2_+00000829
[ 25] 0x0000150d2edb5bde bin/glnxa64/libmwm_dispatcher.so+00981982 _ZN8Mfh_file22dispatch_fh_with_reuseEiPP11mxArray_tagiS2_+00000030
[ 26] 0x0000150d29be5d4e bin/glnxa64/libmwm_lxe.so+13725006
[ 27] 0x0000150d29955416 bin/glnxa64/libmwm_lxe.so+11035670
[ 28] 0x0000150d2995558c bin/glnxa64/libmwm_lxe.so+11036044
[ 29] 0x0000150d299eaae8 bin/glnxa64/libmwm_lxe.so+11647720
[ 30] 0x0000150d299ec229 bin/glnxa64/libmwm_lxe.so+11653673
[ 31] 0x0000150d2ea14f80 bin/glnxa64/libmwm_interpreter.so+00688000 _Z44inCallFcnWithTrapInDesiredWSAndPublishEventsiPP11mxArray_tagiS1_PKcbP15inWorkSpace_tag+00000080
[ 32] 0x0000150d2d7b586d bin/glnxa64/libmwiqm.so+00768109 _ZN3iqm15BaseFEvalPlugin7executeEP15inWorkSpace_tagRN7mwboost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+00000525
[ 33] 0x0000150d301f14a1 bin/glnxa64/libmwmcr.so+00849057
[ 34] 0x0000150d2d7abab1 bin/glnxa64/libmwiqm.so+00727729
[ 35] 0x0000150d2d78ea95 bin/glnxa64/libmwiqm.so+00608917
[ 36] 0x0000150d301bffe5 bin/glnxa64/libmwmcr.so+00647141
[ 37] 0x0000150d301c06a4 bin/glnxa64/libmwmcr.so+00648868
[ 38] 0x0000150d301b93f1 bin/glnxa64/libmwmcr.so+00619505
[ 39] 0x0000150d37574dd5 /lib64/libpthread.so.0+00032213
[ 40] 0x0000150d3a1eaead /lib64/libc.so.6+01040045 clone+00000109
[ 41] 0x0000000000000000 +00000000

This error was detected while a MEX-file was running. If the MEX-file is not an official MathWorks function, please examine its source code for errors. Please consult the External Interfaces Guide for information on debugging MEX-files. This crash report has been saved to disk as /tmp/matlab_crash_dump.25220-1

MATLAB is exiting because of fatal error
/var/lib/condor/execute/slot1/dir_24929/condor_exec.exe: line 40: 25220 Killed "/var/lib/condor/execute/slot1/dir_24929/GLG_Instance" "X_BMSparse" "lambda" "[0.01,0.02,0.05,0.1,0.001]" "dT" "3" "num_lags" "5" "kernel_width" ".5" "ID" "0" "replicate" "1" "family" "poisson" "date" "07/02/2019" "firsttarget" "1" "targetincr" "300" "prob_remove_samples" "0.3" "prob_zero_removal" "0.6"
[adeshpande4@submit-1 GitterNew]$ cat logs/hello-chtc_7936564_0.err



agitter commented 4 years ago

I also observed a segmentation violation when running SINGE_Example.m in MATLAB R2020a on macOS. I had called SINGE from the interactive MATLAB prompt. We are exploring whether exiting MATLAB between calls to SINGE_GLG_Test avoids the crash.
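One way to try the "exit MATLAB between calls" idea without doing it by hand is to launch every call in its own MATLAB process. This is a rough sketch under stated assumptions: the helper names `matlab_command` and `run_isolated` are hypothetical, `matlab` is assumed to be on the PATH, and the `-batch` startup option requires R2019a or newer. The statements passed in are plain MATLAB code strings; the arguments of SINGE_GLG_Test are not shown in this thread, so none are guessed here.

```python
import subprocess


def matlab_command(statement, matlab="matlab"):
    """Build the argv that runs one MATLAB statement in a fresh process."""
    # -batch runs the statement non-interactively and exits when it finishes.
    return [matlab, "-batch", statement]


def run_isolated(statements, matlab="matlab"):
    """Run each statement in its own MATLAB process and return the exit codes."""
    return [
        subprocess.run(matlab_command(statement, matlab)).returncode
        for statement in statements
    ]
```

For example, `run_isolated(["SINGE_Example"])` would run the example script once in a new process. A nonzero exit code for a statement indicates that the run failed or crashed, which makes it easy to see whether isolation changes the crash rate.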

matlab_crash_dump.7994-1.txt