Closed spficklin closed 4 years ago
@JohnHadish can you do a functional test to make sure the problem is fixed?
@bentsherman would you be able to do a quick code check to make sure nothing weird stands out to you?
This does not appear to work for me. After I installed the 157_cmx_ccm_sync branch, I just got a much longer error message. The old version did not print any warning messages before the end, and it did not hit a segmentation fault.
To reproduce the error, use my test KINC repo, RUN_KINC_DEFAULT, on my personal GitLab account. I just gave @spficklin permissions for it. To run my test, just clone the repo and run ./02-Run_KINC.sh
BEFORE FIX ERROR
99% 2s
warning: cmx and ccm are out of sync
100%
Removing biased edges from the extract network using KINC.R.
AFTER FIX ERROR -- last few lines (it was much longer)
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 665 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 680 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 688 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 693 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 701 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 704 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 708 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 710 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 711 ).
[jah-desktop:09966] *** Process received signal ***
[jah-desktop:09966] Signal: Segmentation fault (11)
[jah-desktop:09966] Signal code: (128)
[jah-desktop:09966] Failing at address: (nil)
[jah-desktop:09966] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7fe9a28a0210]
[jah-desktop:09966] [ 1] kinc(+0x85694)[0x555dece19694]
[jah-desktop:09966] [ 2] kinc(+0x3907c)[0x555decdcd07c]
[jah-desktop:09966] [ 3] kinc(+0x39f21)[0x555decdcdf21]
[jah-desktop:09966] [ 4] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic15AbstractManager11writeResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EEi+0x50)[0x7fe9b3146400]
[jah-desktop:09966] [ 5] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic6Single11writeResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x16)[0x7fe9b313ec66]
[jah-desktop:09966] [ 6] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic13AbstractInput10saveResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x5b)[0x7fe9b31455eb]
[jah-desktop:09966] [ 7] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic9SimpleRun7addWorkEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x31)[0x7fe9b313c911]
[jah-desktop:09966] [ 8] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic6Single7processEv+0x41)[0x7fe9b313eaf1]
[jah-desktop:09966] [ 9] /lib/x86_64-linux-gnu/libQt5Core.so.5(+0x2bf5b6)[0x7fe9a30585b6]
[jah-desktop:09966] [10] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN7QObject5eventEP6QEvent+0x1d5)[0x7fe9a304bcf5]
[jah-desktop:09966] [11] /usr/local/lib/libacecli.so.3(_ZN12EApplication6notifyEP7QObjectP6QEvent+0x25)[0x7fe9a3300c85]
[jah-desktop:09966] [12] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN16QCoreApplication15notifyInternal2EP7QObjectP6QEvent+0x18a)[0x7fe9a301f93a]
[jah-desktop:09966] [13] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN14QTimerInfoList14activateTimersEv+0x3d0)[0x7fe9a30768b0]
[jah-desktop:09966] [14] /lib/x86_64-linux-gnu/libQt5Core.so.5(+0x2de1e4)[0x7fe9a30771e4]
[jah-desktop:09966] [15] /lib/x86_64-linux-gnu/libglib-2.0.so.0(g_main_context_dispatch+0x27d)[0x7fe99de6afbd]
[jah-desktop:09966] [16] /lib/x86_64-linux-gnu/libglib-2.0.so.0(+0x52240)[0x7fe99de6b240]
[jah-desktop:09966] [17] /lib/x86_64-linux-gnu/libglib-2.0.so.0(g_main_context_iteration+0x33)[0x7fe99de6b2e3]
[jah-desktop:09966] [18] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x65)[0x7fe9a3077565]
[jah-desktop:09966] [19] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE+0x12b)[0x7fe9a301e4db]
[jah-desktop:09966] [20] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN16QCoreApplication4execEv+0x96)[0x7fe9a3026246]
[jah-desktop:09966] [21] /usr/local/lib/libacecli.so.3(_ZN12EApplication4execEv+0x5ba)[0x7fe9a33023ca]
[jah-desktop:09966] [22] kinc(+0x261df)[0x555decdba1df]
[jah-desktop:09966] [23] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fe9a28810b3]
[jah-desktop:09966] [24] kinc(+0x26c5e)[0x555decdbac5e]
[jah-desktop:09966] *** End of error message ***
./02-Run_KINC.sh: line 100: 9966 Segmentation fault (core dumped) kinc run extract --emx "${PREFIX}.emx" --ccm "${PREFIX}.paf.ccm" --cmx "${PREFIX}.paf.cmx" --csm "${PREFIX}.csm" --format "tidy" --output "${PREFIX}.paf-th${th}-p${p}-rsqr${r2}.txt" --mincorr $th --maxcorr 1 --filter-pvalue $p --filter-rsquare $r2
@JohnHadish you will have to rerun the cond-test to fix the out-of-sync files and then run extract
Yes, I confirm that this is what I did. I started with a fresh test directory to ensure I was not using old files. Here are my exact steps:
cd KINC/
git pull
git checkout origin/157_cmx_ccm_sync
git pull
git status
sudo make
sudo make install
cd ..
git clone git@gitlab.com:JohnHadish/run_kinc_default.git
cd run_kinc_default
./02-Run_KINC.sh
Okay. It looks like the extract analytic was having similar issues. So I applied a fix and tested it with your dataset, @JohnHadish, and it ran just fine. Can you test again?
Thanks @JohnHadish and @bentsherman for the quick reviews!
This PR fixes issue #157. The problem can occur with the corrpower or cond-test analytics, but in either case only with a sparse matrix. If the cmx or ccm file is sparse, the processing of pairs does not start at the first pair. This doesn't create incorrect results; it just offsets the counting in the for loops, which causes the ccm to have fewer pairs than the original cmx. The unfortunate side effect, other than the warning message, is that a few potential edges may be missing from the output files after an extract.
This code fixes that problem by adding a new function, Matrix::Pair::readFirst(), which allows each analytic to find the first real pair in a sparse matrix before beginning the first work block. I also adjusted the code a bit so that, if pairs are ever missing from one file again, moving between pairs in the files can recover from the out-of-sync condition.
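For readers who want a concrete picture of the two ideas above, here is a minimal sketch. It is not the KINC implementation: Coord, firstStoredPair, and matchedPairs are hypothetical names made up for illustration, and the sketch assumes each sparse file can be viewed as a list of pair coordinates sorted by (x, y). It only shows the general technique of starting from the first pair that is actually stored and of re-aligning two lists when one of them is missing a pair.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the coordinate of a pair stored in a sparse
// cmx or ccm file. This is NOT a KINC type.
struct Coord {
    int x;
    int y;
};

// Assumed ordering: pairs are sorted by x, then by y.
inline bool operator<(const Coord& a, const Coord& b) {
    return std::tie(a.x, a.y) < std::tie(b.x, b.y);
}

inline bool operator==(const Coord& a, const Coord& b) {
    return a.x == b.x && a.y == b.y;
}

// Return a pointer to the first pair that is actually stored, or nullptr
// if the list is empty. A sparse matrix need not begin at the first
// possible coordinate, so the starting pair is looked up, not assumed.
const Coord* firstStoredPair(const std::vector<Coord>& pairs) {
    return pairs.empty() ? nullptr : &pairs.front();
}

// Walk two sorted sparse lists together and collect the coordinates that
// appear in both. When one list lacks a pair, only the list that is
// behind advances, so the walk recovers instead of staying offset.
std::vector<Coord> matchedPairs(const std::vector<Coord>& cmx,
                                const std::vector<Coord>& ccm) {
    std::vector<Coord> matched;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < cmx.size() && j < ccm.size()) {
        if (cmx[i] == ccm[j]) {
            matched.push_back(cmx[i]);
            ++i;
            ++j;
        } else if (cmx[i] < ccm[j]) {
            ++i;  // pair missing from ccm: skip it in cmx
        } else {
            ++j;  // pair missing from cmx: skip it in ccm
        }
    }
    return matched;
}
```

In this sketch, a pair missing from one list costs only that one pair; every later pair still lines up, which is the recovery behavior described above.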