Closed andrewbriand closed 4 years ago
With this second segfault, it looks like something in our code caused internal corruption in the MSVC debug iterators over the map. @andrewbriand which map was it again? We figured out it was one of the StringUintMap members
@hu-a I believe it was happening in floatingSpeciesMap
When I commented out the line that loads the values into floatingSpeciesMap, the segfault no longer occured.
@andrewbriand Did we ever try running the debug tests, but without RELOADING_MODEL_MODEL_RECOMPILATION? Maybe it's just the core tests that are causing this problem. If so, we could probably ignore it for now and ship the LLVM 6 release.
We haven't tried that yet, I'll try it tomorrow and keep you updated.
@hu-a When I ran the debug tests without RELOADING_MODEL_MODEL_RECOMPILATION, the test RELOADING_MODEL_NO_MODEL_RECOMPILATION fails with a (presumably the same?) segfault in the LLVMModelDataSymbols dtor. Repeating this process I found that it also occurs in: LOADING_MODEL_MULTIPLE_INSTANCES LOAD_MODEL_FROM_STRING SET_COMPUTE_AND_ASSIGN_CONSERVATION_LAWS FREE_RR_INSTANCE in the TEST_MODEL suite FREE_RR_INSTANCE in the NOM_TESTS suite FREE_RR_INSTANCE in the LIBSTRUCT_TESTS suite SBML_TEST 7 I stopped there and tested it in release mode, all the tests pass in release except GENERATING_MODEL_HASH (issue #527).
We need to make sure that there aren't any segfaults in the tests in release mode. I tried compiling in release mode off of llvm-6 and I got a segfault.
@andrewbriand When you get a chance can you try running the test suite in release mode and see if they pass? I want to see if your environment works since I did rebase the llvm-6 branch and maybe that screwed something up?
@hu-a I ran the tests a few days ago in release mode, and all of them passed except GENERATING_MODEL_HASH. Here was the output: https://gist.github.com/andrewbriand/8698b4a55155c71920626f937a271fcb.
I'm not sure what branch I was on when I ran them but I can check and run them again tomorrow.
EDIT: Come to think of it, it's possible that I forgot to remove the temporary fix in LLVMModelDataSymbols when running the release tests, but like I said I'll check tomorrow.
I ran the tests up to date with llvm-6 in release mode today and all of them passed except GENERATING_MODEL_HASH: https://gist.github.com/andrewbriand/e69967ad49b03dfb76db0f9aea24a1c1
This was fixed with https://github.com/sys-bio/roadrunner/commit/26deb2fce059d5cfe23ba81c42e42575f7a1c3be The only remaining issue is that there didn't seem to be anything wrong with the function in question: throwing if you don't find the index should be perfectly fine. But in this case, it clearly wasn't, and this might indicate a bug in VS and/or LLVM. But it's not an issue for roadrunner any more, so I'm closing this.
Found two segmentation faults when running the test suite in debug mode. One occurs in csr_matrix_new in rrSparse.cpp at lines 118 and 119 if nnz is 0, this seems to be fixed by adding a check to ensure the vectors being accessed have size > 0 like so:
Another segfault occurs when the destructor of LLVMModelDataSymbols is called in core test RELOADING_MODEL_MODEL_RECOMPILATION. The stack trace shows that it occurs in std::_Tree<_Traits> in the function _Orphan_ptr. For some reason de-referencing the pointer _Pnext gives 0xcccccccccccccccc, the visual studio debugger magic number for uninitialized stack memory, when it should be a pointer to an iterator in a linked list of iterators.