Closed: credl closed this issue 8 years ago
I would suggest running valgrind both with the most aggressive debugging options and with the default options, to see if this brings up any issues.
If not, there are more complicated ways to reproduce it; I have encountered such things in the past.
Once we can reproduce it, we can use git bisect to find the source of the issue.
I can have a look with valgrind this week.
Peter

On May 15, 2016 8:17 PM, "Christoph Redl" notifications@github.com wrote:
Assigned #25 https://github.com/hexhex/core/issues/25 to @peschue https://github.com/peschue.
Reproducing the bug should not be a problem. It shows up with the current version built in release mode on at least two different machines.
I would appreciate it if you could run valgrind; I have only used it for runtime analysis so far.
I made further observations in the meantime, which are probably not relevant in themselves but only random effects of the same memory issue. For the sake of completeness: the answer set is only wrong if I execute exactly the command that is also run by make check and the working directory is "examples". If I execute it from the top directory of the core (with adapted plugin and input paths), or if I change --plugindir=!../testsuite to --plugindir=../testsuite, the bug disappears. Adding a verbose option also makes it disappear.
Running under valgrind makes the problem disappear. I can reproduce the problem without valgrind, and I can also reproduce it under strace.
On my computer, strace with --plugindir=!../testsuite shows significantly fewer lines than with --plugindir=../testsuite ... there seems to be some socket communication going on during the library loading.
Also, there are fewer futex syscalls with ! (where we can see the bug) and more without ! (where the bug is gone).
According to Christoph we use futex only in the tables for synchronization in case we will use threads later. Currently we do not use threads at all (there is no clone and no fork in the strace).
So it seems that this is not a concurrency issue. Memory access violations are also not detected for this testcase.
When running valgrind on all testcases, we get a few errors in only one testcase, agg8, and only for genuineii:
==26749== Memcheck, a memory error detector
==26749== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26749== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==26749== Command: /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2 -s --plugindir=!../testsuite --nofacts --solver=genuineii --aggregate-enable --aggregate-mode=ext ../../examples/agg8.hex
==26749==
==26749== Invalid read of size 8
==26749==    at 0x50A4AE2: std::_Rb_tree<dlvhex::ID, dlvhex::ID, std::_Identity<dlvhex::ID>, std::less<dlvhex::ID>, std::allocator<dlvhex::ID> >::_M_get_insert_unique_pos(dlvhex::ID const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x50A4C96: std::pair<std::_Rb_tree_iterator<dlvhex::ID>, bool> std::_Rb_tree<dlvhex::ID, dlvhex::ID, std::_Identity<dlvhex::ID>, std::less<dlvhex::ID>, std::allocator<dlvhex::ID> >::_M_insert_unique<dlvhex::ID const&>(dlvhex::ID const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5169699: dlvhex::InternalGrounder::biDependency(dlvhex::ID, dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516EB60: dlvhex::InternalGrounder::reorderRuleBody(dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516F3B8: dlvhex::InternalGrounder::groundRule(dlvhex::ID, boost::unordered::unordered_map<dlvhex::ID, dlvhex::ID, boost::hash<dlvhex::ID>, std::equal_to<dlvhex::ID>, std::allocator<std::pair<dlvhex::ID const, dlvhex::ID> > >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, Set<dlvhex::ID>&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5170191: dlvhex::InternalGrounder::groundStratum(int) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5171909: dlvhex::InternalGrounder::InternalGrounder(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, dlvhex::InternalGrounder::OptLevel) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5123094: dlvhex::GenuineGrounder::getInstance(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, boost::shared_ptr<dlvhex::Interpretation const>) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5071FC1: dlvhex::BaseModelGenerator::computeExtensionOfDomainPredicates(dlvhex::ComponentGraph::ComponentInfo const&, dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation const>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, bool) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5109D6A: dlvhex::GenuineWellfoundedModelGenerator::generateNextModel() (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x436FC2: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::createNextModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==    by 0x4378EA: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::advanceOModelForIModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==  Address 0xc491060 is 0 bytes after a block of size 16 alloc'd
==26749==    at 0x4C2B0E0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26749==    by 0x512D9DA: dlvhex::BuiltinAtomTable::storeAndGetID(dlvhex::BuiltinAtom const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x59037C7: dlvhex::(anonymous namespace)::AggregateRewriter::rewriteRule(dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, dlvhex::Rule const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x5904E6C: dlvhex::(anonymous namespace)::AggregateRewriter::prepareRewrittenProgram(boost::shared_ptr<dlvhex::Interpretation>, dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x58FE7BE: dlvhex::(anonymous namespace)::AggregateRewriter::rewrite(dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x51E7FE9: dlvhex::RewriteEDBIDBState::rewriteEDBIDB(dlvhex::ProgramCtx*) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x41CEF9: main (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==
==26749== Invalid read of size 8
==26749==    at 0x50A4D40: std::pair<std::_Rb_tree_iterator<dlvhex::ID>, bool> std::_Rb_tree<dlvhex::ID, dlvhex::ID, std::_Identity<dlvhex::ID>, std::less<dlvhex::ID>, std::allocator<dlvhex::ID> >::_M_insert_unique<dlvhex::ID const&>(dlvhex::ID const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5169699: dlvhex::InternalGrounder::biDependency(dlvhex::ID, dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516EB60: dlvhex::InternalGrounder::reorderRuleBody(dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516F3B8: dlvhex::InternalGrounder::groundRule(dlvhex::ID, boost::unordered::unordered_map<dlvhex::ID, dlvhex::ID, boost::hash<dlvhex::ID>, std::equal_to<dlvhex::ID>, std::allocator<std::pair<dlvhex::ID const, dlvhex::ID> > >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, Set<dlvhex::ID>&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5170191: dlvhex::InternalGrounder::groundStratum(int) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5171909: dlvhex::InternalGrounder::InternalGrounder(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, dlvhex::InternalGrounder::OptLevel) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5123094: dlvhex::GenuineGrounder::getInstance(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, boost::shared_ptr<dlvhex::Interpretation const>) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5071FC1: dlvhex::BaseModelGenerator::computeExtensionOfDomainPredicates(dlvhex::ComponentGraph::ComponentInfo const&, dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation const>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, bool) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5109D6A: dlvhex::GenuineWellfoundedModelGenerator::generateNextModel() (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x436FC2: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::createNextModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==    by 0x4378EA: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::advanceOModelForIModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==    by 0x437E01: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::getNextOModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==  Address 0xc491060 is 0 bytes after a block of size 16 alloc'd
==26749==    at 0x4C2B0E0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26749==    by 0x512D9DA: dlvhex::BuiltinAtomTable::storeAndGetID(dlvhex::BuiltinAtom const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x59037C7: dlvhex::(anonymous namespace)::AggregateRewriter::rewriteRule(dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, dlvhex::Rule const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x5904E6C: dlvhex::(anonymous namespace)::AggregateRewriter::prepareRewrittenProgram(boost::shared_ptr<dlvhex::Interpretation>, dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x58FE7BE: dlvhex::(anonymous namespace)::AggregateRewriter::rewrite(dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x51E7FE9: dlvhex::RewriteEDBIDBState::rewriteEDBIDB(dlvhex::ProgramCtx*) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x41CEF9: main (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==
==26749== Invalid read of size 8
==26749==    at 0x50A4CED: std::pair<std::_Rb_tree_iterator<dlvhex::ID>, bool> std::_Rb_tree<dlvhex::ID, dlvhex::ID, std::_Identity<dlvhex::ID>, std::less<dlvhex::ID>, std::allocator<dlvhex::ID> >::_M_insert_unique<dlvhex::ID const&>(dlvhex::ID const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5169699: dlvhex::InternalGrounder::biDependency(dlvhex::ID, dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516EB60: dlvhex::InternalGrounder::reorderRuleBody(dlvhex::ID) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x516F3B8: dlvhex::InternalGrounder::groundRule(dlvhex::ID, boost::unordered::unordered_map<dlvhex::ID, dlvhex::ID, boost::hash<dlvhex::ID>, std::equal_to<dlvhex::ID>, std::allocator<std::pair<dlvhex::ID const, dlvhex::ID> > >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, Set<dlvhex::ID>&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5170191: dlvhex::InternalGrounder::groundStratum(int) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5171909: dlvhex::InternalGrounder::InternalGrounder(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, dlvhex::InternalGrounder::OptLevel) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5123094: dlvhex::GenuineGrounder::getInstance(dlvhex::ProgramCtx&, dlvhex::OrdinaryASPProgram const&, boost::shared_ptr<dlvhex::Interpretation const>) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5071FC1: dlvhex::BaseModelGenerator::computeExtensionOfDomainPredicates(dlvhex::ComponentGraph::ComponentInfo const&, dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation const>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, bool) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x5109D6A: dlvhex::GenuineWellfoundedModelGenerator::generateNextModel() (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x436FC2: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::createNextModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==    by 0x4378EA: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::advanceOModelForIModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==    by 0x437E01: dlvhex::OnlineModelBuilder<dlvhex::EvalGraph<dlvhex::FinalEvalUnitPropertyBase, dlvhex::none_t> >::getNextOModel(unsigned long) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
==26749==  Address 0xc491060 is 0 bytes after a block of size 16 alloc'd
==26749==    at 0x4C2B0E0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26749==    by 0x512D9DA: dlvhex::BuiltinAtomTable::storeAndGetID(dlvhex::BuiltinAtom const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x59037C7: dlvhex::(anonymous namespace)::AggregateRewriter::rewriteRule(dlvhex::ProgramCtx&, boost::shared_ptr<dlvhex::Interpretation>, std::vector<dlvhex::ID, std::allocator<dlvhex::ID> >&, dlvhex::Rule const&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x5904E6C: dlvhex::(anonymous namespace)::AggregateRewriter::prepareRewrittenProgram(boost::shared_ptr<dlvhex::Interpretation>, dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x58FE7BE: dlvhex::(anonymous namespace)::AggregateRewriter::rewrite(dlvhex::ProgramCtx&) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-internalplugins.so.5.0.0)
==26749==    by 0x51E7FE9: dlvhex::RewriteEDBIDBState::rewriteEDBIDB(dlvhex::ProgramCtx*) (in /home/ps/_science/projects/hex/core/build_release/src/.libs/libdlvhex2-base.so.12.0.0)
==26749==    by 0x41CEF9: main (in /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2)
There is also a different issue with the command /home/ps/_science/projects/hex/core/build_release/src/.libs/lt-dlvhex2 -s --plugindir=!../testsuite --solver=genuineii ../../examples/maxint.hex
which might be related to boost::spirit (the message is too long to paste here).
Another such (possibly boost::spirit-related) issue occurs with dlvhex2 -s --plugindir=!../testsuite --solver=genuineii ../../examples/tertop.hex.
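For readers less familiar with memcheck: "Invalid read of size 8 ... Address ... is 0 bytes after a block of size 16 alloc'd" means that 8 bytes were read starting directly behind a 16-byte heap block. One plausible reading, which is only an assumption here and not confirmed by the trace, is an access one element past the end of a two-element container of 8-byte IDs. The sketch below illustrates that pattern with a hypothetical `Id` struct (not the real `dlvhex::ID`) and hypothetical helper names; it uses `std::vector::at()`, which checks bounds, so the out-of-range access becomes visible without undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for dlvhex::ID (assumption): two 32-bit fields.
struct Id {
    std::uint32_t kind;
    std::uint32_t address;
};
static_assert(sizeof(Id) == 8, "this sketch assumes an 8-byte ID");

// Returns true if reading element `index` would leave the vector's storage.
// std::vector::at() performs the bounds check that operator[] skips.
bool readsPastEnd(const std::vector<Id>& tuple, std::size_t index) {
    try {
        (void)tuple.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Mirrors the numbers in the report: 2 elements * 8 bytes = 16 bytes.
void demoBoundsCheck() {
    const std::vector<Id> tuple{{0u, 1u}, {0u, 2u}};
    assert(!readsPastEnd(tuple, 1));  // last valid element
    assert(readsPastEnd(tuple, 2));   // would start 0 bytes after the block
}
```

Temporarily replacing `operator[]` with `at()` at the suspected call site is one cheap way to turn such a silent read into an exception with a usable stack trace.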
some new ideas after talking to Antonius:
The bug is fixed now. The cause was a loop in ComponentGraph.cpp that iterates over a set inside another loop, while the order of the iterated elements matters. This way, different orders in which the elements in the set were evaluated (changing randomly with e.g. different command line parameters) altered the result. The solution was to move the inner loop outside of the second loop where the order does not matter anymore.
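The general failure mode described here can be sketched as follows. This is not the code from ComponentGraph.cpp; the function names and the "claiming" logic are invented purely for illustration. `elems` stands for the elements of a set in whatever order the container happens to yield them. In the nested variant the result depends on that order; in the hoisted variant the element loop runs once outside the rule loop and only fills an ordered `std::set`, for which insertion order is irrelevant.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Order-sensitive variant: the element loop sits inside the rule loop and
// stops at the first unclaimed element, so the outcome depends on the order
// in which `elems` is traversed.
std::vector<int> claimNested(const std::vector<int>& elems, std::size_t ruleCount) {
    std::vector<int> claimedBy(ruleCount, -1);
    std::set<int> taken;
    for (std::size_t rule = 0; rule < ruleCount; ++rule) {
        for (int e : elems) {
            if (taken.insert(e).second) {  // first element not taken yet
                claimedBy[rule] = e;
                break;
            }
        }
    }
    return claimedBy;
}

// Order-insensitive variant: the element loop is hoisted out of the rule
// loop. The rule loop then works on the canonical (sorted) set only.
std::vector<int> claimHoisted(const std::vector<int>& elems, std::size_t ruleCount) {
    std::set<int> open;
    for (int e : elems) open.insert(e);
    std::vector<int> claimedBy(ruleCount, -1);
    for (std::size_t rule = 0; rule < ruleCount && !open.empty(); ++rule) {
        claimedBy[rule] = *open.begin();
        open.erase(open.begin());
    }
    return claimedBy;
}

// Same logical set, two traversal orders.
void demoOrderDependence() {
    const std::vector<int> orderA{3, 1, 2};
    const std::vector<int> orderB{1, 2, 3};
    assert(claimNested(orderA, 2) != claimNested(orderB, 2));
    assert(claimHoisted(orderA, 2) == claimHoisted(orderB, 2));
}
```

A traversal order that changes with unrelated factors such as command-line parameters would make the nested variant look like memory corruption, which matches the symptoms reported earlier in this thread.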
Testcase extatom10.hex fails in version 23fbdac (the most recent one at the time this issue was opened) if built in release mode due to a wrong answer set, which is in turn the consequence of a missing external atom guessing rule. The problem became visible due to a recently introduced configuration option (ProgramCtx, config.setOption("UserInconsistencyAnalysis", 0)).
However, it seems that the recent modifications merely made the problem visible but did not cause it. The problem still occurs if all uses of this option are removed, and even if the option name is replaced by another (dummy) string, or if version 1f1de94 (the one before the introduction of the inconsistency techniques) is extended with a dummy option (config.setOption("Bla", 0)). The same problem already occurs with much earlier versions if a certain number of pseudo options is added (e.g., in version 7c3d005 one needs to add 3 options). This behavior suggests a memory issue rather than a logical error.
How can we reasonably debug this problem?