Closed elliottslaughter closed 7 months ago
It's harder to find crashes now, but I hit this one:
Assertion failed: (pending_equivalence_sets == NULL), function finalize_manager, file legion_analysis.cc, line 23931.
This is the Legion branch that has both fixinvalidation and fixvirtualinit merged together.
Fuzzer at version https://github.com/StanfordLegion/fuzzer/commit/3057d03ee7e7284337ca22515bef9ba02ce98f45
Command line:
./fuzzer -fuzz:seed 367 -fuzz:ops 16 -fuzz:skip 3 -level 4
The higher seed number means it took me longer to find it. 😃
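The issue does not say how the seeds were swept, so the following is only a minimal sketch of one way to search for the first failing seed. It reuses the flags from the command line above. The helper names (`build_command`, `run_seed`, `first_failing_seed`) are hypothetical, and it assumes that a failed assertion makes the fuzzer exit with a non-zero status.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def build_command(seed: int, ops: int = 16, skip: int = 3, level: int = 4,
                  binary: str = "./fuzzer") -> List[str]:
    """Build the fuzzer command line for one seed (flags taken from the report)."""
    return [binary,
            "-fuzz:seed", str(seed),
            "-fuzz:ops", str(ops),
            "-fuzz:skip", str(skip),
            "-level", str(level)]


def run_seed(seed: int) -> int:
    """Run the fuzzer once and return its exit status.

    Assumption: an assertion failure aborts the process, which
    subprocess reports as a non-zero return code.
    """
    return subprocess.run(build_command(seed)).returncode


def first_failing_seed(seeds: Iterable[int],
                       run: Callable[[int], int] = run_seed) -> Optional[int]:
    """Return the first seed whose run exits non-zero, or None if all pass."""
    for seed in seeds:
        if run(seed) != 0:
            return seed
    return None
```

Calling `first_failing_seed(range(1000))` from the fuzzer build directory would run the seeds in order and stop at the first crash. The `run` parameter is injectable so the search loop can be exercised without the fuzzer binary.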
Backtrace:
* frame #4: 0x0000000101e7c690 liblegion.1.dylib`Legion::Internal::VersionManager::finalize_manager(this=0x00007fa3abf18b60) at reservation.inl:0:15 [opt]
  frame #5: 0x00000001022ba7da liblegion.1.dylib`Legion::Internal::RegionNode::notify_local(this=0x00007fa3f1019800) at region_tree.cc:16927:44 [opt]
  frame #6: 0x0000000101da0ee0 liblegion.1.dylib`Legion::Internal::DistributedCollectable::perform_downgrade(this=0x00007fa3f1019800, gc=0x0000700002134680) at garbage_collection.cc:893:7 [opt]
  frame #7: 0x0000000101d9df28 liblegion.1.dylib`Legion::Internal::DistributedCollectable::remove_gc_reference(this=0x00007fa3f1019800, cnt=1) at garbage_collection.cc:153:16 [opt]
  frame #8: 0x00000001022bf83c liblegion.1.dylib`Legion::Internal::PartitionNode::notify_local() [inlined] Legion::Internal::DistributedCollectable::remove_nested_gc_ref(this=<unavailable>, source=<unavailable>, cnt=1) at garbage_collection.h:652:14 [opt]
  frame #9: 0x00000001022bf80c liblegion.1.dylib`Legion::Internal::PartitionNode::notify_local(this=<unavailable>) at region_tree.cc:18015:21 [opt]
  frame #10: 0x0000000101da0ee0 liblegion.1.dylib`Legion::Internal::DistributedCollectable::perform_downgrade(this=0x00007fa3fd038400, gc=0x0000700002134720) at garbage_collection.cc:893:7 [opt]
  frame #11: 0x0000000101d9df28 liblegion.1.dylib`Legion::Internal::DistributedCollectable::remove_gc_reference(this=0x00007fa3fd038400, cnt=1) at garbage_collection.cc:153:16 [opt]
  frame #12: 0x000000010229764f liblegion.1.dylib`Legion::Internal::PartitionTracker::remove_partition_reference() [inlined] Legion::Internal::DistributedCollectable::remove_base_gc_ref(this=0x00007fa3fd038400, source=REGION_TREE_REF, cnt=1) at garbage_collection.h:623:14 [opt]
  frame #13: 0x0000000102297621 liblegion.1.dylib`Legion::Internal::PartitionTracker::remove_partition_reference(this=<unavailable>) at region_tree.cc:17937:33 [opt]
  frame #14: 0x00000001022ba6c2 liblegion.1.dylib`Legion::Internal::RegionNode::notify_local(this=0x00007fa3fd020600) at region_tree.cc:16921:22 [opt]
  frame #15: 0x0000000101da0ee0 liblegion.1.dylib`Legion::Internal::DistributedCollectable::perform_downgrade(this=0x00007fa3fd020600, gc=0x0000700002134810) at garbage_collection.cc:893:7 [opt]
  frame #16: 0x0000000101d9df28 liblegion.1.dylib`Legion::Internal::DistributedCollectable::remove_gc_reference(this=0x00007fa3fd020600, cnt=1) at garbage_collection.cc:153:16 [opt]
  frame #17: 0x000000010226c6ff liblegion.1.dylib`Legion::Internal::RegionTreeForest::destroy_logical_region(this=<unavailable>, handle=LogicalRegion @ 0x00007000021348a0, applied=size=0, mapping=<unavailable>) at region_tree.cc:0 [opt]
  frame #18: 0x0000000102023d8a liblegion.1.dylib`Legion::Internal::DeletionOp::trigger_complete(this=0x00007fa3f430d900) at legion_ops.cc:10942:30 [opt]
  frame #19: 0x0000000101ff4966 liblegion.1.dylib`Legion::Internal::Operation::complete_execution(this=0x00007fa3f430d900, wait_on=RtEvent @ 0x0000700002134938) at legion_ops.cc:0 [opt]
  frame #20: 0x0000000102023c1c liblegion.1.dylib`Legion::Internal::DeletionOp::trigger_mapping(this=0x00007fa3f430d900) at legion_ops.cc:0 [opt]
  frame #21: 0x000000010237fa44 liblegion.1.dylib`Legion::Internal::Runtime::legion_runtime_task(args=0x00007fa3fbf14be8, arglen=<unavailable>, userdata=<unavailable>, userlen=<unavailable>, p=<unavailable>) at runtime.cc:32345:31 [opt]
  frame #22: 0x00000001036df2f9 librealm.1.dylib`Realm::LocalTaskProcessor::execute_task(this=<unavailable>, func_id=4, task_args=0x0000700002134ca8) at proc_impl.cc:1176:5 [opt]
  frame #23: 0x000000010371cc80 librealm.1.dylib`Realm::Task::execute_on_processor(this=0x00007fa3fbf14ab0, p=(id = 2089670227099910144)) at tasks.cc:326:40 [opt]
  frame #24: 0x0000000103722f16 librealm.1.dylib`Realm::KernelThreadTaskScheduler::execute_task(this=<unavailable>, task=<unavailable>) at tasks.cc:1421:11 [opt]
  frame #25: 0x0000000103721563 librealm.1.dylib`Realm::ThreadedTaskScheduler::scheduler_loop(this=0x00007fa3fc878010) at tasks.cc:1160:6 [opt]
  frame #26: 0x0000000103724ffe librealm.1.dylib`void Realm::Thread::thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop_wlock()>(void*) [inlined] Realm::ThreadedTaskScheduler::scheduler_loop_wlock(this=0x00007fa3fc878010) at tasks.cc:1272:5 [opt]
  frame #27: 0x0000000103724fea librealm.1.dylib`void Realm::Thread::thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop_wlock()>(obj=0x00007fa3fc878010) at threads.inl:97:5 [opt]
  frame #28: 0x0000000103728cce librealm.1.dylib`Realm::KernelThread::pthread_entry(data=0x00006000037b8160) at threads.cc:831:5 [opt]
I'm just going to keep pushing commits to this branch while the CI is slow: https://gitlab.com/StanfordLegion/legion/-/commit/d2166b12a92910dda9c5b339b78425fdfa0301e6
Yes, this is resolved.