ManyThreads / mythos

Many Threads Operating System
https://manythreads.github.io/mythos/
MIT License

I194 deadlock in pagemap #198

Open kubanrob opened 3 years ago

kubanrob commented 3 years ago

Proposal for fixing the deadlock triggered by recursive Cap deletion while holding CapEntry locks. Please review thoroughly: this needs more work before it is ready to merge and probably still has bugs.

This PR introduces CapEntry::lock_cap(), which protects the Capability in the CapEntry, and removes locking from tree traversal. Instead, the traversal uses a mechanism similar to double-checked locking: if we notice a race, the traversal is restarted from the root.
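To make the review easier, here is a minimal sketch of the "traverse without locks, lock, re-check, restart from the root" pattern described above. It is not the code in this PR: `Entry`, `capLock`, and `findLockedLeaf` are hypothetical stand-ins, the tree is flattened to a simple chain, and the sketch assumes that entries are never freed during traversal and that every writer holds `capLock` while changing `cap` or `next`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a capability entry (NOT the MyThOS CapEntry).
struct Entry {
    std::atomic<std::uint64_t> cap{0};  // capability value, readable without a lock
    std::atomic<Entry*> next{nullptr};  // successor; the tree is flattened to a chain here
    std::mutex capLock;                 // plays the role of lock_cap()/unlock_cap()
};

// Returns the last entry reachable from root with its capLock held.
// capOut receives the capability value that was validated under the lock;
// restarts counts how often a race forced a restart from the root.
Entry* findLockedLeaf(Entry* root, std::uint64_t& capOut, unsigned& restarts) {
    assert(root != nullptr);
    for (;;) {
        // Phase 1: optimistic traversal, no locks taken.
        Entry* cur = root;
        while (Entry* n = cur->next.load(std::memory_order_acquire)) {
            cur = n;
        }
        std::uint64_t seen = cur->cap.load(std::memory_order_acquire);

        // Phase 2: lock the candidate and repeat the checks made without the lock.
        cur->capLock.lock();
        bool stillLeaf = cur->next.load(std::memory_order_acquire) == nullptr;
        bool sameCap = cur->cap.load(std::memory_order_acquire) == seen;
        if (stillLeaf && sameCap) {
            capOut = seen;
            return cur;  // caller must unlock cur->capLock
        }

        // Phase 3: a concurrent update was observed; drop the lock and start over.
        cur->capLock.unlock();
        ++restarts;
    }
}
```

Because only one lock is held at a time, and only after the traversal has finished, this shape cannot deadlock on lock ordering between entries. The cost is that a traversal may repeat under contention. Whether the real implementation gives the same guarantees depends on how deletion and unlinking interact with the re-check, which is what needs review here.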

Please also comment on the documentation (or improve it yourself).

rottaran commented 3 years ago

Still fails with a deadlock, maybe a kernel fault:

1: app [app/init.cc:310] hello thread! ctx=0x0000000000000000
3: Test [app/init.cc:347] Success: res2
1: app [app/init.cc:312] thread in mutex ctx=0x0000000000000000
3: app [app/init.cc:354] sending notifications
    33: v=20 e=0000 i=0 cpl=0 IP=0010:ffffffff8138fed2 pc=ffffffff8138fed2 SP=0018:ffff800100208ff0 env->regs[R_EAX]=0000000000000003
RAX=0000000000000003 RBX=0000000000000000 RCX=ffff800100208e00 RDX=0000000000000000
RSI=0000000000000000 RDI=ffffffff818048d8 RBP=ffff800100208fe0 RSP=ffff800100208ff0
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000246
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff8138fed2 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
FS =0038 0000000001403050 ffffffff 00eff300 DPL=3 DS   [-WA]
GS =0040 0000000000000080 ffffffff 00eff300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0048 ffffffff814289b8 00002068 0000e900 DPL=3 TSS64-avl
GDT=     ffffffff8142aa20 00000057
IDT=     ffffffff8180c1c0 00000fff
CR0=e0050033 CR2=0000000000000000 CR3=0000000000111000 CR4=00040620
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000044 CCD=0000000000000000 CCO=EFLAGS  
EFER=0000000000000d01
    34: v=20 e=0000 i=0 cpl=0 IP=0010:ffffffff8138fed2 pc=ffffffff8138fed2 SP=0018:ffff800100205ff0 env->regs[R_EAX]=0000000000000003
RAX=0000000000000003 RBX=0000000000000000 RCX=ffff800100205e00 RDX=0000000000000000
RSI=0000000000000000 RDI=ffffffff81804858 RBP=ffff800100205fe0 RSP=ffff800100205ff0
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000246
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff8138fed2 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
FS =0038 0000000001403110 ffffffff 00eff300 DPL=3 DS   [-WA]
GS =0040 0000000000000040 ffffffff 00eff300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0048 ffffffff814268ec 00002068 0000e900 DPL=3 TSS64-avl
GDT=     ffffffff81428954 00000057
IDT=     ffffffff8180c1c0 00000fff
CR0=e0050033 CR2=0000000000000000 CR3=0000000000111000 CR4=00040620
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000044 CCD=0000000000000000 CCO=EFLAGS  
EFER=0000000000000d01
1: app [app/init.cc:315] thread resumed from wait ctx=0x0000000000000000
2: app [app/init.cc:315] thread resumed from wait ctx=0x0000000000000000
3: app [./runtime/SimpleCapAlloc.hh:81] free p=1025
3: cap [objects/RevokeOperation.cc:109] mythos::optional<void> mythos::RevokeOperation::_delete(mythos::CapEntry*, mythos::Cap) root=0xffff800000104078 rootCap=0xffff800000129240:original:in_transition:zombie:0
3: cap [objects/RevokeOperation.cc:159] bool mythos::RevokeOperation::_findLeaf(mythos::CapEntry*, mythos::Cap, mythos::CapEntry*&, mythos::Cap&) root=0xffff800000104078
3: cap [objects/RevokeOperation.cc:120] _findLockedLeaf returned *leaf=0xffff800000129240:original:in_transition:zombie:0 rootCap=0xffff800000129240:original:in_transition:zombie:0
3: cap [./objects/CapEntry.hh:133] bool mythos::CapEntry::try_lock_cap() this=0xffff800000104078  locked
    35: v=20 e=0000 i=0 cpl=0 IP=0010:ffffffff8138fed2 pc=ffffffff8138fed2 SP=0018:ffff800100208ff0 env->regs[R_EAX]=0000000000000003
RAX=0000000000000003 RBX=0000000000000000 RCX=ffff800100208e00 RDX=0000000000000000
RSI=0000000000000000 RDI=ffffffff818048d8 RBP=ffff800100208fe0 RSP=ffff800100208ff0
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000246
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff8138fed2 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0028 0000000000000000 ffffffff 00aff300 DPL=3 DS   [-WA]
FS =0038 0000000001403050 ffffffff 00eff300 DPL=3 DS   [-WA]
GS =0040 0000000000000080 ffffffff 00eff300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0048 ffffffff814289b8 00002068 0000e900 DPL=3 TSS64-avl
GDT=     ffffffff8142aa20 00000057
IDT=     ffffffff8180c1c0 00000fff
CR0=e0050033 CR2=0000000000000000 CR3=0000000000111000 CR4=00040620
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000044 CCD=0000000000000000 CCO=EFLAGS  
EFER=0000000000000d01
2: cap [./objects/CapEntry.hh:133] bool mythos::CapEntry::try_lock_cap() this=0xffff8000001292f8  locked
2: cap [objects/CapEntry.cc:169] mythos::Error mythos::CapEntry::try_lock_prev() this=0xffff8000001292f8 prev=0xffff800000100e08
2: cap [./objects/CapEntry.hh:105] bool mythos::CapEntry::try_lock_next(mythos::CapEntry*) this=0xffff800000100e08  locked
2: cap [./objects/CapEntry.hh:192] bool mythos::CapEntry::try_lock_next() this=0xffff8000001292f8  locked
2: cap [objects/CapEntry.cc:154] mythos::optional<void> mythos::CapEntry::unlinkAndUnlockLinks() this=0xffff8000001292f8
2: cap [objects/CapEntry.cc:158] this unlocks _next of predecessor
2: cap [objects/CapEntry.cc:161] this unlocks _next
2: cap [./objects/CapEntry.hh:133] bool mythos::CapEntry::try_lock_cap() this=0xffff800000412238  locked
2: cap [./objects/CapEntry.hh:150] void mythos::CapEntry::unlock_cap() this=0xffff800000412238
2: cap [objects/RevokeOperation.cc:109] mythos::optional<void> mythos::RevokeOperation::_delete(mythos::CapEntry*, mythos::Cap) root=0xffff800000412238 rootCap=0xffff8000008048c0:usable:original:0
2: cap [objects/RevokeOperation.cc:159] bool mythos::RevokeOperation::_findLeaf(mythos::CapEntry*, mythos::Cap, mythos::CapEntry*&, mythos::Cap&) root=0xffff800000412238
2: cap [objects/RevokeOperation.cc:120] _findLockedLeaf returned *leaf=0xffff8000008048c0:reference:in_transition:zombie:0 rootCap=0xffff8000008048c0:usable:original:0
2: cap [./objects/CapEntry.hh:133] bool mythos::CapEntry::try_lock_cap() this=0xffff800000100e08  locked
2: cap [objects/CapEntry.cc:169] mythos::Error mythos::CapEntry::try_lock_prev() this=0xffff800000100e08 prev=0xffff800000412238
2: cap [./objects/CapEntry.hh:105] bool mythos::CapEntry::try_lock_next(mythos::CapEntry*) this=0xffff800000412238  locked
2: cap [./objects/CapEntry.hh:192] bool mythos::CapEntry::try_lock_next() this=0xffff800000100e08  locked
2: cap [objects/CapEntry.cc:154] mythos::optional<void> mythos::CapEntry::unlinkAndUnlockLinks() this=0xffff800000100e08
2: cap [objects/CapEntry.cc:158] this unlocks _next of predecessor
2: cap [objects/CapEntry.cc:161] this unlocks _next
2: cap [objects/CapEntry.cc:68] void mythos::CapEntry::reset() this=0xffff800000100e08
2: cap [objects/RevokeOperation.cc:159] bool mythos::RevokeOperation::_findLeaf(mythos::CapEntry*, mythos::Cap, mythos::CapEntry*&, mythos::Cap&) root=0xffff800000412238