Now that the timer IRQ handler is calling resched(), any code that is not wrapped in a call to the interrupts::disable_then_restore() function can be interrupted to schedule a new process.
Since we do not support preemption in our kernel yet, we essentially wrap all scheduling code in disable_then_restore(). This prevents our kernel from being interrupted while holding important locks, such as the PIC locks or the process table locks.
However, one deadlock remains. When a process uses the allocator API to create structures like Vec or String, the timer interrupt can fire while the linked-list-allocator is holding the Heap lock. If the code that runs next also allocates or frees memory, it spins on that lock and never gets it, because the interrupted process cannot resume to release it.
We need to find a way to disable interrupts before using the allocator API, preferably without vendoring the linked-list-allocator code in this repo.
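The failure mode described above can be sketched in user space. This is only a model under stated assumptions: SpinFlag is a hypothetical, simplified stand-in for the spin mutex guarding the heap, and the "interrupt" is simulated by a second lock attempt on the same thread while the first holder has not released the lock. The spin is bounded here so the example terminates; a real spinlock would loop forever.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical, simplified stand-in for the spin mutex that guards the heap.
pub struct SpinFlag {
    locked: AtomicBool,
}

impl SpinFlag {
    pub const fn new() -> Self {
        SpinFlag { locked: AtomicBool::new(false) }
    }

    /// Try to take the lock, giving up after `max_spins` failed attempts.
    /// A real spinlock has no such limit and would keep looping.
    pub fn lock_bounded(&self, max_spins: usize) -> bool {
        for _ in 0..max_spins {
            if self
                .locked
                .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_ok()
            {
                return true;
            }
            std::hint::spin_loop();
        }
        false
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

/// Models a process that is interrupted while it holds the heap lock.
/// Returns whether the code that ran after the interrupt got the lock.
pub fn simulate_interrupted_allocation() -> bool {
    let heap_lock = SpinFlag::new();

    // The process enters the allocator and takes the heap lock.
    assert!(heap_lock.lock_bounded(1));

    // The timer IRQ fires here. The code that runs next also needs the heap,
    // but the interrupted holder never gets to run and release the lock.
    let acquired = heap_lock.lock_bounded(1_000);

    // Only reachable because the spin above is bounded.
    heap_lock.unlock();
    acquired
}
```

In this model simulate_interrupted_allocation() reports that the second attempt never acquires the lock, which matches the symptom of the same lock check repeating while nothing else makes progress.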
Possible Solutions
Naive solution
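We can wrap every allocator call in a process with disable_then_restore(). This is impractical and poor design: every allocation site would need the wrapper, including implicit ones such as the deallocation that happens when a value is dropped.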
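To make the cost of the naive approach concrete, here is a rough sketch of what call sites would look like. The disable_then_restore() below is a hypothetical user-space stand-in that toggles a flag instead of the CPU interrupt flag, and build_names() is an invented example function; neither is taken from the kernel.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Assumption: models the CPU interrupt flag so the sketch runs in user space.
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

/// Hypothetical stand-in for interrupts::disable_then_restore().
fn disable_then_restore<T>(f: impl FnOnce() -> T) -> T {
    let was_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
    let result = f();
    if was_enabled {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
    result
}

/// Every operation that may touch the heap has to be wrapped by hand.
pub fn build_names() -> usize {
    // The initial allocation.
    let mut names: Vec<String> = disable_then_restore(|| Vec::with_capacity(2));

    // Each push allocates a String and may also grow the Vec.
    disable_then_restore(|| names.push(String::from("init")));
    disable_then_restore(|| names.push(String::from("shell")));

    let count = names.len();

    // Dropping frees memory too, so even the drop must be explicit and wrapped.
    disable_then_restore(|| drop(names));
    count
}
```

A single forgotten wrapper, or a value that simply goes out of scope, would reintroduce the deadlock, which is why this option does not scale.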
Use syscall for all memory allocation
I don't know if this is feasible, since the allocator API is invoked automatically whenever a Vec or String is created. It seems possible but undesirable, since Vec and String creation would need to be wrapped to use the syscall.
Wrap the linked-list-allocator
We might be able to wrap the linked-list-allocator with our own allocator that disables interrupts and then calls the allocate/deallocate methods of linked-list-allocator. I think this is our best path forward.
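A minimal sketch of the wrapper idea, under several assumptions: it uses the stable GlobalAlloc trait (the 2017 nightly in the trace below predates it, so the real signatures would likely differ), the system allocator stands in for linked-list-allocator, and disable_then_restore() is a user-space stub that toggles a flag. InterruptSafeAlloc, Probe, and round_trip() are hypothetical names introduced for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Assumption: models the CPU interrupt flag so the sketch runs in user space.
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

/// Hypothetical stand-in for interrupts::disable_then_restore().
fn disable_then_restore<T>(f: impl FnOnce() -> T) -> T {
    let was_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
    let result = f();
    if was_enabled {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
    result
}

/// Forwards every call to the inner allocator with interrupts disabled.
pub struct InterruptSafeAlloc<A>(pub A);

unsafe impl<A: GlobalAlloc> GlobalAlloc for InterruptSafeAlloc<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        disable_then_restore(|| unsafe { self.0.alloc(layout) })
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        disable_then_restore(|| unsafe { self.0.dealloc(ptr, layout) })
    }
}

/// Demo-only inner allocator: forwards to the system allocator and counts
/// calls that arrive while "interrupts" are still enabled.
pub struct Probe {
    pub unprotected: AtomicUsize,
}

impl Probe {
    fn record(&self) {
        if INTERRUPTS_ENABLED.load(Ordering::SeqCst) {
            self.unprotected.fetch_add(1, Ordering::SeqCst);
        }
    }
}

unsafe impl GlobalAlloc for Probe {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        self.record();
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.record();
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Allocates and frees one block through the wrapper and returns how many
/// inner calls ran with interrupts enabled.
pub fn round_trip() -> usize {
    let heap = InterruptSafeAlloc(Probe { unprotected: AtomicUsize::new(0) });
    let layout = Layout::from_size_align(64, 8).unwrap();
    unsafe {
        let ptr = heap.alloc(layout);
        assert!(!ptr.is_null());
        heap.dealloc(ptr, layout);
    }
    heap.0.unprotected.load(Ordering::SeqCst)
}
```

In the kernel, the wrapper would presumably be the type registered as the global allocator, with the linked-list-allocator heap as its inner field, so Vec, String, and drops are all covered without touching call sites or vendoring the crate.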
Debugging
Frame #4 in the call stack below is where we trigger the deadlock in linked-list-allocator. You can tell it's a deadlock because interrupts stop firing and the same lock check keeps executing in a loop.
```
#4 0x0000000000131ac4 in linked_list_allocator::{{impl}}::dealloc (self=0x40004090,
    ptr=0x400022d8 "\000", layout=...)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/linked_list_allocator-0.4.3/src/lib.rs:176
```
Full debug info:
```
(gdb) list
1486
1487 #[inline]
1488 unsafe fn atomic_load(dst: *const T, order: Ordering) -> T {
1489 match order {
1490 Acquire => intrinsics::atomic_load_acq(dst),
1491 Relaxed => intrinsics::atomic_load_relaxed(dst),
1492 SeqCst => intrinsics::atomic_load(dst),
1493 Release => panic!("there is no such thing as a release load"),
1494 AcqRel => panic!("there is no such thing as an acquire/release load"),
1495 __Nonexhaustive => panic!("invalid memory ordering"),
(gdb) info stack
#0 core::sync::atomic::atomic_load (dst=0x147018 "\001\000",
order=core::sync::atomic::Ordering::Relaxed)
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/sync/atomic.rs:1491
#1 0x000000000014581c in core::sync::atomic::AtomicBool::load (
self=0x147018 , order=core::sync::atomic::Ordering::Relaxed)
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/sync/atomic.rs:316
#2 0x0000000000132d5f in spin::mutex::Mutex::obtain_lock (self=0x147018 )
at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/spin-0.4.6/src/mutex.rs:167
#3 0x0000000000132d95 in spin::mutex::Mutex::lock (self=0x147018 )
at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/spin-0.4.6/src/mutex.rs:191
#4 0x0000000000131ac4 in linked_list_allocator::{{impl}}::dealloc (self=0x40004090,
ptr=0x400022d8 "\000", layout=...)
at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/linked_list_allocator-0.4.3/src/lib.rs:176
#5 0x0000000000129964 in rxinu::__rg_allocator_abi::__rg_dealloc (arg0=0x400022d8 "\000",
arg1=8192, arg2=8) at src/lib.rs:123
#6 0x000000000011456e in alloc::heap::{{impl}}::dealloc (self=0x40001b70, ptr=0x400022d8 "\000",
layout=...)
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/heap.rs:104
#7 0x0000000000111e70 in alloc::raw_vec::RawVec::dealloc_buffer (self=0x40001b70)
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:687
#8 0x0000000000112f05 in alloc::raw_vec::{{impl}}::drop (self=0x40001b70)
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:696
#9 0x0000000000141595 in core::ptr::drop_in_place>
()
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#10 0x00000000001411cf in core::ptr::drop_in_place> ()
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#11 0x0000000000141b44 in core::ptr::drop_in_place>> ()
at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#12 0x0000000000121095 in rxinu::scheduling::cooperative_scheduler::{{impl}}::kill (
self=0x148bc8 <::deref::__stability::LAZY+8>, id=...) at src/scheduling/cooperative_scheduler.rs:84
#13 0x0000000000119703 in rxinu::scheduling::process::process_ret () at src/scheduling/process.rs:108
#14 0x0000000000148bc8 in ::deref::__stability::LAZY ()
#15 0x00000000001489f0 in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x0000000000000001 in ?? ()
#18 0x0000000000000000 in ?? ()
```