Closed mkilgore closed 4 years ago
Thank you, the bug was fixed. I would imagine it would take long time to debug if faced this when I support SMP lol.
I think it's actually easier than you'd expect, besides spinlocks there really aren't that many spots that turn interrupts off (in my case, I suppose, but your kernel looks similar). So once you realize the interrupt state got screwed up there's really not that many places to look.
I only noticed this one because I made the exact same error a while back. Basically, in the spinlock code it's correct to save the value of the EFLAGS before turning interrupts off, but you cannot write that directly to the
m_eflags
member because if the lock is currently taken them_eflags
is holding the preserved eflags value for the current owner. If you overwrite this, when the owner callsrelease()
they will get your interrupt state instead of theirs, potentially screwing things up.Note that it's pretty hard to actually encounter this bug when you don't yet support SMP (like me), since you'll never spin on a lock anyway (because if interrupts are off, it's impossible for any other process to currently be holding the spinlock or try to acquire it). You still can hit this bug, however, if the caller gets an interrupt right after assigning
m_eflags
but before turning interrupts off. In that situation, the newly scheduled process could also assignm_eflags
and overwrite your value before you get to take the lock.