Open perlun opened 5 years ago
Digging deeper, I think that deleting any thread triggers this bug. The likely cause is that the dispatcher tries to dispatch a thread which has already been deleted (CS:EIP was exactly the same as in the page fault I ran into). So, thread deletion is likely completely broken.
I looked briefly at the `thread_unlink_list` source code and couldn't find any obvious issue with it. I will have to dig further into this at some later point.
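For reference while reviewing the unlink path, here is a minimal sketch of what a correct unlink from a doubly linked list generally has to do. This is not the actual `thread_unlink_list` code: the `node_t` type and the `list_push` and `list_unlink` names are hypothetical, and the real TSS list layout may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the kernel's real list structure may differ. */
typedef struct node
{
    struct node *previous;
    struct node *next;
    int thread_id;
} node_t;

/* Insert a node at the head of the list. */
static void list_push(node_t **head, node_t *node)
{
    node->previous = NULL;
    node->next = *head;

    if (*head != NULL)
    {
        (*head)->previous = node;
    }

    *head = node;
}

/* Unlink a node and clear its pointers, so that any later use of the
   stale node is easier to spot than a dangling next pointer would be. */
static void list_unlink(node_t **head, node_t *node)
{
    assert(head != NULL && node != NULL);

    if (node->previous != NULL)
    {
        node->previous->next = node->next;
    }
    else
    {
        /* No predecessor, so the node must be the current head. */
        assert(*head == node);
        *head = node->next;
    }

    if (node->next != NULL)
    {
        node->next->previous = node->previous;
    }

    node->previous = NULL;
    node->next = NULL;
}
```

The two cases worth comparing against the real code are unlinking the head (the list head pointer itself has to move) and unlinking the last node (there is no successor to patch). In the kernel, both would presumably also have to happen with the list mutex held.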
Update: more details in a follow-up comment below. This is really trivial to reproduce.
This is a bit vague; I am not fully sure what happens here. But at certain points, if certain errors occur, we can run into a page fault that looks like this:
I have a clear way to reproduce it and some ideas about why this happens so here goes:
How to reproduce
Apply the diff above and you should get this error right on bootup.
gdb excerpt
Here is a `gdb` excerpt from when this happens:
I have a feeling that the `tss_list` gets corrupted at some point (perhaps because of incorrect mutexing). Look at the `next` pointers in this trace:
The next step would be to use the `watch tss_list` approach as suggested in #21 to see how the TSS list gets modified. Maybe that would help us identify the source of the corrupted memory.
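As a complement to the watchpoint, a small consistency check could be called at suspicious points (for example before and after thread deletion) to narrow down where the list first goes bad. The sketch below is only an illustration under assumptions: `entry_t` and `list_is_consistent` are hypothetical names, and it assumes a doubly linked list with `previous` and `next` pointers, which may not match the real TSS list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list entry; the real TSS list structure may differ. */
typedef struct entry
{
    struct entry *previous;
    struct entry *next;
} entry_t;

/* Walk the list from head and return true only if every previous pointer
   matches the entry we just came from and the walk ends within
   max_entries steps. A cycle or runaway chain returns false. */
static bool list_is_consistent(const entry_t *head, size_t max_entries)
{
    const entry_t *previous = NULL;
    size_t count = 0;

    assert(max_entries > 0);

    for (const entry_t *entry = head; entry != NULL; entry = entry->next)
    {
        count++;

        if (count > max_entries)
        {
            return false;
        }

        if (entry->previous != previous)
        {
            return false;
        }

        previous = entry;
    }

    return true;
}
```

The `max_entries` bound is there so that a corrupted `next` pointer forming a loop cannot hang the check. It cannot catch a `next` pointer that points into unmapped memory; that would still page fault, which is at least the same symptom at a more useful location.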