Built with:
Apple clang version 11.0.0 (clang-1100.0.33.16)
Target: x86_64-apple-darwin19.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Xcode 11.3
Build version 11C29
While developing in the https://github.com/nanovms/nanos/tree/kernlock branch on macOS with clang (commit https://github.com/nanovms/nanos/commit/6ce5efc05f6bd0a96d80a95458a62b7259ae6992), I get the following crash soon after the runloop is started and interrupts are enabled:
QEMU interrupt logging shows an attempt to deliver an interrupt on vector 0x24, followed by a page fault at address 0x7f0002c0, which is 1 byte below the correct handler address for vector 0x24:
Looking at the computed contents of the IDT, we find that some, but not all, of the entries are 1 less than the correct value:
Note that the IDT entry for vector 0x24 is computed to be 0x7f008e01000802c0 instead of the correct value of 0x7f008e01000802c1. So let's look at the disassembly of start_interrupts, specifically the portion that builds the table. Note that write_idt is being inlined, which shouldn't be an issue. We see some unusual opcodes (emphasized with ***) that binutils objdump interprets as nops with operands:
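As an aside on the entry values quoted above: the low quadword of a 64-bit IDT gate packs the handler offset into two separate fields, so an off-by-one in the lowest bits of the entry maps directly onto the faulting address. The following is a minimal sketch of that decoding, assuming the standard x86-64 gate layout; the struct and function names are illustrative and are not taken from the nanos source.

```c
#include <stdint.h>

/* Fields packed into the low quadword of a 64-bit IDT gate descriptor.
 * Names are illustrative only; they do not come from the nanos tree. */
struct gate_lo {
    uint32_t offset_lo32; /* handler address bits 31:0 */
    uint16_t selector;    /* code segment selector (bits 31:16) */
    uint8_t  ist;         /* interrupt stack table index (bits 34:32) */
    uint8_t  type_attr;   /* present bit, DPL, gate type (bits 47:40) */
};

static struct gate_lo decode_gate_lo(uint64_t q)
{
    struct gate_lo g;
    /* Offset bits 15:0 live in bits 15:0, offset bits 31:16 in bits 63:48. */
    g.offset_lo32 = (uint32_t)(q & 0xffff) |
                    ((uint32_t)((q >> 48) & 0xffff) << 16);
    g.selector  = (uint16_t)((q >> 16) & 0xffff);
    g.ist       = (uint8_t)((q >> 32) & 0x7);
    g.type_attr = (uint8_t)((q >> 40) & 0xff);
    return g;
}
```

Decoding 0x7f008e01000802c0 this way gives an offset of 0x7f0002c0, the address reported in the page fault, while the correct entry 0x7f008e01000802c1 gives 0x7f0002c1. Both decode to selector 0x08, IST index 1, and type 0x8e, so only the handler offset differs.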
Given that SSE instructions are involved, I subsequently added assertions that the stack was 16-byte aligned on function entry and just before table building - and it was.
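For reference, the kind of alignment check described above can be sketched as follows. The helper name is hypothetical, and the actual assertions added to the kernel may look different; this only shows the predicate being tested.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of 16, the alignment that aligned SSE
 * loads and stores (e.g. movaps, movdqa) require of memory operands.
 * Hypothetical helper, not part of the nanos source. */
static bool is_aligned16(uintptr_t addr)
{
    return (addr & (uintptr_t)0xf) == 0;
}
```

In a build that keeps frame pointers, one way to use it would be `assert(is_aligned16((uintptr_t)__builtin_frame_address(0)))`, on the assumption that the saved frame pointer slot sits on a 16-byte boundary under the System V x86-64 ABI.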
Granted, this routine runs without crashing, albeit producing incorrect values. Adding `__attribute__((noinline))` to write_idt() produces correct entries and resolves the crash. With that change, neither start_interrupts() nor the stand-alone write_idt() contains the suspect opcodes.
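To illustrate the workaround and a way to check the table independently of the compiler's inlining decisions, here is a rough sketch. It assumes a hypothetical `write_idt_sketch()` with an invented signature, a code selector of 0x08, and gate type 0x8e (both taken from the entry values quoted earlier); the real write_idt() in nanos may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

#define N_VECTORS 256 /* x86 defines 256 interrupt vectors */

/* Hypothetical stand-in for write_idt(); the real signature may differ.
 * noinline forces an out-of-line call, which is the workaround above. */
__attribute__((noinline))
static void write_idt_sketch(uint64_t *idt, int vector,
                             uint64_t handler, uint64_t ist)
{
    const uint64_t selector = 0x08;  /* assumed kernel code segment */
    const uint64_t type_attr = 0x8e; /* present, DPL 0, interrupt gate */
    assert(vector >= 0 && vector < N_VECTORS);
    uint64_t *e = idt + 2 * vector;  /* two quadwords per gate */
    e[0] = (handler & 0xffff) | (selector << 16) | ((ist & 0x7) << 32) |
           (type_attr << 40) | (((handler >> 16) & 0xffff) << 48);
    e[1] = handler >> 32;            /* offset bits 63:32, rest reserved */
}

/* Reassemble the 64-bit handler address from a stored gate. */
static uint64_t gate_handler(const uint64_t *idt, int vector)
{
    const uint64_t *e = idt + 2 * vector;
    return (e[0] & 0xffff) | (((e[0] >> 48) & 0xffff) << 16) | (e[1] << 32);
}

/* Fill every vector with base + v * stride, then count the entries
 * whose decoded handler address does not match what was written. */
static int count_bad_gates(uint64_t *idt, uint64_t base, uint64_t stride)
{
    int bad = 0;
    for (int v = 0; v < N_VECTORS; v++)
        write_idt_sketch(idt, v, base + (uint64_t)v * stride, 1);
    for (int v = 0; v < N_VECTORS; v++)
        if (gate_handler(idt, v) != base + (uint64_t)v * stride)
            bad++;
    return bad;
}
```

A read-back check like `count_bad_gates()` could be run right after the table is built, with and without the noinline attribute, to narrow down which vectors come out wrong before single-stepping through the inlined code.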
I added a volatile to the lidt asm inline, placed copious memory barriers, etc., but to no avail. The crash occurs the same way whether with hvf acceleration or run-noaccel (TCG).
I can't see the purpose of those "nop"s. Though it seems far-fetched, I thought perhaps they were multi-byte nops that enclosed a table of some kind, yet I don't see any reference to those locations in the code.
I haven't isolated or hypothesized any cause, and I still need to single-step through the inlined table build. Opening an issue now to keep track of progress.