beowulf closed 6 years ago
The page table translation level is determined automatically by the number of bits of address space, so we don't need to worry about whether it is 3-level or 4-level for now.
Mohamed's patch seems reasonable in that it forces the stack to start at the same address as on x86. However, the address space is still 48 bits wide, and I am not sure whether any code might be sensitive to this. Anyway, I've tested Mohamed's patch with toy applications, and they look fine for now. Thanks, Mohamed.
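As a rough illustration of how the number of translation levels follows from the number of address bits, here is a minimal sketch. It assumes each page table fills exactly one page with 8-byte entries, as in the arm64 layouts described in the kernel's memory documentation; the function name `pgtable_levels` is made up for this example and is not a kernel symbol.

```python
def pgtable_levels(va_bits: int, page_shift: int) -> int:
    """Number of translation levels needed to map va_bits of address space.

    Assumption: every table occupies one page and holds 8-byte entries, so
    each level resolves (page_shift - 3) bits, and the page offset itself
    covers the lowest page_shift bits.
    """
    bits_per_level = page_shift - 3
    # Ceiling division of the bits that remain above the page offset.
    return -(-(va_bits - page_shift) // bits_per_level)


# 4 KB pages (page_shift = 12): compare 39-bit and 48-bit address spaces.
for va_bits in (39, 48):
    print(va_bits, "bits ->", pgtable_levels(va_bits, page_shift=12), "levels")
```

With 4 KB pages this gives 3 levels for 39 bits and 4 levels for 48 bits, which matches the two configurations discussed in this thread.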
Mohamed to Rob, me, Anthony
Hi guys,
As you know, I am encountering a bug when using Popcorn. In this email, I will first describe the bug, then the underlying problem (?), and then the possible solutions.
1) The bug happens when we start a (single-threaded) process on aarch64 and migrate it to x86_64. The kernel on x86_64 then prints the following message:
[221740.497345] traps: IS_B[22341] trap stack segment ip:508e30 sp:ffffffbffdd8 error:0 in IS_B[500000+203000]
2) The "ip" points to the first function to be executed on x86_64, which means that the stack is not accessible at all. A close look at the stack address (0xffff ffbf fdd8) shows that it is illegal on x86_64: the virtual address space on the latter runs from 0x0 to 0x7fff ffff ffff (at least on our Xeon).
So the problem comes from the difference between the virtual address space on aarch64 and the one on x86_64 (with 48 bits of virtual address space [1]).
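To make the mismatch concrete, the following sketch checks the stack pointer from the trap message against both ranges. It assumes that x86_64 with 48-bit virtual addresses leaves user space the lower 47 bits (the 0x7fff ffff ffff limit quoted above) and that aarch64 configured for 48-bit virtual addresses gives user space the full 48 bits. The helper names are illustrative only.

```python
# Assumed number of virtual address bits available to user space.
X86_64_USER_BITS = 47   # lower half of a 48-bit address space
AARCH64_USER_BITS = 48  # 48-bit VA configuration, user half


def user_top(bits: int) -> int:
    """Highest user-space address for the given number of user VA bits."""
    return (1 << bits) - 1


def is_valid_user_address(addr: int, bits: int) -> bool:
    """True if addr falls inside the user range [0, user_top(bits)]."""
    return 0 <= addr <= user_top(bits)


sp = 0xFFFFFFBFFDD8  # stack pointer reported in the trap message

print(hex(user_top(X86_64_USER_BITS)))               # 0x7fffffffffff
print(is_valid_user_address(sp, AARCH64_USER_BITS))  # True
print(is_valid_user_address(sp, X86_64_USER_BITS))   # False
```

The stack pointer is a legal user address under the assumed aarch64 layout but lies above the x86_64 limit, which is consistent with the stack segment trap.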
3) At least two possible solutions:
a) User space: before migrating to x86_64, the "stack-rewriter" should allocate the new stack somewhere between 0x0 and 0x7fff ffff ffff.
b) Kernel space: modify the kernel on aarch64 so that the address of the stack is between 0x0 and 0x7fff ffff ffff. Or, even better, make sure that the user's virtual address space is between 0x0 and 0x7fff ffff ffff [2].
What do you guys think?
[1] https://software.intel.com/en-us/articles/introduction-to-x64-assembly
[2] https://www.kernel.org/doc/Documentation/arm64/memory.txt
Rob Lyerly to Mohamed, me, Anthony Jul 1
Hi all,
In my opinion, this is really hard to tackle from user space, mainly because the kernel decides where the stack starts. In user space, one approach would be to mmap new stack space for the transformed (to x86) stack. But this still faces the fundamental problem of getting the kernel to allocate virtual address space within the compatible region; I don't personally know a way to get the kernel to allocate space below a certain virtual address, and attempts to allocate space at specific addresses are brittle (the outcome depends on what other memory the application has allocated, which makes it hard to generalize).
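The point that user space can only observe where the kernel places a mapping can be sketched as follows. This is only an illustration, not part of the stack transformation code: it maps one anonymous page, reads back the address the kernel chose, and checks it against the 0x7fff ffff ffff limit. The names `region_is_portable` and `probe_anonymous_mapping` are hypothetical.

```python
import ctypes
import mmap

# Limit quoted in the thread: top of the x86_64 user address range.
X86_64_USER_TOP = (1 << 47) - 1


def region_is_portable(start: int, length: int) -> bool:
    """True if [start, start + length) lies entirely at or below the limit."""
    return start >= 0 and length > 0 and start + length - 1 <= X86_64_USER_TOP


def probe_anonymous_mapping(length: int = mmap.PAGESIZE) -> int:
    """Map anonymous memory and return the address the kernel picked."""
    region = mmap.mmap(-1, length)
    view = ctypes.c_char.from_buffer(region)
    try:
        return ctypes.addressof(view)
    finally:
        del view        # drop the buffer export so the mapping can be closed
        region.close()


addr = probe_anonymous_mapping()
print(hex(addr), region_is_portable(addr, mmap.PAGESIZE))
```

The check happens after the fact: the caller learns whether the region is usable on both architectures but has no control over the placement, which is why a kernel-side fix looks more robust.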
I personally think the best option would be to somehow get the kernel to be aware of cross-node acceptable memory regions. Sang-Hoon, what do you think?
…
Mohamed Lamine Karaoui to Rob, me, Anthony 9:19 AM
Yes, in the kernel it's better and simpler (just a macro to be changed). I tried the following patch and it seems to work fine. What do you think, Sang-Hoon?
…
Anthony Carno to Mohamed, Robert, me 9:40 AM
Mohamed,
Are you sure the ThunderX uses 4 levels of translation (48-bit)? I've only ever seen errors at levels 1, 2, and 3 (39-bit, according to that doc). Just checking!
Best,
Anthony
…
Mohamed Lamine Karaoui to Anthony, Rob, me 10:14 AM
Not sure how many levels are used, but according to the default kernel configuration we have 48 bits of virtual address space:
and 4 levels of page table:
Hopefully, Sang-Hoon will be back today and shed some light on the question.