ssrg-vt / popcorn-kernel

Popcorn Linux kernel for distributed thread execution

Inconsistent address space across architectures #32

Closed beowulf closed 6 years ago

beowulf commented 6 years ago

Mohamed to Rob, me, Anthony

Hi guys,

As you know, I am encountering a bug when using Popcorn. In this email, I will first describe the bug, then the underlying problem (?), and then the possible solutions.

1) The bug happens when we start a (single-threaded) process on aarch64 and migrate it to x86_64. The kernel on x86_64 then prints the following message:

[221740.497345] traps: IS_B[22341] trap stack segment ip:508e30 sp:ffffffbffdd8 error:0 in IS_B[500000+203000]

2) The "ip" points to the first function to be executed on x86_64, which means that the stack is not accessible at all. Looking closely at the address of the stack (0xffff ffbf fdd8) shows that this address is illegal on x86_64: the user virtual address space on the latter runs from 0x0 to 0x7fff ffff ffff (at least on our Xeon).

So the problem comes from the difference between the virtual address space on aarch64 and the one on x86_64 (which has 48 bits of virtual address space [1], of which only the lower half is available to user space).
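As a minimal sketch of the check implied above, the faulting stack pointer can be compared against the x86_64 user limit quoted in the message. The helper name `fits_x86_64_user_space` and the macro `X86_64_USER_VA_MAX` are made up for illustration and are not part of Popcorn.

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest user-space virtual address on x86_64 as quoted in the message
 * above (0x7fff ffff ffff). Hypothetical macro name. */
#define X86_64_USER_VA_MAX UINT64_C(0x7fffffffffff)

/* Hypothetical helper: true if addr lies inside the x86_64 user range. */
static bool fits_x86_64_user_space(uint64_t addr)
{
    return addr <= X86_64_USER_VA_MAX;
}
```

Calling `fits_x86_64_user_space(UINT64_C(0xffffffbffdd8))` with the stack pointer from the trap message returns false, which matches the fault seen on the very first stack access.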

3) There are at least two possible solutions:

a) User-space: before migrating to x86_64, the "stack-rewriter" should allocate the new stack somewhere between 0x0 and 0x7fff ffff ffff.

b) Kernel-space: modify the kernel on aarch64 so that the address of the stack is between 0x0 and 0x7fff ffff ffff. Or, even better, make sure that the whole user virtual address space is between 0x0 and 0x7fff ffff ffff [2].

What do you guys think?

[1] https://software.intel.com/en-us/articles/introduction-to-x64-assembly
[2] https://www.kernel.org/doc/Documentation/arm64/memory.txt

Rob Lyerly to Mohamed, me, Anthony Jul 1

Hi all,

In my opinion this is really hard to tackle from user-space, mainly because the kernel decides where the stack starts. In user-space, one approach would be to mmap new stack space for the transformed (to x86) stack. But this still faces the fundamental problem of getting the kernel to allocate virtual address space within the compatible region; I don't personally know a way to get the kernel to allocate space below a certain virtual address, and attempts to allocate space at specific addresses are brittle (the result depends on what other memory the application has allocated, which makes it hard to generalize).
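To make the brittleness concrete, here is a rough sketch of the user-space approach using only the standard `mmap` hint mechanism. The helper name `map_stack_below_limit` and the macro `COMMON_USER_VA_LIMIT` are hypothetical, and nothing here is taken from the actual stack-rewriter. Without `MAP_FIXED` the kernel treats the address as a hint only, so the caller has to check where the mapping landed and give up if it is outside the shared range.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* First address above the x86_64 user range (1 << 47). Hypothetical name. */
#define COMMON_USER_VA_LIMIT UINT64_C(0x800000000000)

/* Hypothetical helper: map `size` bytes for a new stack near `hint`.
 * Returns NULL if mmap fails or the mapping is not entirely below the
 * limit; the kernel is free to ignore the hint. */
static void *map_stack_below_limit(void *hint, size_t size)
{
    void *p = mmap(hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if ((uint64_t)(uintptr_t)p + size > COMMON_USER_VA_LIMIT) {
        munmap(p, size); /* landed in the aarch64-only region: undo */
        return NULL;
    }
    return p;
}
```

Whether the hint is honored depends on what is already mapped around it, which is exactly the generalization problem described above; a kernel-side limit avoids the retry logic entirely.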

I personally think the best option would be to somehow get the kernel to be aware of cross-node acceptable memory regions. Sang-Hoon, what do you think?

…

Mohamed Lamine Karaoui to Rob, me, Anthony 9:19 AM

Yes, in the kernel it's better and simpler (just a macro to be changed). I tried the following patch and it seems to work fine. What do you think, Sang-Hoon?

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953c..1d03988 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -25,6 +25,7 @@
 #include <linux/const.h>
 #include <linux/types.h>
 #include <asm/sizes.h>
+#include <asm/page.h>

 /*
  * Allow for constants defined here to be used from assembly code
@@ -56,7 +57,12 @@
 #define PCI_IO_END             (MODULES_VADDR - SZ_2M)
 #define PCI_IO_START           (PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP            (PCI_IO_START - SZ_2M)
-#define TASK_SIZE_64           (UL(1) << VA_BITS)
+#ifdef CONFIG_POPCORN
+/* Align with x86 UVAS (arch/x86/include/asm/processor.h) */
+#define TASK_SIZE_64           ((UL(1) << (47))-PAGE_SIZE)
+#else
+#define TASK_SIZE_64           (UL(1) << (VA_BITS))
+#endif

 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32           UL(0x100000000)
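The effect of the patch can be checked with plain arithmetic. The sketch below mirrors the two `TASK_SIZE_64` definitions in user-space C, assuming `VA_BITS` is 48 and the page size is 4 KiB (the page size is an assumption; it is not stated in the thread). The `SKETCH_`-prefixed names and the function names are invented for this illustration.

```c
#include <stdint.h>

#define SKETCH_VA_BITS   48              /* assumed: 48-bit VA on arm64 */
#define SKETCH_PAGE_SIZE UINT64_C(4096)  /* assumed: 4 KiB pages */

/* Unpatched arm64 limit: (UL(1) << VA_BITS) */
static uint64_t task_size_arm64_default(void)
{
    return UINT64_C(1) << SKETCH_VA_BITS;
}

/* Patched limit under CONFIG_POPCORN: ((UL(1) << 47) - PAGE_SIZE) */
static uint64_t task_size_popcorn(void)
{
    return (UINT64_C(1) << 47) - SKETCH_PAGE_SIZE;
}
```

Under these assumptions the unpatched limit is 0x1000000000000 and the patched one is 0x7ffffffff000, so the faulting stack pointer 0xffffffbffdd8 is below the first but above the second. With the patch, the aarch64 kernel can no longer place the stack at an address that x86_64 rejects.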

…

Anthony Carno to Mohamed, Robert, me 9:40 AM

Mohamed,

Are you sure the ThunderX uses 4 levels of translation (48-bit)? I've only ever seen errors at levels 1, 2, and 3 (39-bit, according to that doc). Just checking!

Best,
Anthony

…

Mohamed Lamine Karaoui to Anthony, Rob, me 10:14 AM

I am not sure how many levels are used, but according to the default kernel configuration we have 48 bits of virtual address space:

$ grep ARM64_VA_BITS /boot/config-4.4.0-98-generic 
# CONFIG_ARM64_VA_BITS_39 is not set
CONFIG_ARM64_VA_BITS_48=y
CONFIG_ARM64_VA_BITS=48

and 4 levels of page table:

$ grep PGTABLE_LEVELS /boot/config-4.4.0-98-generic 
CONFIG_PGTABLE_LEVELS=4
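The two configuration values are consistent with each other if the kernel uses 4 KiB pages, which is an assumption here since the page-size option was not grepped. With 8-byte page table entries, each table level resolves `page_shift - 3` bits and the page offset covers `page_shift` bits. A small sketch of that arithmetic follows; the function name `va_bits` is made up.

```c
/* Virtual address bits covered by `levels` page table levels, assuming
 * 8-byte entries so that each level indexes (page_shift - 3) bits and
 * the page offset takes page_shift bits. */
static unsigned va_bits(unsigned page_shift, unsigned levels)
{
    return page_shift + levels * (page_shift - 3);
}
```

For 4 KiB pages (`page_shift` of 12), three levels give 39 bits and four levels give 48 bits, matching the 39-bit and 48-bit options in the configuration output.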

Hopefully, Sang-Hoon will be back today and shed some light on the question.

beowulf commented 6 years ago