Open Orycterope opened 5 years ago
bikeshed time:
Related; #102
I'm worried about having a &/&mut in kernel on a memory region that can be modified by another user thread. Should we use UnsafeCell somehow, to prevent compiler optimisations ? (Same reasoning as Mmio
Note that deref_mut() (or get_mut() as roblabla suggests) takes only a const reference to the ProcessMemory.
This is ok since we only check the bookkeeping, and we choose it to be allowed since the ProcessMemory semantically does not track the content of the memory, only the layout of its mappings.
This means that we can have several UserspacePtr mutably derefed at the same time, or a const source and a mutable destination. This is great for most use cases.
However no effort is made to check that they do not overlap, and this might become a problem if we hold several references to the same memory zone.
Actually, get should probably return T and not &T ? This would allow two things:
ldtr
instead of manually querying the page tables.UserSpacePtrMut would obviously have a set
.
So right now, we use UserSpacePtr almost everywhere HOS does. There's one place where we use UserSpacePtr even though we really shouldn't though: The address of the IPC command buffer. The kernel is supposed to remap it in kernelspace in order to access it.
IMO we should probably just reserve two pages of kernel VMEM that IPC command buffers will be remapped to when doing the copying. Then everything else can use UserSpacePtr.
UserSpacePtr should have a get
(and set for UserSpacePtrMut) which does a ptr::read()
(and a ptr::write()
) after ensuring the address is within the process' Address Space. It should not have a Deref, we should never be dereferencing userspace memory. This is especially important since we'll want to enable SMEP and SMAP (prevents execution and access of userspace memory from the kernel), which requires we centralize all access to userspace memory in a single function (which will take care of temporarily disabling SMAP).
So here's the thing: We want a copy_from_user
function whose goal is to safely copy bytes from a user-provided address (pointing to the user address-space) into the kernel address-space. If the user-provided address points to unmapped memory, non-readable memory, kernel-memory, or anything else that's potentially invalid, it should return an error.
On ARM, we have a function that basically does just that, which HOS uses. It drops its privileges by using ldtr
, which will execute an ldr
as if it was ran from EL0. The HOS exception handler checks if the instruction that faulted is within one of the "user to kernel/kernel to user" copy routines, and simply eret
s to the calling function (which is stored in X30, thankfully) with X0=0.
On x86, the situation is a lot more complicated. We have no ldtr
instruction. "Temporarily" dropping privileges is not doable. Furthermore, we don't have access to the parent function easily. So the problem is, how do you prevent faults if the user passes an address that's just unmapped for instance?
Now, we could do something very simple like walking the TLB to figure out if the access is OK before doing it (Asking for permission instead of asking for forgiveness). In the simple, single-processor no preemption world that SunriseOS currently live in, that would work fine, but it'd be a very brittle mechanism. But it's slow, and would probably be a bit more complicated in an SMP-enabled world.
Here's what linux does: It has a special section, called __ex_table
, that contains ranges of kernel addresses and an exception handler associated with that range. When the pagefault handler executes, it first checks if the faulting IP is contained within these ranges, and if it does, it will iret
to the exception handler specified. here's a more in-depth look.
So copy_from_user has basically two asm stubs, the "normal" paths just does the copy, and a separate "failure handling" subroutine that just reports the error. And the pagefault will iret to the failure handling subroutine. This should be performant, especially in the happy path.
I dislike the linux solution, so I went on a quest to see if I could do something better, something that didn't go through the exception handler. I've found something that could work, but relies on extensions that aren't present in most processors, the Intel Transactional Synchronization Extensions (TSX), and in particular the Restricted Transactional Memory (RTM) extension. Those extensions are absolutely not meant for what I'm going to show here, but it should, in theory, work if they are present.
RTM provides two new instructions of interest: XBEGIN
and XEND
. The idea is to make arbitrary memory accesses "atomic" - XBEGIN
has a "failure subroutine" opcode executed when the transaction fails. A transaction can fail for a wide variety of reasons, like if another processor accesses the memory we were writing in our transaction. However, there is one reason that's very interesting to us: A transaction will fail if we attempt accessing unmapped or otherwise unavailable memory. We can effectively bypass the Page Fault handler and handle it entirely in userspace!
A copy_from_user
would look like this:
mov esi, $0 // Move source
mov edi, $1 // Move dest
mov ecx, $2 // Move byte count
xbegin failed
rep movsb
xend
failed: ret
Upon returning from this function, if eax == 0
, the copy from user succeeded.
This is a super elegant solution, but the TSX instructions aren't widely available, so relying on it for such a critical aspect of the OS isn't a great idea. In particular, neither my desktop nor my server have the RTM feature enabled.
So we're back to square one. Still no good solution available. I'm currently swinging towards doing something similar to what linux is doing. In theory, we only need two such "special" functions that can be seen by the fault handler: a copy_from_user and a copy_to_user. Having special asm stub for them, that the fault handler detects, and returns to a special handler somehow should be doable.
We really need a UserspacePtr that checks address can be deref'd.
102
Here's my first attempt at an implementation, please give me your comments:
Jobs
The UserspacePtr is expected to fulfill the following goals:
We also choose to check user permissions. We define the rules as "we do only what the process could have done himself":
This means we'll never output in a read only mapping.
We might want to use a QueryMemory to get the mapping.
Limitations
You cannot make a UserspacePtr pointing to the address space of another process. Our UserspacePtr just derefs
address
, this will only work ifaddress
is in our cr3.To address a value in another address space, you need to create a CrossProcessMapping, and might consider add some additional checks for user accessible and readable-writable
Pseudo code