nuta / resea

A microkernel-based hackable operating system.

Zero-Copy IPC #19

Open nuta opened 4 years ago

nuta commented 4 years ago

Currently, Resea requires at least two message copies even in the IPC fast path. Worse, out-of-line payloads (analogous to the buf and len pair in UNIX's read(2)) involve an additional IPC with the pager task plus a buffer copy. (Please note that this design decision was made to keep the microkernel as simple as possible.)

As the kernel allows the pager task to map memory pages, I'm wondering if we could implement zero-copy IPC using the map system call and notifications, an asynchronous IPC mechanism similar to UNIX signals.

This issue tracks ideas and the progress of this feature.

nuta commented 4 years ago

Requirements

Ideas

arpitvaghela commented 4 years ago

Hey @nuta, I am a classmate of @yashrajkakkad and have looked into the issue.

I believe this is the procedure for implementing Zero-Copy IPC:

  1. handle flags
  2. prevent src from accessing *m (lock *m)
  3. check if dst is not blocked
  4. map *m to dst
  5. on done notification from dst
  6. unlock *m

However, I am unsure how locking and unlocking can be performed. Also, how will the done notification be conveyed by dst?

nuta commented 4 years ago

Hi @arpitvaghela. Here are my comments and suggestions:

First, you don't need to consider compatibility with the existing IPC APIs. It is hard to use zero-copy IPC transparently from the current IPC APIs (e.g. ipc_call(server, &m)) because the message buffer &m tends to be on the stack, so at least one message copy is needed. Therefore, please try adding new APIs for zero-copy IPC alongside the current ones.

The kernel cannot map a page by itself because it does not know which physical memory pages are available for page table structures. In other words, the kernel does not have a memory allocator by design. Also, mapping a page every time a message is sent would degrade performance.

I suggest mapping pages between tasks only once at initialization (i.e. creating a shared memory region) and then using lock-free algorithms and the notify system call (or a new system call if you need more features).

Because I'm not familiar with such algorithms, your suggestions are very welcome :)
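One lock-free option that fits the "map once, then notify" idea is a single-producer/single-consumer ring buffer placed in the shared pages. The following is only a minimal sketch using C11 atomics, not Resea code: `struct ring`, `RING_SLOTS`, `SLOT_SIZE`, and all `ring_*` functions are hypothetical names, and the notify system call that would wake the receiver is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>

#define RING_SLOTS 8  /* hypothetical slot count; must be a power of two */
#define SLOT_SIZE 64  /* hypothetical bytes per message slot */

static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
              "RING_SLOTS must be a power of two");

/* Assumed to live in the page(s) mapped into both tasks at initialization. */
struct ring {
    _Atomic uint32_t head; /* next slot to write; advanced only by the sender */
    _Atomic uint32_t tail; /* next slot to read; advanced only by the receiver */
    uint8_t slots[RING_SLOTS][SLOT_SIZE];
};

static void ring_init(struct ring *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Sender: returns a free slot to fill in place, or NULL if the ring is full. */
static void *ring_reserve(struct ring *r) {
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return NULL;
    return r->slots[head & (RING_SLOTS - 1)];
}

/* Sender: publishes the reserved slot. The release store makes the payload
 * visible before the new head. A real sender would notify the receiver next. */
static void ring_commit(struct ring *r) {
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Receiver: returns the oldest unread slot, or NULL if the ring is empty. */
static void *ring_peek(struct ring *r) {
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return NULL;
    return r->slots[tail & (RING_SLOTS - 1)];
}

/* Receiver: hands the slot back to the sender once it is done reading. */
static void ring_release(struct ring *r) {
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
}
```

In this sketch the sender writes the payload directly into the slot returned by `ring_reserve` and the receiver reads it in place via `ring_peek`, so no copy is made. Advancing `tail` in `ring_release` plays the role of the "done" signal, which means no explicit lock is needed as long as there is exactly one sender and one receiver per ring.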

JFYI, memory page mapping is done in the vm server. It calls sys_map multiple times because the kernel needs multiple memory pages (kpage) for the multi-level page table structures.

arpitvaghela commented 4 years ago

@nuta, can we implement POSIX shared memory (similar to shm in Linux)? However, it will require a little more support from the kernel.

nuta commented 4 years ago

Yes, we can implement such a feature in the vm server.
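For reference, this is roughly what the POSIX shared memory API mentioned above looks like on Linux. It is a sketch of the existing Linux interface (`shm_open`, `ftruncate`, `mmap`, `shm_unlink`), not of anything Resea provides; the object name `/resea_zero_copy_demo` and the helper `shm_roundtrip` are made up for illustration, and the two mappings in one process merely stand in for two tasks.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SHM_NAME "/resea_zero_copy_demo" /* hypothetical object name */
#define SHM_SIZE 4096

/* Writes one byte through one mapping and reads it back through a second
 * mapping of the same object. Returns the byte read, or -1 on error. */
static int shm_roundtrip(uint8_t value) {
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, SHM_SIZE) < 0) {
        close(fd);
        shm_unlink(SHM_NAME);
        return -1;
    }

    /* "Sender" view (read-write) and "receiver" view (read-only). */
    uint8_t *tx = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    uint8_t *rx = mmap(NULL, SHM_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mappings stay valid after the descriptor is closed */

    int result = -1;
    if (tx != MAP_FAILED && rx != MAP_FAILED) {
        tx[0] = value;  /* written once, in place */
        result = rx[0]; /* visible through the other mapping without a copy */
    }

    if (tx != MAP_FAILED)
        munmap(tx, SHM_SIZE);
    if (rx != MAP_FAILED)
        munmap(rx, SHM_SIZE);
    shm_unlink(SHM_NAME);
    return result;
}
```

A Resea equivalent would presumably have the vm server play the role of `shm_open` and `mmap`, setting up the mappings once via sys_map, but that interface is not defined in this thread. On older glibc versions this example may need to be linked with `-lrt`.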