arbitrary futex pointers

andrewboie commented 4 years ago

Is your enhancement proposal related to a problem? Please describe. Right now, all k_futex objects, even though they can live in user memory, must be known to the kernel at build time.

This is because all k_futex pointers found are placed in the system's kernel object table generated by gperf. When a thread later makes a call like k_futex_wait(), the address of the futex is looked up in the table, and if found, the returned struct z_object has a link to that futex's associated z_futex_data, containing that futex's wait queue.

The current implementation was easy to do; we already had all the infrastructure for kernel objects and it was simple enough to add k_futex to the set of objects and generate its wait queue. But we can do better!

Describe the solution you'd like We'd like user mode to call k_futex_wait() on any user-accessible memory location. This is how it works in Linux. We'd like to be able to do this without requiring heap allocations.

The basic idea is to re-implement k_futex_find_data(). Stop using the kobject table, and instead maintain a special, separate runtime table of active futexes, i.e. futexes that at least one thread is waiting on. This table is keyed by the futex address. When we go virtual memory later, the table will need to be keyed by address plus whatever VM space the address is from.

On a k_futex_wait() syscall:

Validate the caller has access to the memory location like we currently do
Look up the futex address in the table. If present, wait on its wait_q_t.
If not present, create a table entry for it, and then wait, or return -ENOMEM if we ran out of table entries.

On a k_futex_wake() syscall:

Validate the caller has access to the memory location like we currently do
Look up the futex address in the table. If not found, nobody was waiting. Do nothing.
Otherwise, wake one or wake all waiters. If this wakes everyone up, remove from the table.

The nice thing is that the maximum number of table entries needed is a function of the number of active threads in the system; worst-case is a scenario of all threads waiting on different futexes.

We don't have a general-purpose runtime hash table data structure in Zephyr for the table. But we can make a red-black tree, and that should work for lookups. We similarly use one already for dynamic kernel objects.

So I'm thinking we can extend z_futex_data to look something like:

struct z_futex_data {
    /* Queue of pending threads */
    _wait_q_t wait_q;
    /* Protect wait queue */
    struct k_spinlock lock;
    /* Red-black tree node data */
    struct rbnode node;
    /* Address that was waited on */
    void *addr;
}

And instantiate a k_mem_slab pool of these. The number of entries can either be the number of k_threads detected by the kobject mechanism, or a Kconfig. Probably simplest to start with a Kconfig. There would then be a struct rbtree to maintain the table which starts out empty.

This approach should also be applicable to sys_mutex which is essentially a special kind of futex backed with a k_mutex instead of a _wait_q_t.

Describe alternatives you've considered None so far.

wentongwu commented 4 years ago

if not so urgent, I will take this one.

andrewboie commented 4 years ago

@wentongwu I think @jukkar was interested in picking this one up as it's needed for some network stack work he is doing, can you coordinate with him?

jukkar commented 4 years ago

@wentongwu I do not mind you doing this work, if you have time. I was just volunteering for this as I would need this feature in net_mgmt API. I am not familiar with this part of kernel so it might take some time to get things working.

jukkar commented 3 years ago

I do not have time to work on this, so unassigning.

zephyrbot commented 7 months ago

Hi @dcpleung,

This issue, marked as an Enhancement, was opened a while ago and did not get any traction. It was just assigned to you based on the labels. If you don't consider yourself the right person to address this issue, please re-assing it to the right person.

Please take a moment to review if the issue is still relevant to the project. If it is, please provide feedback and direction on how to move forward. If it is not, has already been addressed, is a duplicate, or is no longer relevant, please close it with a short comment explaining the reason.

@andrewboie you are also encouraged to help moving this issue forward by providing additional information and confirming this request/issue is still relevant to you.

Thanks!

zephyrproject-rtos / zephyr

arbitrary futex pointers #25497