Consolidate threads, tcaps, and rcv end-points

This is a proposal to figure out a sane way to consolidate three separate kernel objects together.

Background

Currently, we have three completely separate kernel objects:

threads
tcaps
rcv end-points

rcv end-points must be associated with a specific thread and tcap. Each thread can be associated with a single rcv end-point, and a tcap can be associated with multiple rcv end-points. Each tcap must be associated with at least one rcv end-point and a thread. Threads can exist completely unrelated to rcv end-points. rcv end-points are related to each other in a tree that depicts the scheduling hierarchy.
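To make these constraints concrete, here is a minimal C model of the relationships described above. It is only a sketch: the struct names, fields, and `rcv_model_attach` are hypothetical and do not correspond to the actual kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three objects; not the real kernel structures. */
struct rcv_model;

struct thd_model {
	struct rcv_model *rcv;     /* at most one rcv end-point, or NULL */
};

struct tcap_model {
	struct thd_model *thd;     /* thread associated with this tcap (not set here) */
	unsigned int      nrcv;    /* number of rcv end-points using this tcap */
};

struct rcv_model {
	struct thd_model  *thd;    /* required: exactly one thread */
	struct tcap_model *tcap;   /* required: exactly one tcap */
	struct rcv_model  *parent; /* scheduling hierarchy (tree); NULL at the root */
};

/*
 * Attach a rcv end-point to a thread and a tcap.
 * Returns 0 on success, -1 if the thread already has a rcv end-point.
 */
static int
rcv_model_attach(struct rcv_model *r, struct thd_model *t, struct tcap_model *tc,
                 struct rcv_model *parent)
{
	assert(r != NULL && t != NULL && tc != NULL);
	if (t->rcv != NULL) return -1; /* a thread has at most one rcv end-point */

	r->thd    = t;
	r->tcap   = tc;
	r->parent = parent;
	t->rcv    = r;
	tc->nrcv++;                    /* a tcap may back many rcv end-points */

	return 0;
}
```

Even in this toy form, user-level has three objects and three kinds of cross-references to keep consistent, which is the bookkeeping the proposal wants to reduce.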

@phanikishoreg @ryuxin @hungry-foolish @RobertGiff @lab176

Problems

These are an annoying set of constraints, though they would mostly be hidden by libraries.
They also result in multiple, separate kernel objects, 2 of the 3 requiring kernel memory beyond the capability table slot. Threads and tcaps take a whole page (currently), but suffer from a large amount of internal fragmentation.
The kernel structures are dynamically associated with each other, thus requiring user-level to track those relationships, and some awkward edge-cases (cannot transfer budget into a tcap that hasn't yet been associated with a rcv end-point).
All of these inter-relations between kernel objects lead to complexity and extra LoC.
Though the current design is based on the principle of orthogonality (separate conceptual functionalities should be separated in the API), I believe that the user-exposed API can be reasonable (see the cos_defkernel_api for an example of consolidating these), while significantly simplifying the kernel.

Proposal

If it isn't obvious by now: figure out some way to consolidate some or all of these kernel objects. The expected benefits would be lower system complexity and fewer system resources, making the system easier to manage, maintain, and explain. I'm sure that there are many options here, but I'll spell out a few that might be reasonable.

Option 1: Consolidate `rcv` end-points and `tcaps`

As rcv end-points require a tcap and vice-versa, why not consolidate them together? Make the rcv end-point include the tcap page as well. Since the rcv capability will point to a tcap, activating it can either provide another rcv end-point whose tcap should be used (refcnted), or we can provide kernel memory to allocate the new tcap for it.

Benefits. When referencing a thread capability in the API, it is unambiguous that we are performing a switch, as they are isolated from the consolidation.

Downsides. Still waste memory due to internal fragmentation for threads and tcaps. Still have inter-kernel object references that we need to track (rcv end-point to thread).

Option 2: Consolidate into only `rcv` end-points, and threads

This takes Option 1 further by making the rcv end-points include both the thread and the tcap. Threads can also be created separately. This API looks very similar to what the cos_defkernel_api provides. The kernel API enables operations on rcv end-points and on threads, and consolidates the tcap APIs into the new rcv API. I'd likely think of these as "threads" and either "thread end-points", or "receive threads".

Alternatively, rcv end-points can be kept completely separate, and threads just consolidate with tcaps. It doesn't really simplify any API, as creating a rcv still requires us to pass in a thread capability and another thread for the tcap.

Benefits. rcv end-points are now memory-efficient as they include a single kernel allocation for both the thread and the tcap. The thread capability still has only a single primary function to be performed on it.

Downsides. It is a little awkward to have executable threads referenced through two different types of capabilities. I'm not sure what the edge-cases around this are.

Option 3: Consolidate them all together into threads

Get rid of rcv and tcap kernel resources, and keep only threads. Inline the tcaps into the thread structure. The entire rcv and tcap APIs are integrated into the thread API. rcv end-points that share tcaps will look like threads with dependencies on each other. Scheduling hierarchy rcv relationships also look like dependencies between threads.

Benefits. From 3 to 1 concepts; the simplest option from a kernel object perspective.

Downsides. We're cramming quite a few concepts into the concept of a thread. The API gets a little confusing as everything looks like dependencies between threads. Sometimes simplicity hides complexity in the client-facing API. asnd end-points are hooked into threads, which is somewhat awkward.

An important way to look at this problem is with respect to access rights. For example, for Option 3, if a component has a thread capability, then it would (without some logic added) be able to both call rcv and switch to the thread. It is certainly not the intention that any component that is handling interrupts will also be able to schedule the thread. So if we go in the direction of Option 3, then each of the thread capabilities will also need access permissions: which actions can be taken on the resource.
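As a rough illustration of what per-capability access permissions could look like under Option 3, here is a minimal sketch. The operation names and the `thd_cap_model` type are hypothetical and are not part of the existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation bits; these names are not part of the existing API. */
enum thd_cap_op {
	THD_OP_DISPATCH = 1u << 0, /* switch to (schedule) the thread */
	THD_OP_RCV      = 1u << 1, /* await activations/events on the thread */
	THD_OP_ASND     = 1u << 2  /* asynchronously activate the thread */
};

struct thd_cap_model {
	uint32_t allowed; /* bitmask of thd_cap_op values */
};

/* Returns 1 if the capability permits op, 0 otherwise. */
static int
thd_cap_permits(const struct thd_cap_model *cap, enum thd_cap_op op)
{
	assert(cap != 0);
	return (cap->allowed & (uint32_t)op) != 0;
}
```

With something like this, an interrupt-handling component could hold a capability with only `THD_OP_RCV` set, so it can wait on the thread but cannot schedule it.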

I think the first option is the best. The second option allows threads to be created separately, which may introduce two different ways of thread creation, which is a downside. The third option is pretty awkward, as when you create an arcv endpoint, you create a tcap and a thread. If we need multiple arcv endpoints and just one thread, this is pretty strange. Of course, this is my subjective idea.

The option I'm thinking of:

Consolidating kernel objects into one but still having three different capabilities. We'd have:

    struct need_a_name { /* like the cos_defkernel_api's struct cos_aep_info */
            struct thread {
            };
            struct rcv {
            };
            struct tcap {
            };
    };

Each capability still has a pointer to the corresponding object in this layout. cos_thd_alloc creates the thread object and does not touch the remainder. cos_tcap_alloc really should not be a system call; instead, it should be a call to allocate a capability slot at user-level (or we can get rid of this call!). cos_arcv_alloc initializes the rcv cap and tcap, along with their kernel objects that are inline with the thread object. Note: neither cos_arcv_alloc nor cos_tcap_alloc will pass in a memory slot for the kernel data structure. We'd just use the thread capability (because thread and rcv caps have 1:1 associativity) and use the rcv and tcap from that object. (Of course, we'll have checks to make sure that any of these objects is recreated only after it has been deleted, perhaps through a dirty flag in each of these structs.)

In many cases we'll have normal threads that are not associated with a rcv or tcap; in such cases, the struct rcv and struct tcap portions will remain unused or reserved. This is analogous to the current thread object, where anything beyond the size of struct thread is unused in that PAGE.
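A slightly more filled-in sketch of that layout, with the page-size check and the "recreate only after delete" check, might look like the following. The field contents, the `in_use` flags (standing in for the dirty flag), `MODEL_PAGE_SIZE`, and the function names are all assumptions for illustration. It also ignores the case where a rcv shares a tcap that lives in another object.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/* Placeholder bodies; the real structures have many more fields. */
struct thd_part  { unsigned long regs[16]; };
struct rcv_part  { int in_use; unsigned int pending; };
struct tcap_part { int in_use; unsigned long budget; };

/* Hypothetical consolidated kernel object. */
struct consolidated_obj {
	struct thd_part  thd;  /* always initialized by thread allocation */
	struct rcv_part  rcv;  /* reserved until a rcv is allocated */
	struct tcap_part tcap; /* reserved until a rcv is allocated */
};

/* The consolidated object must fit in a single page. */
_Static_assert(sizeof(struct consolidated_obj) <= MODEL_PAGE_SIZE,
               "consolidated object exceeds one page");

/*
 * Initialize the inline rcv and tcap parts of an existing thread object.
 * Returns 0 on success, -1 if they are still in use (must be deleted first).
 */
static int
obj_rcv_init(struct consolidated_obj *o)
{
	assert(o != NULL);
	if (o->rcv.in_use || o->tcap.in_use) return -1;

	o->rcv.in_use  = 1;
	o->rcv.pending = 0;
	o->tcap.in_use = 1;
	o->tcap.budget = 0;

	return 0;
}

/* Mark the inline rcv and tcap parts as reusable. */
static void
obj_rcv_delete(struct consolidated_obj *o)
{
	assert(o != NULL);
	o->rcv.in_use  = 0;
	o->tcap.in_use = 0;
}
```

The static assertion turns the "must fit in one page" requirement into a compile-time failure rather than something discovered at runtime.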

Pros:

This still allows for some form of principle of orthogonality except perhaps allocation of tcap and rcv may need to be consolidated.
This also makes efficient use of memory in the kernel. Allocations and deallocations will still be the same, and so will the rest of the system semantics.
This should also be faster: in most cases within the kernel, every time we access a rcv object or tcap object, we tend to access all three of the associated objects, so it helps if they're in one PAGE and can be reached without additional TLB misses (except for capability object accesses).

Cons:

The most important requirement is that the consolidated object be no larger than PAGE_SIZE.
Some minor complications with respect to tracking of unused or reserved portion of the kernel object if only thread was created.
Because tcap to rcv is 1:m associative, initializing the rcv cap could be slightly more tedious: rcv_alloc needs to know whether the provided tcap is a new one or an existing one.

This mainly focuses on efficient memory usage and performance, definitely not on consolidating the concepts of thread, rcv and tcap. I think we should not consolidate the concepts into just a thread capability, for the reasons mentioned in the downsides of Option 3.

This is somewhat like option 3 except the capabilities are not consolidated. We'd do this only if all three objects fit into 1 PAGE of course.

I've been thinking about this a lot recently, but from a different direction: How can we consolidate the control-flow operations of the kernel into the smallest number of orthogonal abstractions, and do we want to do this? I need to write this all down, so that I can focus on other things. As @phanikishoreg pointed out, we might consolidate kernel structures, but not user-level abstractions.

What control-flow APIs do we have currently, and what are their properties?

Thread dispatch on thread capabilities (async activation). This takes the thread capability denoting the thread to switch to, a timeout to optionally program the one-shot timer, a rcv cap that is used to aggregate scheduler events and coordinate their processing into a sequential thread, a tcap to be used to execute the thread, and a priority with which to program the tcap (allowing a single tcap to be used across different threads). This might cause a protection domain switch depending on the active component in which the thread is executing. The last argument to dispatch is the synchronization token used to detect the race where the current thread is preempted after making a scheduling decision, then later completes the dispatch call.

A secondary mode for this call is if it is invoked by the scheduler thread (identified by the thread associated with the rcv cap passed in). In this case, this call will only switch threads if the scheduler event queue is flushed.
Synchronous invocation between components. This takes only arguments to be passed to the invoked "server", and resumes this thread's execution at the entry point of the server.
Return from an invoked component. Not strictly performed on a capability (as we use magic capability 0) which resumes the synchronous execution in the client.
Explicit asynchronous activation and time delegation via asnd. Activate the thread associated with the rcv end-point associated with the asnd. Switch immediately to the thread if the tcap has higher utility, or if a yield flag is passed to asnd. Switch to the rcv capability's associated thread, and begin executing using its tcap. Send a scheduler notification to that thread's scheduler rcv capability. If a tcap "delegation" is being performed, also pass the CPU amount to be delegated, and the priority with which to do so.
Invocation of a rcv capability to signify the "end" of an asynchronous activation. This signifies that the current thread does not want to execute any more. However, if there is a pending activation (tracked with an asnd count), this call will return immediately. Otherwise, the scheduler rcv capability's thread is activated, and passed the first scheduling event. The scheduler thread iterates and reads all scheduler events out of the kernel (to be replaced with a shared memory protocol in the future), processing them in turn (i.e. activating or blocking threads). Once a scheduler is done running (idle), it can pass a block flag to rcv to switch to the parent scheduling thread.
Interrupt activation of a hardware asnd. Switch to a thread associated with the rcv hooked to the hardware's interrupt asnd if and only if its associated tcap has a higher utility than the currently active tcap.
Exception execution via synchronous invocations. CPU exceptions and some software-defined exceptions can trigger sinv activation at specific offsets in the captbl. This is identical to sinvs, except that all registers are saved (and restored on return) instead of passing explicit arguments.
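Two of the decisions in the list above are easy to lose in the prose: what rcv does when activations are pending, and when an asnd or interrupt switches immediately. A minimal sketch of both follows. The types and names are hypothetical, utility is modeled as a plain integer where larger means higher (an assumption), and consuming one pending activation per rcv call is also an assumption about how the asnd count is used.

```c
#include <assert.h>

/* Hypothetical, simplified state; not the kernel's actual structures. */
struct rcv_state {
	unsigned int pending; /* asnds that arrived while the thread was not waiting */
};

enum rcv_result {
	RCV_RETURN_IMMEDIATELY, /* a pending activation was consumed */
	RCV_SWITCH_TO_SCHEDULER /* nothing pending: activate the scheduler's thread */
};

/* Model of invoking a rcv capability at the "end" of an activation. */
static enum rcv_result
rcv_model(struct rcv_state *r)
{
	assert(r != 0);
	if (r->pending > 0) {
		r->pending--; /* assumption: one pending activation consumed per call */
		return RCV_RETURN_IMMEDIATELY;
	}
	return RCV_SWITCH_TO_SCHEDULER;
}

/*
 * Decide whether an asnd or interrupt switches to the destination thread now.
 * Interrupts never pass the yield flag, so they rely on the utility comparison.
 */
static int
asnd_switch_now(unsigned int dst_utility, unsigned int cur_utility, int yield)
{
	return yield || dst_utility > cur_utility;
}
```

Seen this way, the interrupt path is just the asnd path with `yield` fixed at 0.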

(@phanikishoreg Did I miss anything? Did I mis-characterize anything?)

The design space of the system requires:

synchronous invocations for performance,
synchronous exception handling,
asynchronous activations when principals don't trust each other, or for multicore systems, and
asynchronous activations for interrupts.

Since this list is much smaller than the list of control flow abstractions in our system, we should ask if consolidation is possible. This is a slightly different perspective to use when evaluating the question of consolidation (looking at the operations instead of the objects).

We have rightly aggregated synchronous operations in sinvs (Thanks @WenyuanShao!!!), and that API is small, simple, and focused on performance.

Can we aggregate all asynchronous activation operations behind asnd? asnd already takes most of the arguments of thread dispatch, with the notable omission being timeout. This makes sense from the perspective that thread dispatch is similar to asnd with the yield flag passed. There are a few large differences: the missing timeout, the missing tcap to activate, and the fact that asnd sends scheduling events. We can add the timeout to the asnd API, we already have a tcap that is passed in and can be interpreted differently depending on the type of asnd performed, and scheduling events will only be sent if the thread being switched to has suspended itself by calling rcv (not the common case in a preemptive system). If we unified thread dispatch, asnd, and interrupts, the asnd call would have the following parameterizations:

timeout - used to program the timer in all cases, and used for tcap_transfer in the delegation case
tcap - used to determine where time is transferred from in the case of a tcap delegation, and used to determine the tcap to switch to in the case of a dispatch, or nil in either the case where we simply continue executing with the current tcap, or where we auto-switch to the destination rcv thread's tcap.
priority - used to either delegate time with a specific priority, or set the priority of the tcap to use to execute the destination thread.
asnd capability - used to identify the thread/tcap to switch to.
scheduler synchronization token - used to synchronize scheduler and kernel (see above).
yield flag - used to avoid the tcap preemption decision on switching between threads. If 0, then use tcaps to determine if preemption should be performed.
rcv capability - on which the current thread awaits scheduling events, or nil if it doesn't do so.

The core system control flow operations would use the following configurations:

Dispatch - timeout, tcap as destination tcap, priority as the priority to activate that tcap with, use the scheduler token to synchronize with interrupts, yield = 1, and rcv cap of the scheduler thread to read scheduler events.
tcap delegation - timeout = time to transfer, priority = at which priority, tcap = source tcap to transfer time from, yield set to {0, 1}, and rcv = nil, implicitly: switch the active tcap to that of the destination thread.
asynchronous activation - timeout = nil, priority = nil, tcap = nil, yield set to {0, 1}, and rcv = nil, implicitly: switch active tcap to that of the destination thread.
interrupts - tcap = nil, priority = nil, yield = 0, rcv cap = nil, and implicitly switch to the tcap of the destination thread.
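The four configurations can be written down as constructors over a single argument structure. This is only a sketch: `struct asnd_args`, `cap_t`, `CAP_NIL`, and the `asnd_cfg_*` helpers are hypothetical names, and encoding "nil" as 0 for capabilities, timeout, and priority is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CAP_NIL 0u /* assumption: 0 stands in for "nil" */

typedef uint32_t cap_t; /* hypothetical capability identifier */

struct asnd_args {
	cap_t    asnd;    /* identifies the thread/tcap to switch to */
	cap_t    tcap;    /* source or destination tcap, or CAP_NIL */
	cap_t    rcv;     /* rcv cap for scheduler events, or CAP_NIL */
	uint64_t timeout; /* timer value, or time to transfer; 0 = nil */
	uint32_t prio;    /* priority; 0 = nil in this sketch */
	uint32_t tok;     /* scheduler synchronization token */
	int      yield;   /* 1: skip the tcap preemption decision */
};

/* Dispatch: destination tcap, scheduler rcv cap, always yields. */
static struct asnd_args
asnd_cfg_dispatch(cap_t asnd, cap_t dst_tcap, uint32_t prio, uint64_t timeout,
                  cap_t sched_rcv, uint32_t tok)
{
	struct asnd_args a = { .asnd = asnd, .tcap = dst_tcap, .rcv = sched_rcv,
	                       .timeout = timeout, .prio = prio, .tok = tok, .yield = 1 };
	assert(asnd != CAP_NIL);
	return a;
}

/* tcap delegation: timeout carries the amount, tcap is the source. */
static struct asnd_args
asnd_cfg_delegate(cap_t asnd, cap_t src_tcap, uint64_t amount, uint32_t prio, int yield)
{
	struct asnd_args a = { .asnd = asnd, .tcap = src_tcap, .rcv = CAP_NIL,
	                       .timeout = amount, .prio = prio, .yield = yield };
	assert(asnd != CAP_NIL);
	return a;
}

/* Plain asynchronous activation: everything nil except the yield choice. */
static struct asnd_args
asnd_cfg_async(cap_t asnd, int yield)
{
	struct asnd_args a = { .asnd = asnd, .yield = yield };
	assert(asnd != CAP_NIL);
	return a;
}

/* Interrupt: never yields, so the tcap comparison decides preemption. */
static struct asnd_args
asnd_cfg_interrupt(cap_t asnd)
{
	struct asnd_args a = { .asnd = asnd, .yield = 0 };
	assert(asnd != CAP_NIL);
	return a;
}
```

Writing the cases out like this makes it easier to see which fields are overloaded (timeout and tcap) and which combinations never occur.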

I'm sure that there are some configurations of these different variables that make no sense, and these might cause a combinatoric mess in the code. Otherwise, I wonder if looking at the problem this way could simplify the code. We have one main handler for asnd, and it has a lot of if (flags & ASND_TIMEOUT_DELEG), etc, for each of the different ways to use the parameters. Will this cause code unification and simplification?
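One way to keep the combinatorics in check would be to reject meaningless combinations once, at the top of the handler, before any of the per-flag branches run. A minimal sketch follows. Only the name ASND_TIMEOUT_DELEG comes from the text above; the flag values, the other flags, `struct asnd_req`, and the specific set of rejected combinations are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values and names (except the name ASND_TIMEOUT_DELEG). */
#define ASND_TIMEOUT_DELEG (1u << 0) /* timeout is an amount of time to delegate */
#define ASND_YIELD         (1u << 1) /* skip the tcap preemption decision */
#define ASND_SCHED_EVTS    (1u << 2) /* caller reads scheduler events from a rcv cap */

struct asnd_req {
	uint32_t flags;
	int      has_tcap; /* a tcap capability was passed */
	int      has_rcv;  /* a rcv capability was passed */
};

/*
 * Returns -1 for combinations this sketch assumes are meaningless,
 * 0 otherwise.
 */
static int
asnd_req_validate(const struct asnd_req *r)
{
	const uint32_t known = ASND_TIMEOUT_DELEG | ASND_YIELD | ASND_SCHED_EVTS;

	assert(r != 0);
	if (r->flags & ~known) return -1;                              /* unknown bits */
	if ((r->flags & ASND_TIMEOUT_DELEG) && !r->has_tcap) return -1; /* no source tcap */
	if ((r->flags & ASND_TIMEOUT_DELEG) && (r->flags & ASND_SCHED_EVTS)) return -1;
	if ((r->flags & ASND_SCHED_EVTS) && !r->has_rcv) return -1;     /* no rcv to read from */
	if ((r->flags & ASND_SCHED_EVTS) && !(r->flags & ASND_YIELD)) return -1;

	return 0;
}
```

Whether this ends up simpler than the current separate handlers depends on how many such rules there turn out to be.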

If we used asnd capabilities to represent all asynchronous control flow, thd capabilities can take a completely different meaning: the ability to manipulate the thread's state, which is often a higher-security operation (e.g. switching register contents). This would allow us to give fault handlers access to thread capabilities, and schedulers access to asnd caps to switch to threads.

If this were the plan, the asnd_activate would take a thd capability instead of a rcv to hook the asnd up to. We can (as @phanikishoreg suggested) unify the thread and tcap structures in the kernel (they are less than a page, combined), which also simplifies the API (though results in some strange warts like asnd_activate(thread1, thread2) to allow separate selection of thread and tcap). However, this is only at the kernel API level, not at the cos_kernel_api or above where we can differentiate the objects.

The big question is whether this is all worthwhile, or just a useful exploration to explain the control flow operations of the system to others. Is it worth changing the code? This likely hinges on whether it simplifies the system. I don't have an answer to that now.

Some side-effects of doing this that might make kernel code simpler:

asnds are almost just thd caps, but they include two thd caps, the second to address the tcap.
rcvs are just thd capabilities with specific flags set.
current thd caps are now asnds.

Many capabilities simply turn into thd caps, but with flags set according to which operations can be performed on them.
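To illustrate the mapping in the list above, here is a sketch of a single capability representation from which the current capability kinds fall out as presets. All names (`struct unified_cap`, the `CAPOP_*` bits, the `cap_as_*` helpers, and `struct kthd`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAPOP_ACTIVATE (1u << 0) /* asynchronously activate / switch to */
#define CAPOP_RCV      (1u << 1) /* await activations and scheduler events */
#define CAPOP_STATE    (1u << 2) /* read or modify thread state, e.g. registers */

/* Stand-in for the unified thread+tcap kernel object. */
struct kthd { int id; };

struct unified_cap {
	uint32_t     ops;      /* which operations this capability permits */
	struct kthd *thd;      /* thread to operate on */
	struct kthd *tcap_thd; /* second thread whose tcap is used; NULL if unused */
};

/* asnd: activation rights plus a second thread reference to address the tcap. */
static struct unified_cap
cap_as_asnd(struct kthd *t, struct kthd *tcap_thd)
{
	struct unified_cap c = { .ops = CAPOP_ACTIVATE, .thd = t, .tcap_thd = tcap_thd };
	assert(t != NULL);
	return c;
}

/* rcv: a thread capability with only the rcv operation enabled. */
static struct unified_cap
cap_as_rcv(struct kthd *t)
{
	struct unified_cap c = { .ops = CAPOP_RCV, .thd = t, .tcap_thd = NULL };
	assert(t != NULL);
	return c;
}

/* thd (new meaning): state manipulation only, no ability to switch. */
static struct unified_cap
cap_as_thd(struct kthd *t)
{
	struct unified_cap c = { .ops = CAPOP_STATE, .thd = t, .tcap_thd = NULL };
	assert(t != NULL);
	return c;
}
```

Under these assumptions, a fault handler would be handed the result of `cap_as_thd`, while a scheduler would hold the result of `cap_as_asnd`.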

My rumination is this: if we unify the objects, and include flags for each of the operations that differentiate the current objects, does this focus on operations simplify the current code? My feeling is that it likely makes us think about the current code differently, and would make it much more self-documenting (as opposed to what often seem like random operations performed on objects...see rcv).
