gwsystems / composite

A component-based OS
composite.seas.gwu.edu

Consolidate threads, tcaps, and rcv end-points #257

Open gparmer opened 7 years ago

gparmer commented 7 years ago

This is a proposal to figure out a sane way to consolidate three separate kernel objects together.

Background

Currently, we have three completely separate kernel objects:

rcv end-points must be associated with a specific thread and tcap. Each thread can be associated with a single rcv end-point, and a tcap can be associated with multiple rcv end-points. Each tcap must be associated with at least one rcv end-point and a thread. Threads can exist completely unrelated to rcv end-points. rcv end-points are related to each other in a tree that depicts the scheduling hierarchy.
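To make the associations above concrete, here is a minimal sketch in C. The struct and function names (`thd_obj`, `tcap_obj`, `rcv_obj`, `rcv_assoc_valid`) are hypothetical and are not the kernel's actual definitions; the sketch only encodes the constraints stated in this paragraph.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the three kernel objects. */
struct rcv_obj;

struct thd_obj {
	struct rcv_obj *rcv;       /* at most one rcv end-point; NULL for a plain thread */
};

struct tcap_obj {
	struct thd_obj *thd;       /* a tcap must be associated with a thread */
	unsigned int    rcv_refs;  /* rcv end-points sharing this tcap (at least 1) */
};

struct rcv_obj {
	struct thd_obj  *thd;      /* required */
	struct tcap_obj *tcap;     /* required */
	struct rcv_obj  *parent;   /* scheduling hierarchy; NULL at the root */
};

/* Returns 1 if the rcv end-point satisfies the associations described above. */
static int
rcv_assoc_valid(const struct rcv_obj *r)
{
	if (!r || !r->thd || !r->tcap) return 0;
	if (r->thd->rcv != r)          return 0; /* thread and rcv are 1:1 */
	if (!r->tcap->thd)             return 0; /* tcap needs a thread */
	if (r->tcap->rcv_refs < 1)     return 0; /* tcap needs at least one rcv */
	return 1;
}
```

Even in this reduced form, three structures carry four cross-references that must be kept consistent, which is the bookkeeping the proposal aims to reduce.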

@phanikishoreg @ryuxin @hungry-foolish @RobertGiff @lab176

Problems

Proposal

If it isn't obvious by now: figure out some way to consolidate some or all of these kernel objects. The expected benefits would be lower system complexity and fewer system resources, making the system easier to manage, maintain, and explain. I'm sure that there are many options here, but I'll spell out a few that might be reasonable.

Option 1: Consolidate rcv end-points and tcaps

As rcv end-points require a tcap and vice-versa, why not consolidate them together? Make the rcv end-point include the tcap page as well. Since the rcv capability will point to a tcap, activating it can either provide another rcv end-point whose tcap should be used (refcnted), or we can provide kernel memory to allocate the new tcap for it.
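One way to read Option 1 is sketched below in C, under the assumption that the tcap is stored inline in the rcv end-point and that sharing is tracked with a reference count. All names (`rcv_ep`, `tcap_inl`, `rcv_ep_activate`, `rcv_ep_tcap`) are hypothetical; this is not the existing activation code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inline tcap: lives inside the rcv end-point that owns it. */
struct tcap_inl {
	unsigned int refcnt;  /* rcv end-points currently using this tcap */
	long         budget;
};

struct rcv_ep {
	struct tcap_inl own_tcap;  /* storage provided with the rcv end-point */
	struct rcv_ep  *tcap_src;  /* end-point whose tcap is used (self if own) */
};

/*
 * Activate r. If share_with is non-NULL, r uses (and takes a reference on)
 * the tcap that share_with uses; otherwise r uses its own inline tcap.
 * share_with must already be activated.
 */
static int
rcv_ep_activate(struct rcv_ep *r, struct rcv_ep *share_with)
{
	if (!r) return -1;
	r->own_tcap.refcnt = 0;
	r->own_tcap.budget = 0;
	if (share_with) {
		struct rcv_ep *src = share_with->tcap_src;

		if (!src) return -1;
		src->own_tcap.refcnt++;
		r->tcap_src = src;
	} else {
		r->own_tcap.refcnt = 1;
		r->tcap_src        = r;
	}
	return 0;
}

/* The tcap that r is charged against. */
static struct tcap_inl *
rcv_ep_tcap(struct rcv_ep *r)
{
	return &r->tcap_src->own_tcap;
}
```

In this sketch an end-point that shares another's tcap leaves its own inline storage unused, which is one form of the internal fragmentation discussed under this option.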

Benefits. When referencing a thread capability in the API, it is unambiguous that we are performing a switch, as they are isolated from the consolidation.

Downsides. Still waste memory due to internal fragmentation for threads and tcaps. Still have inter-kernel object references that we need to track (rcv end-point to thread).

Option 2: Consolidate into only rcv end-points, and threads

This takes Option 1 further by making the rcv end-points include both the thread and the tcap. Threads can also be created separately. This API looks very similar to what the cos_defkernel_api provides. The kernel API enables operations on rcv end-points and on threads, and consolidates the tcap APIs into the new rcv API. I'd likely think of these as "threads" and either "thread end-points", or "receive threads".

Alternatively, rcv end-points can be kept completely separate, and threads just consolidate with tcaps. It doesn't really simplify any API, as creating a rcv still requires us to pass in a thread capability and another thread for the tcap.

Benefits. rcv end-points are now memory-efficient as they include a single kernel allocation for both the thread and the tcap. The thread capability still has only a single primary function to be performed on it.

Downsides. It is a little awkward to have executable threads referenced through two different types of capabilities. I'm not sure what the edge-cases around this are.

Option 3: Consolidate them all together into threads

Get rid of rcv and tcap kernel resources, and keep only threads. Inline the tcaps into the thread structure. The entire rcv and tcap APIs are integrated into the thread API. rcv end-points that share tcaps will look like threads with dependencies on each other. Scheduling hierarchy rcv relationships also look like dependencies between threads.

Benefits. From 3 to 1 concepts; the simplest option from a kernel object perspective.

Downsides. We're cramming quite a few concepts into the concept of a thread. The API gets a little confusing as everything looks like dependencies between threads. Sometimes simplicity hides complexity in the client-facing API. asnd end-points are hooked into threads, which is somewhat awkward.

gparmer commented 7 years ago

An important way to look at this problem is with respect to access rights. For example, for Option 3, if a component has a thread capability, then it would (without some logic added) be able to both call rcv and switch to the thread. It is certainly not the intention that any component that is handling interrupts will also be able to schedule the thread. So if we go in the direction of Option 3, then each of the thread capabilities will also need access permissions -- which actions can be taken on the resource.
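As a rough illustration of per-capability access permissions, the C sketch below attaches a bitmask of allowed operations to a thread capability. The flag names and the `thd_cap` type are assumptions made up for this example, not part of the current API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation bits for a consolidated thread capability. */
#define THD_OP_SWITCH (1u << 0)  /* dispatch (schedule) the thread */
#define THD_OP_RCV    (1u << 1)  /* block on / receive notifications */

struct thd_cap {
	unsigned int allowed;  /* bitwise OR of THD_OP_* */
};

/* Returns 1 only if every requested operation bit is permitted. */
static int
thd_cap_permits(const struct thd_cap *c, unsigned int ops)
{
	return c != NULL && (c->allowed & ops) == ops;
}
```

With this shape, an interrupt-handling component could hold a capability with only `THD_OP_RCV` set, while the scheduler holds one with `THD_OP_SWITCH`.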

hungry-foolish commented 7 years ago

I think the first option is the best. The second option allows threads to be created separately, which may introduce two different ways of thread creation, which is a downside. The third option is pretty awkward, as when you create an arcv endpoint, you create a tcap and a thread. If we need multiple arcv endpoints and just one thread, this is pretty strange. Of course, this is my subjective idea.

phanikishoreg commented 7 years ago

The option I'm thinking of:

Consolidating kernel objects into one but still having three different capabilities. We'd have:

struct need_a_name { /* like the cos_defkernel_api's struct cos_aep_info */
    struct thread {
    };
    struct rcv {
    };
    struct tcap {
    };
};

Each capability still has a pointer to the corresponding object in this layout. cos_thd_alloc creates the thread object and does not touch the rest. cos_tcap_alloc really should not be a system call; instead, it should be a call that allocates a capability slot at user level (or we can get rid of this call!). cos_arcv_alloc initializes the rcv cap and tcap, along with their kernel objects that are inline with the thread object. Note: neither cos_arcv_alloc nor cos_tcap_alloc will pass in a memory slot for the kernel data structure. We'd just use the thread capability (because thread and rcv caps have a 1:1 association) and use the rcv and tcap from that object. (Of course, we'll have checks to make sure that any of these objects is recreated only after it has been deleted, perhaps through a dirty flag in each of these structs.)

In many cases we'll have normal threads that are not associated with a rcv or tcap; in such cases the struct rcv and struct tcap will remain unused or reserved. This is very much analogous to the current thread object, where anything beyond the size of struct thread is unused in that PAGE.

Pros:

Cons:

This mainly focuses on efficient memory usage and performance, definitely not on consolidating the concepts of thread, rcv and tcap. I think we should not consolidate the concepts into just a thread capability, for the reasons mentioned in the downsides of option 3.

This is somewhat like option 3 except the capabilities are not consolidated. We'd do this only if all three objects fit into 1 PAGE of course.
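A minimal C sketch of the layout described in this comment follows. The field contents, the 4096-byte page size, the `in_use` flags (standing in for the "dirty flag" idea), and the `sk_`-prefixed functions are all assumptions for illustration; they are not the real cos_thd_alloc or cos_arcv_alloc.

```c
#include <assert.h>

#define SK_PAGE_SIZE 4096  /* assumed page size for this sketch */

/* Hypothetical object bodies; the real structures hold far more state. */
struct sk_thread { int in_use; unsigned long regs[32]; };
struct sk_rcv    { int in_use; unsigned int pending; };
struct sk_tcap   { int in_use; long budget; };

/* One kernel allocation holding all three objects inline. */
struct sk_aep_page {
	struct sk_thread thd;
	struct sk_rcv    rcv;
	struct sk_tcap   tcap;
};

/* The consolidation only makes sense if everything fits in one page. */
_Static_assert(sizeof(struct sk_aep_page) <= SK_PAGE_SIZE,
               "thread + rcv + tcap must fit in one page");

/* Creates only the thread; rcv and tcap are left untouched (reserved). */
static int
sk_thd_alloc(struct sk_aep_page *p)
{
	if (p->thd.in_use) return -1;
	p->thd.in_use = 1;
	return 0;
}

/* Initializes the rcv and tcap that sit inline with an existing thread. */
static int
sk_arcv_alloc(struct sk_aep_page *p)
{
	if (!p->thd.in_use)                  return -1; /* needs its thread */
	if (p->rcv.in_use || p->tcap.in_use) return -1; /* no re-creation before deletion */
	p->rcv.in_use  = 1;
	p->rcv.pending = 0;
	p->tcap.in_use = 1;
	p->tcap.budget = 0;
	return 0;
}

/* Deletes the rcv and tcap so that they may be created again later. */
static void
sk_arcv_free(struct sk_aep_page *p)
{
	p->rcv.in_use  = 0;
	p->tcap.in_use = 0;
}
```

The sketch assumes the page is zeroed before `sk_thd_alloc` is called, so that the reserved rcv and tcap regions start out marked as unused.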

gparmer commented 6 years ago

I've been thinking about this a lot recently, but from a different direction: How can we consolidate the control-flow operations of the kernel into the smallest number of orthogonal abstractions, and do we want to do this? I need to write this all down, so that I can focus on other things. As @phanikishoreg pointed out, we might consolidate kernel structures, but not user-level abstractions.

What control-flow APIs do we have currently, and what are their properties?

(@phanikishoreg Did I miss anything? Did I mis-characterize anything?)

The design space of the system requires:

  1. synchronous invocations for performance,
  2. synchronous exception handling,
  3. asynchronous activations when principals don't trust each other, or for multicore systems, and
  4. asynchronous activations for interrupts.

Since this list is much smaller than the list of control flow abstractions in our system, we should ask if consolidation is possible. This is a slightly different perspective to use when evaluating the question of consolidation (looking at the operations instead of the objects).

We have rightly aggregated synchronous operations in sinvs (Thanks @WenyuanShao!!!), and that API is small, simple, and focused on performance.

Can we aggregate all asynchronous activation operations behind asnd? asnd already takes most of the arguments of thread dispatch, with the notable omission being the timeout. This makes sense from the perspective that thread dispatch is similar to asnd with the yield flag passed. There are a few large differences: the missing timeout, the missing tcap to activate, and the fact that asnd sends scheduling events. We can add the timeout to the asnd API; we already have a tcap that is passed in and that can be interpreted differently depending on the type of asnd performed; and scheduling events will only be sent if the thread being switched to has suspended itself by calling rcv (not the common case in a preemptive system). If we unified thread dispatch, asnd, and interrupts, the asnd call would have the following parameterizations:

The core system control flow operations would use the following configurations:

I'm sure that there are some configurations of these different variables that make no sense, and these might cause a combinatoric mess in the code. Otherwise, I wonder if looking at the problem this way could simplify the code. We have one main handler for asnd, and it has a lot of if (flags & ASND_TIMEOUT_DELEG), etc, for each of the different ways to use the parameters. Will this cause code unification and simplification?
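To get a feel for what a single flag-driven handler might look like, here is a small C sketch. It is not the parameterization list referred to above: the flags (`SK_ASND_YIELD`, `SK_ASND_TIMEOUT`), the types, and the decision logic are hypothetical and encode only the behaviors mentioned in this comment (yield acts like dispatch, an optional timeout, and scheduling events only for threads suspended in rcv).

```c
#include <assert.h>

/* Hypothetical flags for a unified asnd call. */
#define SK_ASND_YIELD   (1u << 0)  /* switch to the target now (dispatch-like) */
#define SK_ASND_TIMEOUT (1u << 1)  /* the timeout argument is meaningful */

struct sk_target {
	int blocked_in_rcv;  /* target suspended itself by calling rcv */
};

/* What the handler decided to do; a stand-in for real side effects. */
struct sk_asnd_result {
	int switch_now;
	int timer_set;
	int sched_event;
};

static struct sk_asnd_result
sk_asnd(const struct sk_target *t, unsigned int flags, unsigned long timeout)
{
	struct sk_asnd_result r = { 0, 0, 0 };

	if (flags & SK_ASND_YIELD) r.switch_now = 1;
	if ((flags & SK_ASND_TIMEOUT) && timeout != 0) r.timer_set = 1;
	/* Scheduling events are generated only for threads suspended in rcv. */
	if (t->blocked_in_rcv) r.sched_event = 1;

	return r;
}
```

Each additional flag adds one more branch here, which is where the concern about nonsensical flag combinations would show up in practice.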

If we used asnd capabilities to represent all asynchronous control flow, thd capabilities can take a completely different meaning: the ability to manipulate the thread's state, which is often a higher-security operation (e.g. switching register contents). This would allow us to give fault handlers access to thread capabilities, and schedulers access to asnd caps to switch to threads.

If this were the plan, the asnd_activate would take a thd capability instead of a rcv to hook the asnd up to. We can (as @phanikishoreg suggested) unify the thread and tcap structures in the kernel (they are less than a page, combined), which also simplifies the API (though results in some strange warts like asnd_activate(thread1, thread2) to allow separate selection of thread and tcap). However, this is only at the kernel API level, not at the cos_kernel_api or above where we can differentiate the objects.

The big question is whether this is all worthwhile, or just a useful exploration to explain the control flow operations of the system to others. Is it worth changing the code? This likely hinges on whether it simplifies the system. I don't have an answer to that now.

gparmer commented 6 years ago

Some side-effects of doing this that might make kernel code simpler:

Many capabilities simply turn into thd caps, but with flags set according to which operations can be performed on them.

My rumination is this: If we unify the objects, and include flags for each of the operations that differentiate the current objects, does this focus on operations simplify the current code? My feeling is that it likely makes us think about the current code differently, and would make it much more self-documenting (as opposed to what often seem like random operations performed on objects...see rcv).