Reviewer Name: Pat Cody Review Type: Critical
Keeping mixed-criticality systems secure when running on a single multicore device is challenging due to the complexities of inter-core coordination. A frequent problem is interference: lower-assurance code generates too many interrupts and degrades the performance of the high-assurance code. However, running each assurance level in its own VM has overhead costs high enough to make that approach impractical.
CHAOS provides a separate runtime environment with the bare necessities (simple bear necessities) to devirtualize high-assurance/high-criticality code, such that it is not tied to the execution of the lower-assurance code. Instead, the two runtimes communicate via proxies. If the code running in ChaosRT (the high-assurance code) needs data available from the low-assurance code, it can make a request using the proxies. Chaos was implemented within NASA's Core Flight System (cFS), and it relies on a handful of key features provided by the Composite OS.
Reviewer: Sam Frey Review Type: Skim
Problem To reduce size and power consumption, many embedded systems requiring multiple execution streams have transitioned away from multiple single-core processors in favor of a single multi-core processor. Having a multi-core processor adds complexity for systems that run software with differing levels of criticality. Interrupts from low criticality software can interfere with high criticality software.
Contributions Chaos lowers interference by removing high criticality software from the primary subsystem and allowing it to run in a minimal ChaosRT environment without interruption. Chaos improved processing latency by a factor of 2.7 compared to a standard Linux real-time environment while also improving isolation of critical code.
Reviewer: Jacob Cannizzaro Type: Comprehend/Skim
Problem:
With the trend of switching from multiple single-core processors to a single multicore processor, less critical tasks can cause a lot of interference by taking up processing time, and contention for shared resources increases. This adds a lot of overhead when running all of this code in one place, regardless of assurance level.
Main Contributions:
This paper uses Chaos to devirtualize some tasks. By taking highly critical tasks and running them in a minimal ChaosRT environment, the VMs incur less overhead, and communication is handled by proxies. Communication now flows from lower-priority tasks to the higher-functionality VM through proxies that are able to bound interference and latency. This reduces the worst-case latency of the system by a factor of 3.5 compared to the Linux equivalent.
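To make the idea of a proxy that bounds interference concrete, here is a minimal sketch of a budgeted proxy. It is only an illustration and not the mechanism from the paper: the class `RateLimitedProxy`, its `budget` and `period` parameters, and the abstract integer time units are all assumptions made for this example.

```python
from collections import deque


class RateLimitedProxy:
    """Hypothetical proxy: forwards at most `budget` requests per period."""

    def __init__(self, budget, period):
        self.budget = budget        # max requests delivered per period (assumed)
        self.period = period        # period length in abstract time units (assumed)
        self.window_start = 0       # start time of the current period
        self.used = 0               # budget consumed in the current period
        self.pending = deque()      # requests deferred to a later period
        self.delivered = []         # (time, request) pairs handed to the receiver

    def _roll_window(self, now):
        # Replenish the budget once the current period has elapsed.
        elapsed = now - self.window_start
        if elapsed >= self.period:
            self.window_start = now - elapsed % self.period
            self.used = 0

    def submit(self, now, request):
        """Called by the low-criticality side; queues and never blocks."""
        self.pending.append(request)
        self.drain(now)

    def drain(self, now):
        """Deliver pending requests while budget remains in this period."""
        self._roll_window(now)
        while self.pending and self.used < self.budget:
            self.delivered.append((now, self.pending.popleft()))
            self.used += 1


proxy = RateLimitedProxy(budget=2, period=10)
for i in range(5):
    proxy.submit(0, i)      # a burst of five requests at time 0
print(len(proxy.delivered), len(proxy.pending))
```

In this sketch a burst from the low-criticality side is not dropped. Requests beyond the budget are deferred, so the receiver sees at most `budget` deliveries per period no matter how aggressively the sender behaves.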
Reviewer: Andrew Nguyen Review Type: Skim
Problem There is a need to minimize the size and power of embedded systems. To achieve this, many systems move to a single multi-core processor, which makes it difficult to coordinate high- and low-criticality tasks. The paper introduces Chaos, which devirtualizes high-criticality tasks.
Contributions As mentioned above, Chaos reduces interference through devirtualization. High-assurance, high-criticality tasks are isolated and moved into the ChaosRT environment. Rate-limiting servers are used as well. The paper then covers the design, the interference scenarios, synchronous communication, and the implementation.
Questions
Reviewer: Henry Jaensch Review Type: Skim
Embedded systems are moving toward using one chip for all of the tasks on the system. This paper proposes a way to maintain criticality and high resource efficiency when mixing many components and functions on the same processor. The priority here is to maintain efficient feature rich user applications while also providing isolation guarantees for high criticality processes.
The paper introduces CHAOS, a system for devirtualizing high-criticality tasks so that deadlines can be met without interference from lower-criticality applications. This is achieved by providing CHAOS RT, a bare-bones runtime that allows predictable execution of tasks. To preserve feature richness, proxies are used to support communication between tasks of mixed criticality.
What's the difference between bounded asynchronous communication and synchronous communication?
Why was Linux one of the choices for comparison here? Mixed assurance and criticality yes, but Linux doesn't make any real-time guarantees.
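On the first question, one general way to picture the distinction (not the paper's own definitions) is that a synchronous request makes the caller wait for the reply, while bounded asynchronous communication lets the sender continue immediately and caps how many messages can be outstanding. The sketch below uses Python's standard `queue` module. The names `synchronous_request` and `BoundedMailbox` are hypothetical and exist only for this illustration.

```python
import queue


def synchronous_request(server, payload):
    """Synchronous: the caller makes no progress until the server returns."""
    return server(payload)


class BoundedMailbox:
    """Bounded asynchronous: senders never wait, and capacity is fixed."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)

    def try_send(self, msg):
        # Returns False instead of blocking when the mailbox is full.
        try:
            self._q.put_nowait(msg)
            return True
        except queue.Full:
            return False

    def try_receive(self):
        # Returns None instead of blocking when the mailbox is empty.
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


mailbox = BoundedMailbox(capacity=2)
results = [mailbox.try_send(m) for m in ("a", "b", "c")]
print(results)
```

The design point is that with the bounded mailbox neither side's timing depends on the other: a slow or misbehaving peer can at worst fill the mailbox, which the sender observes as a failed send.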
Reviewer: Marcus Young Review Type: Skim
Problem Modern embedded systems are increasingly using single multi-core processors that are asked to process extremely complex tasks with different criticality levels. Since these systems are using a single multi-core processor instead of multiple single-core processors, there is a need to extract high criticality tasks to run them in a minimal runtime environment in order to improve human or equipment safety.
Contributions Chaos removes high criticality software from the primary subsystem and puts it in a ChaosRT minimal runtime environment. Chaos improved processing latency for a sensor/actuation loop in satellite software experiencing inter-core interference by a factor of 2.7 while reducing worst-case latency by a factor of 3.5 over a real-time Linux variant.
Reviewer: Rachell Kim Review Type: Skim
Problem Being Solved:
Embedded systems using multi-core processors to support mixed-criticality and multi-assurance levels often face difficulty in enforcing strict isolation between subsystems. Because shared abstractions between cores may trigger interference, it is important to protect high-criticality tasks from faults caused by subsystems of low-assurance and low-criticality. Moreover, systems must also maintain high-confidence in correctness while supporting feature-rich software, and this condition is considered to be difficult to maintain with current technology.
Main Contributions:
The authors of this paper propose a system called Chaos, which aims to remove interference caused by inter-core coordination in multi-core systems via devirtualization of high-criticality tasks. High-criticality tasks are moved into an execution environment called ChaosRT, thereby allowing predictable execution with minimal interference from shared subsystems. This paper also outlines example situations in which shared memory and inter-core coordination may impact the execution of high-criticality code.
Questions:
Reviewer Name: Becky Shanley Review Type: Critical
Embedded systems struggle to balance minimizing SWaP (Size, Weight, and Power) against providing time guarantees and resources to the highest-priority processes. In embedded systems, this problem is much more severe because failures of high-priority tasks can have detrimental impacts on the physical world and the safety of humans. It's a difficult problem to solve because the high-priority tasks must work with the lower-priority tasks to provide many real-time functionalities, so the two cannot be completely separated.
CHAOS is a devirtualization system that guarantees high-priority tasks the resources and low latency they need. It achieves this by extracting high-priority tasks into ChaosRT, a real-time environment that is separated from the interference of potentially low-assurance tasks. In all, this paper contributes to the problem domain by:
Reviewer: Sean McBride
Review Type: Simple Skim
How can one consolidate mixed criticality workloads onto shared multi-core systems? Also, how can one leverage the more differentiated QOS attributes of a modern RTOS to provide better assurance guarantees to industry-standard software systems that run mixed criticality subsystems on a shared POSIX backend?
Reviewer Name: Eric Wendt Review Type: Critical
Problem Being Solved: Some of the fundamental problems that need addressing for IoT devices are size, weight, and power. Finding a good balance among these requirements is exceedingly difficult and combines techniques from both hardware and software. This paper dives into software solutions that focus on cutting down interference between high-criticality and lower-criticality tasks.
Main Contributions Fortunately, many of the main contributions are laid out in a distinct sub-header in the paper.
Questions:
Critiques:
Sorry for the late post, lost power for a few hours.
Embedded systems are increasingly required to run many different processes at varying degrees of criticality. High-criticality systems need to run at a high priority to protect human or equipment safety, whereas lower-criticality systems are nice to have, but not as important. Due to resource constraints on IoT devices, these processes have to be scheduled by the same processor, and the underlying hardware overhead for deciding which processes run can cause interference with high priority tasks.
The paper introduces ChaosRT, a minimal runtime environment that removes high-criticality tasks from the management system of the VM they would normally run on and thus minimizes or eliminates interference from lower-priority tasks. Tasks that are removed from the VM can still communicate with the rest of the system through proxies, which handle communication between the devirtualized, higher-criticality tasks and everything else.
1) If a devirtualized system required sensor readings or some other data that was gathered in a lower priority task, wouldn't it still have to wait for that task to complete and for the proxy to get the information?
2) Why does devirtualization work so well for this? I'm confused as to how this reduces interference from other tasks.
3) What happens if there is more IPI interference than allowed messages for a certain task?
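Question 3 can be framed as arithmetic if one assumes, purely for illustration, a scheme in which at most `budget` notifications are delivered per `period` and each delivery costs the receiver at most `cost` time units. Under that assumption, notifications beyond the budget never reach the task within the period, and the interference seen in any time window has a simple upper bound. The function name and all parameters below are hypothetical and are not taken from the paper.

```python
def interference_bound(window, period, budget, cost):
    """Upper bound on interference inside any interval of length `window`.

    Assumes a budget replenished at fixed period boundaries. An interval of
    length `window` can overlap at most ceil(window / period) + 1 periods,
    and each period contributes at most budget * cost.
    """
    if window <= 0 or period <= 0:
        raise ValueError("window and period must be positive")
    periods_overlapped = -(-window // period) + 1   # integer ceil, plus one
    return periods_overlapped * budget * cost


# A window exactly one period long can straddle two periods.
print(interference_bound(window=10, period=10, budget=2, cost=3))
```

The bound is deliberately conservative: the extra period accounts for a window that begins near the end of one period and ends just after the start of another.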
1) The example of the NASA cFS system they used to explain the problems with virtualization was really helpful. I would've liked to see a similar example using ChaosRT when they talked about the implementation.
2) What security concerns are there with the proxy? Can it be spoofed to send bad data to a safety critical system?
Reviewer: Huachuan Wang Review Type: Skim
Problem being solved
Embedded systems are increasingly required to provide both complicated feature sets and high confidence in the correctness of mission-critical computations. Functionality that was traditionally performed separately is being consolidated onto less expensive and more capable commodity multi-core processors, which makes these systems very complicated. Supporting both feature-rich, general computation and high-confidence physical control is difficult.
Contributions
This paper presents Chaos, which makes effective use of the increased throughput of multi-core machines while ensuring the necessary isolation between tasks of different criticalities and assurance levels. Chaos also devirtualizes high-criticality tasks to remove the overhead.
paper link
Have fun :)