jcarbon
is a Java library that provides an application-level view of Intel-Linux system consumption. The view is user-actionable, granting system awareness
to applications.
jcarbon
provides high-granularity data sampled at sub-second periods through a simple interface. jcarbon
provides a collection of physical signals (that come directly from a system component) and virtual signals (which are computed from other signals). Signals are
jcarbon
You can directly use jcarbon.JCarbon
for access to many signals out of the box:
JCarbon jcarbon = new JCarbon();
jcarbon.start();
fib(42);
JCarbonReport report = jcarbon.stop();
List<ProcessEnergy> energy = report.getSignal(ProcessEnergy.class);
System.out.println(
String.format(
"Consumed %.6f joules",
energy.stream()
.mapToDouble(proc -> proc.data().stream().mapToDouble(e -> e.energy).sum())
.sum()));
More information about the physical signals can be found in the linked documentation. Here, we give a short overview of the virtual signals.
process activity
process energy
calmness
process activity
: Jiffies AttributionUnix-based operating systems report cpu cycle usage with a unit called jiffies. A jiffy refers to the time between clock ticks. The jiffies for each executing cpu can be found in /proc/stat
, while those for a process and its children tasks can be found in /proc/<process id>/task/<task id>/stat
. Please refer to https://man7.org/linux/man-pages/man5/proc.5.html for more information regarding the jiffies system.
Since these values represent useful work, we can compute task activity equal to the ratio of the task's jiffies and the jiffies of the task's executing cpu.
$$A{task} = J{task} / J_{cpu} $$
This process is a little tricky due to misalignment in the jiffies updating. Updating of jiffies is done on a specific interval (usually ten milliseconds), but this is done concurrently for each reported component. So while cpus 1
and 2
, and our process's tasks A
and B
may all update every ten milliseconds, when this occurs is likely different. As a result, we may end up with a situation where a task's (or multiple tasks') jiffies exceeds the executing cpu's jiffies.
We need to carefully align and normalize the signals so we don't have more than 100% attributed to the process. The final attribution computation is:
$$A{task} = J{task} / max(1, J{cpu}, \sum{task}{J{task} ,\ where \ cpu{task} = cpu}) $$
process energy
: Energy VirtualizationWe can extend the process activity method described above by combining it with an energy source
, such as rapl
or powercap
. We could reuse the simple computation above, however, there is a corner case. Power systems do not report by logical cpu, but instead by a physical device. In the case of rapl
, this will be a cpu socket, which contains some number of executing dice. Thus, we must do another normalization and aggregate all tasks onto the single physical device. This mapping can be produce from the details of /proc/cpuinfo
through the processer
and physical id
fields. :
$$E{task} = E{socket} * A{task} / max(1, \sum{task}{A{task} ,\ where \ cpu{task} \in socket}) $$
Calmness is a comparative metric that describes similarity between the power state of two logically identical runtime. Given a program, any amount of additional work, such as profiling, will impact the runtime. In performance profiling, we typically focus on overhead or memory footprints to determine an optimal profiling state. Unfortunately, energy characterization is a little more sensitive due to local variance. Therefore, it is critical to find an optimal spot to not disturb the runtime.
To track power state, we can watch cpufreq
as power state is typically correlated with executing frequency. We can then define a spatial and a temporal correspondence by anonymizing the cpus into frequency distributions. We choose Freedman-Diaconis to bucket data. The correspondence can be computed from the fraction of observation per bin.
$$ \mathcal{K} = 2 \frac{IQR(f)}{\sqrt[3]{|cpu| |time|}} $$
$$ f_k = \frac{k}{\mathcal{K}} max(f) + (1 - \frac{k}{\mathcal{K}}) min(f), where \ k \in {0..\mathcal{K}} $$
$$ \mathcal{C}_{T}(t, k) = \frac{|f'|}{|f_t|} , where \ f' \ \in f_k < f_t \ and \ f_t \leq f_k + \frac{k}{\mathcal{K}} $$
$$ \mathcal{C}_{S}(k) = \frac{\sum_t|f'|}{|f|}, where \ f' \ \in f_k \leq f_t \ and \ f_t \leq f_k + \frac{k}{\mathcal{K}} $$
Comparing two traces can be done with any distance metric. Typically pcc is a good choice since it measures covariance.
Reports can be written to disk as a json
. The report has a simple nested record structure:
{
"SignalClass": [
{
"start": 1,
"end": 2,
"data": [
{"id": 0, "value": 2.73},
{"id": 1, "value": 3.14},
],
}
]
}
Building the core jCarbon
artifact from source is done with either bazel
or maven
. Although most of the components are implemented in pure Java, we still support a legacy artifact that enables low-level access to the rapl
subsystem called jRAPL
. While modern implementations typically prefer powercap
, we think it is useful to keep direct access to rapl
available when other solutions are not. However, jRAPL
is implemented in C, so it requires the JNI to be used. Below are build steps to get jRAPL
to work on your system if necessary. In most cases, we recommend relying on powercap
instead.
jRAPL
To use jRAPL
you will need an Intel Linux system.
RAPL
read through is through [msr
s](). You'll need to run sudo modprobe msr
to enable the msr
s.
You need a verion of Java
, the jni
, and maven
or bazel
to build and run the tool. You might be able to deal with all that using the following:
sudo apt install openjdk-11-jdk bazel maven libjna-jni
You can confirm if jRAPL
works on your system by running bash smoke_test.sh
. This should output a short report detailing what jRAPL
is able to find for your system. Example log:
jcarbon (2024-03-18 18:34:23 PM EDT) [main]: warming up...
jcarbon (2024-03-18 18:34:30 PM EDT) [main]: testing rapl...
jcarbon (2024-03-18 18:34:31 PM EDT) [main]: rapl report
- microarchitecture: BROADWELL2
- elapsed time: 1.000430s
- socket: 1, package: 32.429688J, dram: 2.742676J, core: 0.000000J, gpu: 0.000000J
- socket: 2, package: 29.896485J, dram: 1.250000J, core: 0.000000J, gpu: 0.000000J
jcarbon (2024-03-18 18:34:33 PM EDT) [main]: powercap report
- elapsed time: 1.443513s
- socket: 1, package: 32.310403J, dram: 2.609461J
- socket: 2, package: 29.738266J, dram: 1.323297J
jcarbon (2024-03-18 18:34:34 PM EDT) [main]: equivalence report
- elapsed time difference: 0.425827s
- socket: 1, package difference: 0.021444J, dram difference: 0.048101J
- socket: 2, package difference: 0.018143J, dram difference: 0.027690J
jcarbon (2024-03-18 18:34:34 PM EDT) [main]: all smoke tests passed!
A simple UML describing jRAPL
's layout is provided here.
Here are the features we want to try re-integrate into this project: