Open sanchda opened 2 weeks ago
Benchmark execution time: 2024-11-12 16:14:07
Comparing candidate commit dac1014b in PR branch sanchda/add_faultinfo with baseline commit 873ea858 in branch main.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 51 metrics, 2 unstable metrics.
Attention: Patch coverage is 54.61538% with 59 lines in your changes missing coverage. Please review.
Project coverage is 71.20%. Comparing base (873ea85) to head (dac1014).
This should be synchronized with RFC5.
What does this PR do?
This adds some more siginfo_t context to crashes. In addition to the SIGBUS codes, I added some logic to make sense of things like SIGSYS, since I would like to add support for that soon.
One possible point of contention that I was unable to resolve: currently, we emit a faulting_address, which is a representation of the si_addr from the siginfo_t, only when the fault is a segfault. This makes a lot of sense in the current product, which does no special normalization in the telemetry backend. However, from a data transmission perspective, I think it makes more sense to represent the struct as it is written and defer to the backend for normalization. Moreover, si_addr has context outside of SIGSEGV. This decision duplicates an address for now. Oh well.
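As a rough illustration of why si_addr has context outside of SIGSEGV: POSIX defines si_addr for SIGILL and SIGFPE (address of the faulting instruction) and for SIGSEGV and SIGBUS (address of the faulting memory reference), and Linux additionally fills it in for SIGTRAP. The sketch below encodes that set. The function name signal_carries_fault_address is hypothetical and not part of this PR, and the check ignores the case where such a signal is sent by another process (for example via kill), in which case si_addr carries no fault address.

```c
#include <signal.h>

/*
 * Hypothetical helper (not from this PR): returns 1 if the signal is one
 * for which si_addr is filled in on a hardware-generated fault, else 0.
 * Caveat: does not inspect si_code, so a user-sent SIGSEGV also returns 1.
 */
int signal_carries_fault_address(int signo) {
    switch (signo) {
    case SIGSEGV: /* address of the faulting memory reference */
    case SIGBUS:  /* address of the faulting memory reference */
    case SIGILL:  /* address of the faulting instruction */
    case SIGFPE:  /* address of the faulting instruction */
    case SIGTRAP: /* filled in on Linux */
        return 1;
    default:
        return 0;
    }
}
```

A backend that receives the struct as written could apply a check like this during normalization, rather than having the crash handler drop the field up front.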
Motivation
While exploring the behavior of crashtracking for system tests, and as part of my experiments in other branches, I started relying on strace's comprehension for siginfo_t. We also had a recent bug that could really only have been comprehended with content from siginfo_t (stack overflow by hitting the crashtracking guard page). I think this information is really important.