Why do we need yet another *cov?

whentojump commented 1 month ago

Reason 1: Source based

Example 1

https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/gpu/drm/drm_buddy.c#L114

GCC gcov report:

        -:  105:static struct drm_buddy_block *
        -:  106:__get_buddy(struct drm_buddy_block *block)
        -:  107:{
        -:  108:    struct drm_buddy_block *parent;
        -:  109:
    #####:  110:    parent = block->parent;
    #####:  111:    if (!parent)
        -:  112:        return NULL;
        -:  113:
   12568*:  114:    if (parent->left == block)
branch  0 never executed (fallthrough)
branch  1 never executed
branch  2 never executed (fallthrough)
branch  3 never executed
branch  4 never executed (fallthrough)
branch  5 never executed
branch  6 taken 9 (fallthrough)
branch  7 taken 1037
branch  8 taken 5226 (fallthrough)
branch  9 taken 6296
    5235*:  115:        return parent->right;
        -:  116:
        -:  117:    return parent->left;
        -:  118:}

Confusing part(s):

Statement coverage: covered lines (L114) can be found after uncovered lines (L110)
Branch coverage: a simple if statement (L114) is reported to have 10 branches instead of 2

Clang Source-based Code Coverage report:

  105|       |static struct drm_buddy_block *
  106|       |__get_buddy(struct drm_buddy_block *block)
  107|  12.6k|{
  108|  12.6k|  struct drm_buddy_block *parent;
  109|       |
  110|  12.6k|  parent = block->parent;
  111|  12.6k|  if (!parent)
  ------------------
  |  Branch (111:6): [True: 0, False: 12.6k]
  ------------------
  112|      0|      return NULL;
  113|       |
  114|  12.6k|  if (parent->left == block)
  ------------------
  |  Branch (114:6): [True: 5.27k, False: 7.37k]
  ------------------
  115|  5.27k|      return parent->right;
  116|       |
  117|  7.37k|  return parent->left;
  118|  12.6k|}

Example 2

https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/leds/led-triggers.c#L33

GCC gcov report:

        -:   30:static inline bool
        -:   31:trigger_relevant(struct led_classdev *led_cdev, struct led_trigger *trig)
        -:   32:{
       3*:   33:    return !trig->trigger_type || trig->trigger_type == led_cdev->trigger_type;
branch  0 taken 0 (fallthrough)
branch  1 taken 3
branch  2 never executed (fallthrough)
branch  3 never executed
branch  4 never executed (fallthrough)
branch  5 never executed
branch  6 never executed (fallthrough)
branch  7 never executed
branch  8 never executed (fallthrough)
branch  9 never executed
branch 10 never executed (fallthrough)
branch 11 never executed
        -:   34:}

Confusing part(s):

Branch coverage: a logical expression with 2 leaf conditions is reported to have 12 branches instead of 4

Clang Source-based Code Coverage report:

   30|       |static inline bool
   31|       |trigger_relevant(struct led_classdev *led_cdev, struct led_trigger *trig)
   32|      3|{
   33|      3|  return !trig->trigger_type || trig->trigger_type == led_cdev->trigger_type;
  ------------------
  |  Branch (33:9): [True: 3, False: 0]
  |  Branch (33:32): [True: 0, False: 0]
  ------------------
   34|      3|}

Example 3

https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/firmware/dmi_scan.c#L1068

GCC gcov report:

        5: 1068:    if (s == e || *e != '/' || !month || month > 12) {
branch  0 taken 5 (fallthrough)
branch  1 taken 0
branch  2 taken 5 (fallthrough)
branch  3 taken 0
branch  4 taken 0 (fallthrough)
branch  5 taken 5

Confusing part(s):

Branch coverage: a logical expression with 4 leaf conditions is reported to have 6 branches instead of 8.

Clang Source-based Code Coverage report:

 1068|      5|  if (s == e || *e != '/' || !month || month > 12) {
  ------------------
  |  Branch (1068:6): [True: 0, False: 5]
  |  Branch (1068:16): [True: 0, False: 5]
  |  Branch (1068:29): [True: 0, False: 5]
  |  Branch (1068:39): [True: 0, False: 5]
  ------------------

Example 4

https://elixir.bootlin.com/linux/v6.10-rc7/source/fs/xattr.c#L30

GCC gcov report:

     9120:   33:    while (*a_prefix && *a == *a_prefix) {
condition outcomes covered 4/6
condition  1 not covered (false)
condition  2 not covered (false)

Confusing part(s):

MC/DC: a logical expression with 2 leaf conditions is reported to have 6 condition outcomes instead of 4.

Clang Source-based Code Coverage report:

   33|  1.53k|  while (*a_prefix && *a == *a_prefix) {
  ------------------
  |---> MC/DC Decision Region (33:9) to (33:37)
  |
  |  Number of Conditions: 2
  |     Condition C1 --> (33:9)
  |     Condition C2 --> (33:22)
  |
  |  Executed MC/DC Test Vectors:
  |
  |     C1, C2    Result
  |  1 { F,  -  = F      }
  |  2 { T,  F  = F      }
  |  3 { T,  T  = T      }
  |
  |  C1-Pair: covered: (1,3)
  |  C2-Pair: covered: (2,3)
  |  MC/DC Coverage for Decision: 100.00%
  |
  ------------------

whentojump commented 1 month ago

A post of relevant topic from Rust blog: https://blog.rust-lang.org/inside-rust/2020/11/12/source-based-code-coverage.html

Note: It claims gcov relies on debuginfo to correlate IR to source code. I think this is incorrect:

In user space we can do gcov measurement without -g flag
For kernel we didn't turn CONFIG_DEBUG_INFO on either

syzkaller document on their "covearge": https://github.com/google/syzkaller/blob/master/docs/coverage.md which is also the opposite of "source-based".

whentojump commented 1 month ago

Reason 2: Unique-cause MC/DC

Example 1

https://elixir.bootlin.com/linux/v6.10-rc7/source/net/netlink/genetlink.c#L264

GCC gcov report:

    15898:  264:    if ((flags & GENL_CMD_CAP_DO && !full->doit) ||
condition outcomes covered 8/8
     7949:  265:        (flags & GENL_CMD_CAP_DUMP && !full->dumpit)) {

Clang Source-based Code Coverage report:

  264|  15.8k|  if ((flags & GENL_CMD_CAP_DO && !full->doit) ||
  265|  15.8k|      (flags & GENL_CMD_CAP_DUMP && !full->dumpit)) {
  ------------------
  |---> MC/DC Decision Region (264:6) to (265:50)
  |
  |  Number of Conditions: 4
  |     Condition C1 --> (264:7)
  |     Condition C2 --> (264:34)
  |     Condition C3 --> (265:7)
  |     Condition C4 --> (265:36)
  |
  |  Executed MC/DC Test Vectors:
  |
  |     C1, C2, C3, C4    Result
  |  1 { F,  -,  T,  F  = F      }
  |  2 { T,  F,  F,  -  = F      }
  |  3 { F,  -,  T,  T  = T      }
  |  4 { T,  T,  -,  -  = T      }
  |
  |  C1-Pair: covered: (1,4)
  |  C2-Pair: covered: (2,4)
  |  C3-Pair: not covered
  |  C4-Pair: covered: (1,3)
  |  MC/DC Coverage for Decision: 75.00%
  |
  ------------------

The expression is in the form of (C1 && C2) || (C3 && C4). To cover C3,

To let the second half be evaluated at all, (C1 && C2) must stay false
If C3 is true, in order for it to independently determine the outcome, C4 must also be true
Unique-cause MC/DC additionally requires: C1, C2 and C4 in one test vector must be the same with their values in the other test vector, or we don't care about the value due to short-circuit effects (- in LLVM notation)

Therefore, test vectors 2 and 3 meet the criteria of masking MC/DC, but not unique-cause MC/DC, as C1 is different in two test vectors.

Finding more examples by automation

Limit in our current script in terms of this task:
Connect decisions from GCC reports and from LLVM reports purely by line number, which will miss many pairs due to their different conventions. Some made-up examples:
return (a == 2 && b > 1 && c < 100) // LLVM puts its MC/DC measure here
        ?
        29
        :  // GCC puts its MC/DC measure here
        c;
return a  // LLVM puts its MC/DC measure here
       &&
       b
       || // GCC puts its MC/DC measure here
       c
       ;
For the same reason above we ignore lines with multiple decisions reported by either tool

Affected by all the oddity of GCC optimization as illustrated in "Reason 1: Source based"

Then we inspect all cases where:

GCC is over-reporting, and
the decision has a mixture of && and ||.

With Linux 6.10-rc7 defconfig + KUnit + gcov, GCC produced in total 2583 *.gcda files. We ran our script with 500 of them each time.

1-500:
501- 1000:
1001 - 1500: mm/percpu.c:1184, drivers/ata/libata-core.c:5000
1501 - 2000:
2001 - 2500: net/netlink/genetlink.c:264
2501 - 2853: kernel/time/clockevents.c:68, arch/x86/kernel/apic/apic.c:1600

We found the net/netlink/genetlink.c:264 case ("Example 1") again. Others, after manual inspection, were either due to non-deterministic execution, or due to GCC oddity.

Such examples seem hard to find in kernel coverage reports. It very much depends on execution, and the code-under-test needs to be covered in a very specific way. Nonetheless, with example 1 and many user-space examples (much easier to get), the demonstrated idea is not affected.

chuckwolber commented 1 week ago

These look like great examples. Will they appear in your LPC presentation?

whentojump commented 1 week ago

These look like great examples. Will they appear in your LPC presentation?

Yes! It will be the focus of this one https://lpc.events/event/18/contributions/1895/

xlab-uiuc / linux-mcdc