delphix / sdb

The Slick Debugger
Apache License 2.0
57 stars 20 forks source link

DLPX-87531 Fix SDB test regression due to drgn API change #339

Closed sdimitro closed 1 year ago

sdimitro commented 1 year ago

There were a few changes recently in drgn involving data structures used for holding pointers to type information data. These changes introduced subtle differences in the APIs behavior, one of which affected our test suite.

Before when a type was looked up by name and there was no externed/global type with that name but multiple local (source file-specific ones), then one of those local ones was returned consistently. With the recent drgn changes, the behavior is that one of those local types will be randomly picked making one of our regression tests fail sometimes while other times it passed.

The specific test is ptype 'struct v' which we expect to output:

struct v {
    uint8_t b[32];
}

..sometimes returns:

struct v {
    uint8_t b[64];
}

The reason is that there are multiple definitions of this struct in the vdev_raidz code of ZFS:

~/repos/zfs$ git grep -B 5 -A 2 "struct v "
...snip...
module/zfs/vdev_raidz_math_avx2.c-
module/zfs/vdev_raidz_math_avx2.c-extern const uint8_t gf_clmul_mod_lt[4*256][16];
module/zfs/vdev_raidz_math_avx2.c-
module/zfs/vdev_raidz_math_avx2.c-#define       ELEM_SIZE 32
module/zfs/vdev_raidz_math_avx2.c-
module/zfs/vdev_raidz_math_avx2.c:typedef struct v {
module/zfs/vdev_raidz_math_avx2.c-      uint8_t b[ELEM_SIZE] __attribute__((aligned(ELEM_SIZE)));
module/zfs/vdev_raidz_math_avx2.c-} v_t;
--
module/zfs/vdev_raidz_math_avx512bw.c-
module/zfs/vdev_raidz_math_avx512bw.c-extern const uint8_t gf_clmul_mod_lt[4*256][16];
module/zfs/vdev_raidz_math_avx512bw.c-
module/zfs/vdev_raidz_math_avx512bw.c-#define   ELEM_SIZE 64
module/zfs/vdev_raidz_math_avx512bw.c-
module/zfs/vdev_raidz_math_avx512bw.c:typedef struct v {
module/zfs/vdev_raidz_math_avx512bw.c-  uint8_t b[ELEM_SIZE] __attribute__((aligned(ELEM_SIZE)));
module/zfs/vdev_raidz_math_avx512bw.c-} v_t;
...snip...

This patch gets rid of this regression test as it causes occassional failures AND it's not really needed as the functionality that it tests is already covered by other tests.

A side-improvement of this patch is also printing in the Github test result output the differences between what we expect the output to be vs what we get from our reference crash dumps.