rolffokkens / bdsync

a fast block device synchronizing tool
GNU General Public License v2.0
109 stars 26 forks source link

Fix broken ASM exposed by LTO #25

Closed error10 closed 4 years ago

error10 commented 4 years ago

I'm maintaining the Fedora package for bdsync and the following bug report and patch was submitted to me:


I'm working on the F33 feature which turns on link time optimization by default (LTO) for most package builds. In simplest terms LTO defers analysis, optimization and code generation until link time.

The deeper analysis sometimes exposes latent package bugs and that has happened with bdsync.

Specifically bdsync will fail its selftests and thus the build fails.

Ultimately the problem is an incorrect asm which implements checkzero.

The asm looks like this:

int checkzero(void *p, int len)
{
    int is_zero;
    __asm__ (
        "cld\n"
        "xorb %%al, %%al\n"
        "repz scasb\n"
        : "=c" (is_zero)
        : "c" (len), "D" (p)
        : "eax", "cc"
    );
    return !is_zero;
}

The problem is this asm reads memory, but does not express that in the asm. As a result GCC's data dependency graph is incomplete.

LTO enables inlining the call to check_zero from gen_hashes:

   while (nstep) {
        flush_wr_queue (pwr_queue, 0);
        fill_rd_queue (ctx, prd_queue, handler);

        print_progress (ctx, progress, start);

        nrd = vpread (devp, fbuf, step, start);

/* Kills performance; really slow syscall:
        posix_fadvise64 (fd, start + step, RDAHEAD, POSIX_FADV_WILLNEED);
*/
        if (zh && checkzero (fbuf, step)) {
            memcpy (buf, zh->hash, zh->hashsize);
        [ ... ]

Now because the asm does not indicate it reads from fbuf the compiler is free to move the asm before* the call to vpread. And once that happens, the code doesn't work in the expected manner.

The fix is simple, note that the asm reads memory. The standard way to do this looks like this:

diff -Nrup a/checkzero.c b/checkzero.c
--- a/checkzero.c       2019-04-28 09:54:55.000000000 -0600
+++ b/checkzero.c       2020-05-20 13:48:46.469043571 -0600
@@ -14,7 +14,7 @@ int checkzero(void *p, int len)
         "xorb %%al, %%al\n"
         "repz scasb\n"
         : "=c" (is_zero)
-        : "c" (len), "D" (p)
+        : "c" (len), "D" (p), "m" (*(const char (*)[]) p)
         : "eax", "cc"
     );
     return !is_zero;

With proper dataflow for the asm, the compiler knows that the asm for checkzero can't move before the call to vpread and the code works correctly.

If you could fix this upstream and pull the fix into Fedora rawhide it would be greatly appreciated.

rolffokkens commented 4 years ago

Thanks for reporting and fixing this!

rolffokkens commented 4 years ago

Created a new release https://github.com/rolffokkens/bdsync/releases/tag/v0.11.2