markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

`duperemove-0.14` `SIGSEGV`s in `fiemap_scan_extent()` #332

Closed trofi closed 10 months ago

trofi commented 10 months ago

Initially noticed as a duperemove-0.14 crash on a real data set. Reproducing it on real data takes a while, but the following duperemove-fuzz.bash script usually triggers it in under 20 seconds:

#!/usr/bin/env bash

duperemove_binary=$1
target_dir=$2

shift; shift

if [[ -z $duperemove_binary ]] || [[ -z $target_dir ]]; then
    echo "Usage: $0 </abs/path/to/duperemove> <directory> [duperemove opts]"
    exit 1
fi

# fail on any error
set -e

mkdir "$target_dir"
cd "$target_dir"

shopt -s nullglob

while :; do
    sync
    files=(*)
    f_count=${#files[@]}
    dst=$f_count

    case $((RANDOM % 4)) in
        0)  # copy existing file
            [[ $f_count -eq 0 ]] && continue

            cp_arg=""
            case $((RANDOM % 2)) in
                0) cp_arg=--reflink=always;;
                1) cp_arg=--reflink=never;;
            esac
            src=$((RANDOM % f_count))
            cp -v "$cp_arg" "$src" "$dst"
            ;;
        1) # create new file of 4x4KB distinct blocks
            printf "0%*d" 4095 "$dst"  > "$dst"
            printf "1%*d" 4095 "$dst" >> "$dst"
            printf "2%*d" 4095 "$dst" >> "$dst"
            printf "3%*d" 4095 "$dst" >> "$dst"
            ;;
        2) # run duperemove
            "$duperemove_binary" "$@" -rd -b 4096 "$target_dir"
            ;;
        3) # dd 4KB of one file into another
            [[ $f_count -eq 0 ]] && continue

            src=$((RANDOM % f_count))
            dst=$((RANDOM % f_count))
            [[ $src = $dst ]] && continue

            src_block=$((RANDOM % 3))
            dst_block=$((RANDOM % 3))
            dd "if=$src" "iseek=$src_block" "of=$dst" "oseek=$dst_block" bs=4096 count=1
            ;;
    esac
done
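
A side note on case 3 of the script: dd without conv=notrunc truncates the destination at the seek offset, so each such step should leave the destination shorter than its original four blocks. Below is a minimal sketch to check that in isolation. It is not part of the reproducer: size_after_dd is a hypothetical helper name, and skip=/seek= are used as the older spellings of iseek=/oseek=.

```bash
#!/usr/bin/env bash
# Sketch: show how dd changes the destination size with and without conv=notrunc.

# Hypothetical helper: build two 4-block (16 KiB) files, copy one 4 KiB block
# from src into dst at the given block index, and print dst's final size.
#   $1 = destination block index
#   $2 = optional extra dd operand, e.g. conv=notrunc
size_after_dd() {
    local dst_block=$1 extra=${2:-}
    local dir src dst
    dir=$(mktemp -d)
    src=$dir/src
    dst=$dir/dst
    head -c 16384 /dev/zero > "$src"
    head -c 16384 /dev/zero > "$dst"
    # $extra is intentionally unquoted so that an empty value adds no operand.
    dd if="$src" of="$dst" bs=4096 count=1 skip=0 seek="$dst_block" $extra 2>/dev/null
    wc -c < "$dst"
    rm -rf "$dir"
}

# The fuzz script picks dst_block from 0..2, so try each of those.
for block in 0 1 2; do
    echo "block $block: default=$(size_after_dd "$block")" \
         "notrunc=$(size_after_dd "$block" conv=notrunc)"
done
```

With GNU dd I would expect the default sizes to be 4096, 8192 and 12288 bytes, and 16384 bytes with conv=notrunc, so every case 3 step in the fuzz loop shrinks its destination file.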

Running:

$ cd /tmp/
$ fallocate -l 10G btrfs.img
$ mkfs.btrfs btrfs.img
$ sudo mount -onoatime,compress=zstd btrfs.img m/
$ sudo chown $(whoami) m/

$ time ./duperemove-fuzz.bash ~/dev/git/duperemove/duperemove $PWD/m/dr -q --batchsize=0 --dedupe-options=same,partial --hashfile=/tmp/hf.db
...
Found 23 identical extents.
[########################################]
Search completed with no errors.
Simple read and compare of file data found 3 instances of extents that might benefit from deduplication.
[0x784bc0] Dedupe for file "/tmp/m/dr/13" had status (-22) "Invalid argument".
[0x77d220] Dedupe for file "/tmp/m/dr/13" had status (-22) "Invalid argument".
[0x77d220] Dedupe for file "/tmp/m/dr/13" had status (-22) "Invalid argument".
./duperemove-fuzz.bash: line 27: 2003552 Segmentation fault      (core dumped) "$duperemove_binary" "$@" -rd -b 4096 "$target_dir"

real    0m1.152s
user    0m0.475s
sys     0m0.590s
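
Because the script uses set -e, it stops on any non-zero exit from duperemove, not only on crashes. A possible refinement, sketched here under the assumption that only signal deaths are interesting, is to classify the exit status: a shell reports a command killed by signal N as status 128 + N. run_and_classify is a hypothetical helper name, not something the script above contains.

```bash
#!/usr/bin/env bash
# Sketch: tell "killed by a signal" apart from an ordinary non-zero exit.

# Hypothetical helper: run the given command and print how it ended.
# Heuristic only: a program may also choose to exit with a status above 128.
run_and_classify() {
    local status=0
    "$@" || status=$?
    if (( status > 128 )); then
        # bash's builtin "kill -l N" prints the signal name without the SIG prefix.
        echo "killed by SIG$(kill -l "$((status - 128))")"
    elif (( status != 0 )); then
        echo "exited with status $status"
    else
        echo "ok"
    fi
}

run_and_classify true
run_and_classify false
# Simulate a crash; "ulimit -c 0" avoids leaving a core file behind.
run_and_classify bash -c 'ulimit -c 0; kill -SEGV $$'
```

In the fuzz loop, case 2 could call such a helper and stop only when the result starts with "killed by", which would let the loop continue past ordinary duperemove errors.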

gdb says the result pointer is NULL there:

$ coredumpctl -r debug
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Section `.reg-xstate/2003586' in core file too small.
#0  0x0000000000409505 in fiemap_scan_extent (extent=extent@entry=0x7f1c7c00f4b0) at filerec.c:343
343             extent->e_poff = result->fe_physical;
[Current thread is 1 (Thread 0x7f1c56ffd6c0 (LWP 2003586))]

(gdb) bt
#0  0x0000000000409505 in fiemap_scan_extent (extent=extent@entry=0x7f1c7c00f4b0) at filerec.c:343
#1  0x000000000040d7ea in extent_dedupe_worker (kern_bytes=0x7f1c56ffcd00,
    fiemap_bytes=<synthetic pointer>, dext=0x7f1c7c031a90) at run_dedupe.c:518
#2  dedupe_worker (priv=0x7f1c7c031a90, counts=0x7fff9893c750) at run_dedupe.c:539
#3  0x00007f1c9441e5ba in g_thread_pool_thread_proxy (data=<optimized out>)
    at ../glib/gthreadpool.c:350
#4  0x00007f1c9441dc6d in g_thread_proxy (data=0x784bc0) at ../glib/gthread.c:831
#5  0x00007f1c94006084 in start_thread (arg=<optimized out>) at pthread_create.c:444
#6  0x00007f1c9408860c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

(gdb) p *extent
$1 = {e_parent = 0x7f1c7c031a90, e_loff = 4096, e_file = 0x77e6b0, e_list = {next = 0x7f1c7c031b18,
    prev = 0x7f1c7c03fca8}, e_node = {__rb_parent_color = 139760316447929, rb_right = 0x0,
    rb_left = 0x0}, e_poff = 0, e_plen = 0, e_shared_bytes = 0}
(gdb) p *result
Cannot access memory at address 0x0

JackSlateur commented 10 months ago

Hello @trofi

I was able to reproduce:

The dedupe ioctl fails (Invalid argument), but later on fiemap_scan_extent() is still called to check the new extent mapping. As the file is truncated, there are no extents, which matches the NULL result pointer seen in gdb.

Anyway, I just found a typo, fixed here: https://github.com/markfasheh/duperemove/commit/9912c03c16af67e33b5dc36052e8faed9a17749d

This should fix your issue. Thank you!

trofi commented 10 months ago

The change fixes the crash for me on both the synthetic and the real data sets. Thank you!