markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

Hyperthreading detection bugs? #260

Closed KhalilSantana closed 3 years ago

KhalilSantana commented 3 years ago

Hello, I've been using duperemove for quite a while now, and I'd like to thank you for your work! Now, to the point:

Current Behaviour

Duperemove ignores 2 out of 4 cores.

Background

While wondering why my dedupe was slow, I noticed that duperemove was only using 2 cores of my CPU rather than all 4. I thought this might be a load-balancing or load-target setting, but no such feature exists. I did find a likely culprit, though. The manual states:

Note: Hyperthreading can adversely affect performance of the extent finding stage. If duperemove detects an Intel CPU with hyperthreading it will use half the number of cores reported by the system for cpu bound tasks.

My PC has an Intel i5-4440 (datasheet) processor, so no hyperthreading there, it's a simple 4C/4T CPU.

So I suspect there's a bug in the hyperthreading detection code/logic.

Expected behavior

Duperemove uses all available resources (CPUs) in order to finish its task quicker.

Commands run

% sudo duperemove -r -d -h --hashfile=/.rootfs.hashfile /var /usr /opt /srv

Affected version

Oddly enough, pacman and duperemove disagree on what version of the package is in use. I assume this is just a 'maintainer forgot to bump the version' issue:

khalil:~ % pacman -Si duperemove
Repository      : community
Name            : duperemove
Version         : 0.11.2-1
Description     : Btrfs extent deduplication utility
Architecture    : x86_64
URL             : https://github.com/markfasheh/duperemove
Licenses        : GPL
Groups          : None
Provides        : None
Depends On      : glib2  sqlite
Optional Deps   : None
Conflicts With  : None
Replaces        : None
Download Size   : 74.05 KiB
Installed Size  : 243.43 KiB
Packager        : Robin Broda <robin@broda.me>
Build Date      : Mon Dec 7 09:00:01 2020
Validated By    : MD5 Sum  SHA-256 Sum  Signature

khalil:~ % duperemove --version
duperemove v0.12.dev

lscpu -p

khalil:~ % lscpu -p
# The following is the parsable format, which can be fed to other
# programs. Each different item in every column has an unique ID
# starting from zero.
# CPU,Core,Socket,Node,,L1d,L1i,L2,L3
0,0,0,0,,0,0,0,0
1,1,0,0,,1,1,1,0
2,2,0,0,,2,2,2,0
3,3,0,0,,3,3,3,0
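
For readers who want to check the logical vs. physical core count from this kind of output themselves, here is a minimal Python sketch. It assumes the column order shown in the header above (CPU, Core, Socket first) and counts distinct (socket, core) pairs as physical cores. The function name `count_cpus` is made up for illustration, and this is not duperemove's own detection code.

```python
def count_cpus(lscpu_output: str) -> tuple[int, int]:
    """Return (logical, physical) CPU counts from `lscpu -p` text.

    Assumes the first three columns are CPU, Core, Socket,
    as in the output quoted in this report.
    """
    logical = set()
    physical = set()
    for line in lscpu_output.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' comment header.
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        cpu, core, socket = fields[0], fields[1], fields[2]
        logical.add(cpu)
        # Core IDs can repeat across sockets, so key on both.
        physical.add((socket, core))
    return len(logical), len(physical)


SAMPLE = """\
# CPU,Core,Socket,Node,,L1d,L1i,L2,L3
0,0,0,0,,0,0,0,0
1,1,0,0,,1,1,1,0
2,2,0,0,,2,2,2,0
3,3,0,0,,3,3,3,0
"""

logical, physical = count_cpus(SAMPLE)
print(logical, physical, "ht is on" if logical > physical else "ht is off")
```

With the output above, every logical CPU maps to its own core, so the two counts are equal and hyperthreading is off.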

lorddoskias commented 3 years ago

Hello, can you tell me exactly which version of duperemove you are using? Can you try the latest head, which includes https://github.com/markfasheh/duperemove/commit/c7106b7fc2dac785b89d7d1943e479719ea5e133 ? Following this commit, it should use all your physical cores. Also, can you provide the output of lscpu -p?

KhalilSantana commented 3 years ago

@lorddoskias, I've edited my comment with the requested information. I'll try to build master locally and report back.

KhalilSantana commented 3 years ago

Hello. After testing this for a while, I think duperemove got deadlocked/stuck, and by coincidence it was using only two cores at that moment.

After that incident, I deleted the hashfile and ran the Arch Linux-packaged version (the same one from the original report) again, this time with the --cpu-threads=$(nproc) flag, and it ran fine. Then I compiled the package from source (HEAD: d4f1ebf) and tested it back and forth; it never repeated that behaviour.

To test a theory, I took the same folders I was deduping and broke the reflinks (using btrfs fi defrag) on the ones I don't have snapshots of (/srv in this case), then deleted the hashfile again and tried both the Arch package and the compiled version. Neither displayed the same behaviour again.

So I guess I was too eager to point fingers at the hyperthreading detection logic, but the question remains: what could have caused duperemove to get stuck in the first place?

lorddoskias commented 3 years ago

There was a bug with the v2 hash file (aka block-based dedupe) where it would constantly hang if a truncated file was scanned for dedupe. You can easily verify the number of threads duperemove is using by invoking it with the --debug option, i.e.:

./duperemove --debug
Detected 24 logical and 12 physical cpus (ht is on).

It would show you the number of logical and physical cores detected.
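
If you want to check this line from a script, the following Python sketch pulls the counts out of it. It assumes the line is formatted exactly like the one quoted above. The function name `parse_detected` is made up for illustration and is not part of duperemove.

```python
import re

# Pattern based solely on the debug line quoted in this thread.
DEBUG_LINE = re.compile(
    r"Detected (\d+) logical and (\d+) physical cpus \(ht is (on|off)\)\."
)


def parse_detected(line: str):
    """Return (logical, physical, ht_on), or None if the line does not match."""
    match = DEBUG_LINE.search(line)
    if match is None:
        return None
    logical = int(match.group(1))
    physical = int(match.group(2))
    ht_on = match.group(3) == "on"
    return logical, physical, ht_on


print(parse_detected("Detected 24 logical and 12 physical cpus (ht is on)."))
```

If the reported physical count is lower than the number of cores your CPU actually has, that would point at the detection logic rather than at a hang.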

KhalilSantana commented 3 years ago

duperemove --debug reports the correct number of cores/threads.

Detected 4 logical and 4 physical cpus (ht is off).

I'll close this issue, as I can't reproduce the bug/deadlock from the original report. If I ever face this again, I'll either reopen this issue or create a new one.

Thanks for your assistance!