pkolaczk / fclones

Efficient Duplicate File Finder

High CPU usage while hardlinking files #138

Closed. hans-helmut closed this issue 2 years ago

hans-helmut commented 2 years ago

Hello,

While fclones was one of the few duplicate finders that managed to find all duplicates in my backup in a reasonable amount of time (under 2 days), hardlinking is very slow.

# fclones --version
fclones 0.25.0

top -H -c shows 3 threads with high CPU usage:

# top -H -c

top - 15:43:21 up 5 days, 30 min,  5 users,  load average: 6,45, 7,32, 7,28
Threads: 346 total,   5 running, 341 sleeping,   0 stopped,   0 zombie
%CPU(s): 32,8 us, 43,6 sy,  0,0 ni,  0,0 id, 23,5 wa,  0,0 hi,  0,1 si,  0,0 st
MiB Mem :  32055,4 total,    372,4 free,   6103,3 used,  25579,7 buff/cache
MiB Swap:  10240,0 total,   5483,6 free,   4756,4 used.  25498,2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  39209 root      20   0  346956  24516   4856 R  99,9   0,1 102:39.10 fclones link
  39210 root      20   0  346956  24516   4856 R  99,9   0,1 106:14.93 fclones link
  39211 root      20   0  346956  24516   4856 R  99,9   0,1  95:29.37 fclones link
[...]

Attaching strace -p <PID> to the threads shows that one thread is calling

statx(AT_FDCWD, "/filename/...", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0600, stx_size=47320, ...}) = 0

for different versions of a file, while all other threads are calling

sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0

in a loop. So there seems to be some active waiting (busy waiting) going on while the files are being deleted.

Environment

pkolaczk commented 2 years ago

This is quite likely caused by the fact that the internal sequence of commands is generated in a single thread, while the commands are processed in parallel. Generating the stream of commands becomes the bottleneck, and the deduplication threads are simply fighting for work, actively spinning (this is how rayon works).
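To illustrate the effect, here is a minimal standalone sketch (not the actual fclones code; the delay and command strings are made up): a single-threaded producer feeding rayon workers, e.g. through par_bridge(), leaves the workers spinning whenever producing an item is slower than consuming it, which matches the sched_yield() loops seen in strace.

```rust
use rayon::prelude::*;
use std::{thread, time::Duration};

fn main() {
    // Sequential producer: stands in for the single thread that walks the
    // duplicate groups and stats files (the statx calls seen in strace).
    let commands = (0..10_000u64).map(|i| {
        thread::sleep(Duration::from_micros(50)); // simulated per-file cost (made up)
        format!("link dup_{i} -> original_{i}")   // made-up command text
    });

    // Parallel consumers: rayon worker threads pull items off the bridge.
    // When the producer cannot keep up, the idle workers spin/yield waiting
    // for work, showing up as sched_yield() loops and high CPU usage.
    commands.par_bridge().for_each(|cmd| {
        let _ = cmd.len(); // per-command work is cheap compared to producing it
    });
}
```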

Generating commands was made single-threaded due to another feature request about making the stream of commands match the order of the input files. I need to find a different way.

Two ideas here:

  1. Use rayon's ParallelIterator to generate the commands, but record the original position in each item and then put the commands back into the correct order when printing them (see the sketch after this list).
  2. Switch from rayon to async, which would allow a lot more flexibility.
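A rough sketch of idea 1 (hypothetical names, not the actual fclones code; generate_command and the input list are placeholders): tag each result with its input index during the parallel map, then restore the order before printing. For an indexed source such as a Vec, rayon's collect() already preserves order, so the explicit index mainly matters if results are streamed or buffered out of order.

```rust
use rayon::prelude::*;

// Hypothetical stand-in for the real per-file command generation.
fn generate_command(path: &str) -> String {
    format!("ln -f {path}")
}

fn main() {
    let inputs = vec!["a.bin", "b.bin", "c.bin", "d.bin"];

    // Generate commands in parallel, tagging each result with its input position.
    let mut commands: Vec<(usize, String)> = inputs
        .par_iter()
        .enumerate()
        .map(|(idx, path)| (idx, generate_command(path)))
        .collect();

    // Put the commands back into input order before printing them.
    commands.sort_by_key(|(idx, _)| *idx);
    for (_, cmd) in commands {
        println!("{cmd}");
    }
}
```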
hans-helmut commented 2 years ago

This seems to be the matching rayon issue.