Documentation: Suggest limiting core count on very-multicore machines to avoid kernel bug?

charles-dyfis-net commented 5 years ago

On a 2-socket, 48-core system I've consistently had my I/O subsystem irrecoverably hang in less than 24 hours of operation; reproduced both with kernel 4.14.78 and 4.18.16. Using --thread-count 4 makes this go away.

Perhaps we should either:

1. Prominently advise using --thread-count on very-multicore hardware in the "kernel bugs" wiki page
2. Have a (default?) maximum number of threads limiting the effects of --thread-factor

Zygo commented 5 years ago

4.14.81 and 4.19.2 contain a lot of btrfs fixes which might help...but more likely won't.

I'm OK with a doc change.

kakra commented 5 years ago

Does it make sense to have so many IO threads running? Should there be an upper limit? I wonder if there's any sense in having more than disk_count to 2*disk_count threads running for IO. Btrfs currently assigns stripes by PID modulo disk_count, so with an upper bound of 2x the disk count we are probably already putting IO to all disks at once.
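To illustrate the modulo argument, here is a minimal sketch. It is a simplified model of the selection rule as described in this comment, not the actual kernel code. The function names, the disk count of 4, and the assumption that worker PIDs are consecutive are all hypothetical.

```python
def disk_for_pid(pid: int, disk_count: int) -> int:
    """Simplified model: choose a disk by PID modulo the disk count."""
    return pid % disk_count


def disks_touched(pids, disk_count: int) -> set:
    """Return the set of disk indices that the given PIDs would map to."""
    return {disk_for_pid(pid, disk_count) for pid in pids}


disk_count = 4  # hypothetical array size
# Assumption: worker PIDs are consecutive. This is common but not guaranteed.
pids = range(1000, 1000 + 2 * disk_count)
covered = disks_touched(pids, disk_count)
print(sorted(covered))  # prints [0, 1, 2, 3]
```

Under this model, any disk_count consecutive PIDs already map to every disk, so 2*disk_count threads leave some margin for non-consecutive PIDs.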

But probably still makes sense for the number crunching threads like hashing...

Zygo commented 5 years ago

The IO threads are mixed in with the hashing threads, so it's a bit of a mess at the moment. To really get IO and hashing usefully separated we'd need to rewrite most of the code, implement multiple distinct thread pools and a scheduler, and we'd need one scheduler for spinning disks and a different one for SSDs. And then it would all go to hell if we ever found a match for anything (suddenly we need locks on multiple filesystem objects and disks, probably just end up effectively single-threaded across the entire filesystem).

Experimentally I've found bees goes a little faster if the worker thread count is higher than the disk count, but no faster (maybe even a little slower) if the worker thread count is higher than the CPU core count (at least for the first 8 cores).

I've also found that things like limiting the number of threads executing dedupe or LOGICAL_INO ioctls might help with system stability (though the benefit is small compared to noise in my test environment). Perhaps more --workaround-* options are in order for performance-vs-danger tradeoffs.
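As a rough illustration of limiting how many threads run an expensive ioctl at once, the sketch below gates a stand-in function with a semaphore. It is not bees code: the cap of 2, the names, and the sleep that stands in for the ioctl are all assumptions, and no real I/O is performed.

```python
import threading
import time

MAX_CONCURRENT_IOCTLS = 2  # hypothetical cap, not a value taken from bees

ioctl_slots = threading.BoundedSemaphore(MAX_CONCURRENT_IOCTLS)
stats_lock = threading.Lock()
active = 0  # number of workers currently inside the gated section
peak = 0    # highest value of `active` observed so far


def expensive_ioctl(worker_id: int) -> None:
    """Stand-in for a dedupe or LOGICAL_INO call; only sleeps briefly."""
    global active, peak
    with ioctl_slots:  # blocks while MAX_CONCURRENT_IOCTLS workers are inside
        with stats_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # simulated work
        with stats_lock:
            active -= 1


def run_workers(count: int) -> int:
    """Start `count` worker threads, wait for all of them, return the peak."""
    threads = [threading.Thread(target=expensive_ioctl, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak


observed_peak = run_workers(8)
print(observed_peak <= MAX_CONCURRENT_IOCTLS)  # prints True
```

The other worker threads keep running outside the gated section, so a cap like this only serializes the calls that are suspected of causing trouble.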

I can put in a soft limit, so --thread-factor would use no more than 8 cores. --thread-count would still let the user use any number they liked.
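The soft limit could look roughly like the sketch below. This is an illustration under assumptions, not the bees implementation: the function name, the parameter names, and the rounding are hypothetical. Only the 8-core cap and the rule that --thread-count overrides it come from the comment above.

```python
import os

SOFT_CORE_LIMIT = 8  # cap on the cores counted by --thread-factor, per the comment above


def worker_threads(thread_count=None, thread_factor=1.0, cores=None,
                   core_limit=SOFT_CORE_LIMIT):
    """Return the number of worker threads to start (hypothetical helper).

    thread_count  -- explicit value, as with --thread-count; never capped
    thread_factor -- multiplier per core, as with --thread-factor
    cores         -- detected core count; defaults to os.cpu_count()
    """
    if thread_count is not None:
        return max(1, thread_count)
    if cores is None:
        cores = os.cpu_count() or 1
    # Assumption: the cap applies to the core count before multiplying.
    return max(1, int(min(cores, core_limit) * thread_factor))


print(worker_threads(cores=48))                     # prints 8
print(worker_threads(thread_factor=2.0, cores=48))  # prints 16
print(worker_threads(thread_count=48, cores=48))    # prints 48
```

Applying the cap to the core count rather than to the final thread count means a user who raises the factor still gets more threads, just never scaled by more than 8 cores.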

kakra commented 5 years ago

So could a number like (disk_count+hardware_concurrency)/2 be a good heuristic? For the example system it would still create at least 12 threads; I wonder whether that would be stable.
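For concreteness, the proposed heuristic can be written as the small sketch below. The function name, the use of integer division, and the disk count of 4 are assumptions, since the thread does not state how many disks the 48-core system has.

```python
def heuristic_threads(disk_count: int, hardware_concurrency: int) -> int:
    """Average of disk count and core count, rounded down (assumed rounding)."""
    return (disk_count + hardware_concurrency) // 2


# Hypothetical example: 4 disks on the 48-core machine from the original report.
example = heuristic_threads(disk_count=4, hardware_concurrency=48)
print(example)  # prints 26
```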

Zygo commented 5 years ago

I wouldn't try to guess without running a lot of performance experiments on specific hardware configurations. Even if we did that, changing the bees code could instantly invalidate all that data. There are huge gains still possible from relatively small code changes, and I have big code changes planned too.

The number of workers is configurable, and the default (after adding a soft limit for people with huge multi-socket systems) works OK. Users who know better can change it or test assorted values.

Zygo commented 5 years ago

We have a core-count limit (the second option in the original issue). Can we close this?

charles-dyfis-net commented 5 years ago

I certainly consider it fully addressed.

Zygo / bees

Documentation: Suggest limiting core count on very-multicore machines to avoid kernel bug? #91