pkolaczk / fclones

Efficient Duplicate File Finder
MIT License

Don't scan sysfs on startup #69

Closed pkolaczk closed 3 years ago

pkolaczk commented 3 years ago

To set proper concurrency levels, fclones needs to know the name of the parent physical device that stores a given partition. We don't want to send parallel requests to a single rotational drive when the files are spread across several of its partitions. Unfortunately, block_utils, which was used to look up the parent device, performed an excessive number of sysfs traversals, which slowed down fclones initialization.

This commit switches from querying sysfs to a heuristic based on Linux device naming conventions. It might not be as accurate as the old function, but an inaccuracy there can at worst cause an incorrect number of threads to access the device. Fortunately, if that ever happens, the user can manually override the sizes of the thread pools.
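For illustration, a name-based heuristic of this kind could look roughly like the sketch below. It is not the code from this commit: the function name `parent_device` and the list of prefixes are assumptions made for the example. It relies on two common Linux conventions: disks named with trailing letters (`sda`, `vdb`) append the partition number directly, while disks whose own name ends in a digit (`nvme0n1`, `mmcblk0`) insert a `p` before the partition number.

```rust
/// Hypothetical helper (not the actual fclones function): guesses the name of
/// the parent disk from a partition name, using naming conventions only.
/// Names that do not look like partitions are returned unchanged.
fn parent_device(name: &str) -> &str {
    // Drop the trailing digits, which would be the partition number.
    let stem = name.trim_end_matches(|c: char| c.is_ascii_digit());
    if stem.len() == name.len() {
        // No trailing digits at all, e.g. "sda": already a whole disk.
        return name;
    }

    // "<disk ending in a digit>p<N>", e.g. "nvme0n1p2" or "mmcblk0p1".
    if let Some(base) = stem.strip_suffix('p') {
        if base.ends_with(|c: char| c.is_ascii_digit()) {
            return base;
        }
    }

    // "<letter-named disk><N>", e.g. "sda1" or "xvda2".
    // Assumed prefix list; a real implementation may need more entries.
    const LETTER_NAMED: [&str; 4] = ["sd", "hd", "vd", "xvd"];
    if LETTER_NAMED.iter().any(|p| name.starts_with(*p)) {
        return stem;
    }

    // Anything else ("nvme0n1", "loop0", "dm-0") is treated as a whole device.
    name
}
```

With this sketch, `parent_device("sda1")` and `parent_device("sda2")` both yield `"sda"`, so files on either partition would be attributed to the same physical disk. Device-mapper and RAID names such as `dm-0` pass through unchanged, which is one place where a purely name-based approach is less accurate than a sysfs lookup.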