[FEATURE] Makes entries being loaded in parallel

HorlogeSkynet / archey4

:computer: Maintained fork of the original Archey (Linux) system tool

https://git.io/archey4

GNU General Public License v3.0

[FEATURE] Makes entries being loaded in parallel #74

Closed HorlogeSkynet closed 4 years ago

HorlogeSkynet commented 4 years ago

This patch is mainly inspired by the work of @ingrinder (see 2bbc2dae). Execution may be up to twice as fast.

This behavior can be disabled with the new parallel_loading configuration option.
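As a rough illustration of how such a switch could work, here is a minimal sketch, not the actual Archey code: `load_entries` and `entry_loaders` are hypothetical names, and only the `parallel_loading` flag comes from this patch. When the flag is off, the built-in `map` is used, so the calling code stays the same in both modes.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import ExitStack


def load_entries(entry_loaders, parallel_loading=True):
    """Call every loader and return the results in the original order.

    `entry_loaders` is a hypothetical list of zero-argument callables,
    standing in for whatever builds each entry.
    """
    with ExitStack() as cm_stack:
        if parallel_loading and entry_loaders:
            # The executor is shut down when the stack exits.
            executor = cm_stack.enter_context(ThreadPoolExecutor())
            mapper = executor.map
        else:
            # Sequential fallback: same call shape, no threads involved.
            mapper = map

        # Consume the iterator inside the `with` block so that the
        # results are collected before the executor is torn down.
        return list(mapper(lambda loader: loader(), entry_loaders))
```

Both `executor.map` and the built-in `map` yield results in submission order, so the output order does not depend on which entry finishes first.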

See #68.

How has this been tested ?

Locally & test cases.

Types of changes :

[X] New feature (non-breaking change which adds functionality)
[X] Breaking change [?] (fix or feature that would cause existing functionality to change)

Checklist :

[X] [IF NEEDED] I have updated the README.md file accordingly ;
~~[ ] [IF NEEDED] I have updated the test cases (which pass) accordingly ;~~
[X] My changes look good ;
[X] I agree that my code may be modified in the future ;
[X] My code follows the code style of this project (PEP8).

ingrinder commented 4 years ago

Since we're doing I/O-heavy work, should we use len(Entries) (or a static value, currently 18) as our ThreadPoolExecutor's max_workers? That way, even if nearly all entries were blocked on I/O, there would still be a thread available for another entry to start processing while waiting on the others, which should give the quickest possible execution time. Nothing we do is heavy enough to justify limiting the number of threads, and since Python 3.8 the executor reuses idle worker threads, so it never spawns more threads than are needed to complete all of the work in parallel.

HorlogeSkynet commented 4 years ago

You're actually right :+1: So what about mixing our needs with "new" Python 3.8+ behavior ?

```diff
            executor = cm_stack.enter_context(
-                ThreadPoolExecutor(max_workers=((os.cpu_count() or 1) * 5))
+                ThreadPoolExecutor(max_workers=min(len(enabled_entries), (os.cpu_count() or 1) + 4))
            )
```

Reference : https://github.com/python/cpython/pull/13618
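For illustration, the bound proposed in the diff above can be isolated in a small helper. This is only a sketch: `compute_max_workers` and its parameters are hypothetical names, and the `max(1, ...)` guard is an addition not present in the diff, included because `ThreadPoolExecutor` raises `ValueError` when `max_workers` is 0 (which would happen with no enabled entries).

```python
import os


def compute_max_workers(nb_enabled_entries, cpu_count=None):
    """Return a worker count capped by both the entry count and CPU count + 4.

    `cpu_count` may be passed explicitly (e.g. for testing); otherwise it is
    read from `os.cpu_count()`, which can return `None` on some platforms.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count()

    # Same expression as in the diff: never more threads than entries,
    # never more than (CPU count + 4).
    bound = min(nb_enabled_entries, (cpu_count or 1) + 4)

    # Assumption added for this sketch: keep at least one worker so that
    # ThreadPoolExecutor does not reject the value.
    return max(1, bound)
```

On a 4-core machine with 18 enabled entries this gives 8 workers, against 20 with the previous `cpu_count * 5` formula; with only 3 enabled entries it gives 3.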

HorlogeSkynet commented 4 years ago

Nice catch ! Thanks :slightly_smiling_face: