GuillaumeGomez / sysinfo

Cross-platform library to fetch system information
MIT License

multithreading feature performance issues #1352

Open Vortetty opened 1 month ago

Vortetty commented 1 month ago

Describe the bug

The issue is present on CachyOS, Ubuntu, and Gentoo systems; it is untested on Windows and macOS. In testing, it appears that for short-running programs that call sysinfo, the rayon overhead can double or triple the program's runtime. This is undocumented, as is the multithreading option itself. All testing was done in release mode on version 0.31.4 with nothing except a terminal running, and benchmarked using hyperfine.

multithreading Disabled:

Benchmark 1: target/release/yatfpbnws
  Time (mean ± σ):      52.3 ms ±   2.6 ms    [User: 12.5 ms, System: 39.2 ms]
  Range (min … max):    49.3 ms …  60.9 ms    100 runs

multithreading Enabled:

Benchmark 1: target/release/yatfpbnws
  Time (mean ± σ):     226.2 ms ±  20.8 ms    [User: 35.6 ms, System: 147.3 ms]
  Range (min … max):   179.4 ms … 286.4 ms    100 runs

To Reproduce

https://github.com/Vortetty/YATFPBNWS/tree/master

Changing sysinfo to use default features increases the runtime about 4x in the above example on my main system (9900X, 32 GB RAM). The example only runs on Linux systems and may not work on all of them, as it is still being worked on, but it works fine on my 3 test systems.

GuillaumeGomez commented 1 month ago

That is pretty bad indeed. Now I need some extra info: which API are you using and with which arguments? The idea would be to only trigger multi-threading above a given threshold.
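A minimal sketch of what such a threshold could look like, using only the standard library. This is not sysinfo's implementation: `PARALLEL_THRESHOLD`, `refresh_one`, and `refresh_all` are hypothetical names, and the cutoff value is an arbitrary assumption that would need to be tuned with real measurements.

```rust
use std::thread;

/// Hypothetical cutoff: below this many items, stay on the calling thread.
const PARALLEL_THRESHOLD: usize = 256;

/// Stand-in for the per-item work (for example, reading one process entry).
fn refresh_one(id: u64) -> u64 {
    id.wrapping_mul(2_654_435_761) % 1_000
}

/// Handle small batches sequentially; split large batches across scoped
/// threads so the thread startup cost is only paid when it can be amortized.
fn refresh_all(ids: &[u64], threshold: usize) -> Vec<u64> {
    if ids.len() < threshold {
        return ids.iter().map(|&id| refresh_one(id)).collect();
    }
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Ceiling division, clamped to 1 because `chunks(0)` would panic.
    let chunk_len = ((ids.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = ids
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|&id| refresh_one(id)).collect::<Vec<u64>>()
                })
            })
            .collect();
        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}

fn main() {
    let small: Vec<u64> = (0..10).collect();
    let large: Vec<u64> = (0..10_000).collect();
    println!("small: {} results", refresh_all(&small, PARALLEL_THRESHOLD).len());
    println!("large: {} results", refresh_all(&large, PARALLEL_THRESHOLD).len());
}
```

Both paths return the same results in the same order, so a caller would not see any difference other than timing.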

Vortetty commented 1 month ago

Total counts of all calls made are:

edits: missed a few calls

Vortetty commented 1 month ago

One thing I did notice while testing last night: on my weaker laptop the time difference is a lot less drastic. The beefier the system, the more drastic the time difference is. Perhaps the thread creation overhead is longer than the syscalls take in total, unless doing full refreshes on slow systems.

GuillaumeGomez commented 1 month ago

The only API here using multi-threading is System::new_with_specifics(RefreshKind::new().with_processes(ProcessRefreshKind::everything()));. One question about your code: are you creating System more than once?

Vortetty commented 1 month ago

I am not. It's only instantiated once in the whole program, and only refreshed on creation.

GuillaumeGomez commented 1 month ago

I'm starting to be out of ideas. :sweat_smile:

Is the total runtime of your program faster or slower with multithreading enabled? It's normal that it uses more resources and system time but overall, it should still run faster (hopefully).

Vortetty commented 1 month ago

It's slower overall with multithreading (as seen in the benchmarks above). Disabling it saves about 150ms of program runtime on my main rig; I can test on a secondary, more average PC in a bit to make sure that holds up. In the benchmarks above you can also see that with multithreading on there's about 20ms extra spent in user mode (likely from the overhead of rayon) as well as about 110ms extra spent in system calls (likely threads, waiting for their creation and destruction). In longer-running programs it would definitely be less noticeable, especially if you made the calls very often, since rayon uses work stealing to prevent re-initializing things every time. But that would need a very specific call order and calls that it can perform work stealing on effectively, which it may not be able to do with simple system calls.
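One way to check the thread creation guess in isolation is to time the same trivial work inline and with one OS thread per task. The sketch below uses only the standard library, not rayon or sysinfo; `tiny_task`, `run_inline`, and `run_threaded` are hypothetical names, and the numbers it prints will vary by machine, so treat it as a rough probe rather than a benchmark.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Cheap stand-in for one small unit of work.
fn tiny_task(n: u64) -> u64 {
    n + 1
}

/// Run every task on the calling thread and report the elapsed time.
fn run_inline(tasks: u64) -> (u64, Duration) {
    let start = Instant::now();
    let sum: u64 = (0..tasks).map(tiny_task).sum();
    (sum, start.elapsed())
}

/// Spawn one OS thread per task, join them all, and report the elapsed time.
/// The measured time includes thread creation and teardown.
fn run_threaded(tasks: u64) -> (u64, Duration) {
    let start = Instant::now();
    let handles: Vec<_> = (0..tasks)
        .map(|n| thread::spawn(move || tiny_task(n)))
        .collect();
    let sum: u64 = handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .sum();
    (sum, start.elapsed())
}

fn main() {
    let tasks = 32;
    let (inline_sum, inline_time) = run_inline(tasks);
    let (threaded_sum, threaded_time) = run_threaded(tasks);
    assert_eq!(inline_sum, threaded_sum);
    println!("inline:   {:?}", inline_time);
    println!("threaded: {:?}", threaded_time);
}
```

If the threaded time is much larger than the inline time for work this small, that would be consistent with startup cost dominating in a short-lived program, although it says nothing about how rayon's pool behaves specifically.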

Vortetty commented 1 month ago

Running the same test on an HP laptop with a Ryzen 5300U and Crystal Linux, it's 173.7ms with threading and 69.6ms without (both averaged over 100 runs). So the threading slowdown can be reproduced even on low-power systems.

Vortetty commented 1 month ago

Windows with multithread: 154.9ms
Windows without multithread: 162.8ms

Seems to be a Linux-specific issue. All that is needed for the test is:

[package]
name = "multithreadtest"
version = "0.1.0"
edition = "2021"

[dependencies]
sysinfo = { version = "0.31.4", default-features = false, features = ["component", "disk", "network", "system", "user"] }

use sysinfo::{ProcessRefreshKind, RefreshKind, System};

fn main() {
    let sys = System::new_with_specifics(RefreshKind::new().with_processes(ProcessRefreshKind::everything()));
    println!("{:?}", sys);
}

On Android through Termux it doesn't seem to matter which is used, though that may change if used in an app. Can anyone with a Mac test this as well?