VirusTotal / yara-x

A rewrite of YARA in Rust.
https://virustotal.github.io/yara-x/
BSD 3-Clause "New" or "Revised" License
565 stars, 46 forks

Implement CPU throttling for the CLI #115

Open Meshka1337 opened 1 month ago

Meshka1337 commented 1 month ago

Please fix the high CPU utilization. [image]

plusvic commented 1 month ago

Please provide more context about the issue. High CPU utilization is expected, and depends a lot on the number and characteristics of the rules being used. Is the CPU usage a lot higher with yara-x when compared to yara with the same rules?

If so, can you provide the rules that you are using?

Meshka1337 commented 1 month ago

Dear Plusvic,

It appears that the utilization of the old YARA is similar to that of YARA-X. When we initially heard about the transition to YARA-X, we were hoping for additional functionality to control CPU usage, similar to what is implemented in Thor Scanner. The high CPU utilization is causing server crashes and resulting in downtime. We believe that implementing a CPU cap is not overly complex and would greatly help mitigate this issue. By limiting the CPU usage, we can prevent server crashes and ensure smooth operations.

We acknowledge the influence of YARA rule complexity on CPU usage. However, we believe implementing a CPU usage limit would be a valuable addition to mitigate potential issues.

We kindly request your attention to this matter and appreciate your efforts in resolving the high CPU utilization problem.

plusvic commented 1 month ago

Can you provide more details about what that feature would look like to you? I'm also interested in knowing how the CPU throttling feature works in Thor Scanner. Any context or additional information about what you are really expecting is welcome. I need to have a mental model of what you are really asking for here.

One thing to note is that you can specify the number of threads used for scanning (both in YARA and YARA-X) with the --threads option. By using one thread you can have one CPU at 100%, but the rest will be available to be used by other programs and the overall CPU usage will be much lower.

Reducing the CPU usage even with a single thread is a bit more complex, but I guess that could be achieved by introducing small delays between each file scan. Is that something like what you want?
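To make the "small delays between each file scan" idea concrete, here is a minimal sketch using only the Rust standard library. It is not YARA-X code: `pause_for`, `scan_throttled`, and the `target_share` parameter are hypothetical names, and the closure stands in for whatever actually scans a file. The assumption is that sleeping for `busy * (1 - share) / share` after each scan keeps one thread's average CPU share near `share`.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: given how long the last scan kept the CPU busy and a
/// target CPU share in (0, 1], return how long to sleep so that
/// busy / (busy + pause) is approximately the target share.
fn pause_for(busy: Duration, target_share: f64) -> Duration {
    if !target_share.is_finite() {
        return Duration::ZERO;
    }
    // Clamp so that the multiplier below stays finite and non-negative.
    let share = target_share.clamp(0.01, 1.0);
    busy.mul_f64((1.0 - share) / share)
}

/// Hypothetical driver: runs `scan` on every item and sleeps after each one.
fn scan_throttled<T>(items: &[T], target_share: f64, mut scan: impl FnMut(&T)) {
    for item in items {
        let start = Instant::now();
        scan(item);
        sleep(pause_for(start.elapsed(), target_share));
    }
}

fn main() {
    let files = ["a.bin", "b.bin", "c.bin"];
    // Aim for roughly 50% of one core; the closure is a stand-in for a real scan.
    scan_throttled(&files, 0.5, |path| {
        println!("scanning {path}");
    });
}
```

This only limits the scanner's own duty cycle. It does not react to load caused by other processes, and a single very large file still runs at 100% until its scan finishes.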

Meshka1337 commented 1 month ago

The Thor Scanner documentation describes it at a high level: "--cpulimit This argument will take an integer (default 95; minimum 15), which represents the maximum CPU load at which THOR will be actively scanning. The value can be seen as percentage of the systems maximum CPU load. The specified value instructs THOR to pause (all scanning), if the load of the systems CPU is higher than the cpulimit. One example would be, if a user is doing something CPU intensive, and THOR is running at the same time, THOR will pause and wait until the CPU load drops below the cpulimit before continuing."

"https://github.com/NextronSystems/thor-manual/blob/master/usage/configuration.rst#cpu-limit---cpulimit"
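The behavior quoted above (pause while system load is above the limit, resume once it drops) could be sketched roughly as follows. This is only an illustration of the described semantics, not THOR's implementation, which is not public in this thread. `wait_below_limit` is a hypothetical name, and the load reading is injected as a closure so the sketch stays platform-independent.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical gate: blocks while `read_load` (system-wide CPU load in
/// percent, 0..=100) reports a value above `limit`, polling every `poll`.
/// Returns how many times it had to wait.
fn wait_below_limit(limit: f32, poll: Duration, mut read_load: impl FnMut() -> f32) -> u32 {
    let mut waits = 0;
    while read_load() > limit {
        waits += 1;
        sleep(poll);
    }
    waits
}

fn main() {
    // Fake load samples standing in for a real OS-level probe.
    let mut samples = vec![97.0_f32, 96.0, 40.0].into_iter();
    let waits = wait_below_limit(95.0, Duration::from_millis(10), || {
        samples.next().unwrap_or(0.0)
    });
    println!("paused for {waits} polls before scanning");
}
```

A scanner would call such a gate before each file (or each chunk of work), so scanning stalls whenever the whole system is busy, regardless of which process causes the load.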

plusvic commented 1 month ago

Is Thor Scanner open source? I was trying to find the source code to see how they implement that feature, but didn't find anything. It doesn't look like a trivial feature that can be implemented without some help from the operating system, and that probably means different code for each platform.
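As an illustration of the platform-specific part, on Linux the system-wide load can be derived from two samples of the aggregate `cpu` line in `/proc/stat`. The sketch below covers Linux only and uses hypothetical helper names (`parse_cpu_line`, `load_percent`); other platforms would need entirely different code. It treats `idle` and `iowait` as idle time and ignores the trailing guest fields, since those are already counted in `user` and `nice`.

```rust
/// Parses the aggregate "cpu" line of /proc/stat into (busy, total) jiffies.
/// Field order: user nice system idle iowait irq softirq steal [guest guest_nice].
fn parse_cpu_line(line: &str) -> Option<(u64, u64)> {
    let mut fields = line.split_whitespace();
    if fields.next()? != "cpu" {
        return None; // per-core lines such as "cpu0" are skipped
    }
    let values = fields
        .take(8) // guest fields are already included in user/nice
        .map(|f| f.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    if values.len() < 4 {
        return None;
    }
    let total: u64 = values.iter().sum();
    let idle = values[3] + values.get(4).copied().unwrap_or(0);
    Some((total - idle, total))
}

/// CPU load in percent between two (busy, total) samples.
fn load_percent(prev: (u64, u64), cur: (u64, u64)) -> f32 {
    let busy = cur.0.saturating_sub(prev.0) as f32;
    let total = cur.1.saturating_sub(prev.1) as f32;
    if total == 0.0 { 0.0 } else { 100.0 * busy / total }
}

fn main() {
    // Two made-up samples; real code would read the first line of /proc/stat twice.
    let a = parse_cpu_line("cpu  100 0 100 800 0 0 0 0 0 0").unwrap();
    let b = parse_cpu_line("cpu  160 0 140 900 0 0 0 0 0 0").unwrap();
    println!("{:.1}%", load_percent(a, b));
}
```

In real use the two samples would come from `std::fs::read_to_string("/proc/stat")` taken some interval apart, which is the kind of per-platform detail a cross-platform crate would hide.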

plusvic commented 1 month ago

I've found two Rust crates (https://docs.rs/sysinfo/latest/sysinfo/struct.Cpu.html, https://docs.rs/systemstat/latest/systemstat/data/struct.CPULoad.html) that could do the heavy lifting, providing CPU load information on multiple platforms.

Meshka1337 commented 1 month ago

Thank you, you are the best.

plusvic commented 1 month ago

Don't close the issue; I need it as a reminder of the things that are still pending.