Closed malvidin closed 4 years ago
Times from running time stoq run
on 100 small files:
# master
real 0m11.193s
user 0m27.326s
sys 0m2.391s
# master with 0.1 second delay:
real 0m40.609s
user 0m22.279s
sys 0m1.989s
# async_yara
real 0m11.295s
user 0m27.463s
sys 0m2.488s
# async_yara with 0.1 second delay:
real 0m12.015s
user 0m26.457s
sys 0m2.412s
Again, thanks @malvidin! Your help is greatly appreciated! I'm seeing about a drastic improvement on performance similar to your test cases above.
I've updated it slightly to leverage AsyncGenerators for efficiency. Please take a look and let me know your thoughts. If you are good with it, I'll go ahead and merge.
I prefer the change to use AsyncGenerators.
I am concerned about the consistency of using executors. Currently in this pull request, it runs in an executor that uses the default thread pool. The hash plugin, which is CPU bound after the Payload is loaded into memory, does not run in a process pool. The exif plugin, which is also CPU bound, is executed with a subprocess.
I do not see an option similar/related to provider_consumers that defines how many processes and threads can be assigned, and I have not checked to see if subprocess execution will impact any thread/process configuration.
Perhaps I am misunderstanding something, I don't quite understand your concern about the consistency of using executors. Admittedly, it is probably better practice to not use the default thread pool, so I can go ahead and update that to leverage concurrent.futures.ThreadPoolExecutor
. Objects aren't modified within the plugin, but rather returned to the core framework to be updated with their response which will maintain consistency of results. Do you mind elaborating on your concern a bit?
Thanks, again @malvidin. I've run a few tests, and it looks like this version is considerably faster. I will be merging it shortly.
Old:
________________________________________________________
Executed in 6.01 secs fish external
usr time 5.62 secs 432.00 micros 5.62 secs
sys time 0.39 secs 52.00 micros 0.39 secs
New:
________________________________________________________
Executed in 3.47 secs fish external
usr time 5.70 secs 383.00 micros 5.70 secs
sys time 0.42 secs 46.00 micros 0.42 secs
The YARA match function is blocking, this pull wraps it in an event loop.
To test it, I used the following to simulate a slow YARA match.
Would it be better prefer to include a decorator in the stoq utils to replicate this behavior in situations where the plugin cannot be converted to support asyncio?