PUNCH-Cyber / stoq

An open source framework for enterprise level automated analysis.
https://stoq.punchcyber.com
Apache License 2.0
394 stars 55 forks source link

Workers do not seem to run in parallel #148

Closed ytreister closed 4 years ago

ytreister commented 4 years ago

Describe the bug Workers need to await a coroutine in order to run in parallel.

To Reproduce I created a demo to illustrate what I am talking about: https://github.com/ytreister/stoq/tree/workers_in_parallel/demo

You can run scan.py which show the following: All workers await a coroutine: (This is what I want it to do)

2020-04-19 19:53:19 DEBUG stoq:core.py:scan_request:599: Beginning worker round 1
2020-04-19 19:53:19 DEBUG stoq:core.py:_execute_scan_round:729: Starting scan of 3 tasks, deferring 0 to future rounds
2020-04-19 19:53:19 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w3
2020-04-19 19:53:19 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w1
2020-04-19 19:53:19 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w2
2020-04-19 19:53:20 DEBUG stoq:core.py:_apply_worker:946: Completed scan of 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w1 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:21 DEBUG stoq:core.py:_apply_worker:946: Completed scan of 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w2 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:22 DEBUG stoq:core.py:_apply_worker:946: Completed scan of 896dda8f-d649-4823-b912-7788e85ae521 with WorkerPlugin w3 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:22 DEBUG stoq:core.py:scan_request:599: Beginning worker round 2
2020-04-19 19:53:22 DEBUG stoq:core.py:scan_request:603: No more plugins to run, completing scan

All workers do not await a coroutine: (This is not how I wanted it to work)

2020-04-19 19:53:22 DEBUG stoq:core.py:scan_request:599: Beginning worker round 1
2020-04-19 19:53:22 DEBUG stoq:core.py:_execute_scan_round:729: Starting scan of 3 tasks, deferring 0 to future rounds
2020-04-19 19:53:22 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w3
2020-04-19 19:53:25 DEBUG stoq:core.py:_apply_worker:946: Completed scan of c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w3 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:25 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w2
2020-04-19 19:53:27 DEBUG stoq:core.py:_apply_worker:946: Completed scan of c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w2 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:27 DEBUG stoq:core.py:_apply_worker:905: Scanning Payload c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w1
2020-04-19 19:53:28 DEBUG stoq:core.py:_apply_worker:946: Completed scan of c9f7008b-1f2f-438a-bbb3-4f934ba3422c with WorkerPlugin w1 0 result keys, 0 additional dispatches, and 0 extracted payloads
2020-04-19 19:53:28 DEBUG stoq:core.py:scan_request:599: Beginning worker round 2

Notice how in the first each worker runs in asynchronously and in the second they run serially.

Expected behavior I think the stoQ framework works as expected, but maybe the documentation needs some more explaining so that users can take full advantage of the async feature. I do not think any of the public stoq plugins are written so that they can be run asyncronously.

Client:

Explanation Once I converted and started running my plugins in stoQ 3.x, I inspected the logs and noticed that the plugins did not seem to run in parallel during a given round. My OPSWAT metadefender plugin was the dead giveaway because it would take up to 1 minute per file.

I created a demo to illustrate what happens when the plugins await a co-routine that takes some time to execute versus call a regular function that takes some time to execute. As I expected, when awaiting a coroutine (in my demo this happens when I pass b'asyncio' as the payload) the worker plugins execute asynchronously. When my worker plugins do not await a co-routine, they basically run in serial.

All of the stoq-plugins-public seem to not await a co-routine. For example opswat.py public plugin should maybe use one of the techniques described here: https://stackoverflow.com/questions/22190403/how-could-i-use-requests-in-asyncio

so that the framework does not have to wait for the scan to complete before other worker plugins begin their scan.

Another example might be a plugin that runs some local command such as exif.py or trid.py. These both call subprocess, so perhaps something from here: https://docs.python.org/3/library/asyncio-subprocess.html

It might not take very long for these function to execute, but I thought one of the main reason to us asyncio is so that we do not have to wait for other unrelated workers to run.

For me, my plugins either

  1. Run a subprocess (i.e exif.py)
  2. Run python by loading a python library (i.e.peinfo.py, hash_ssdeep.py)
  3. Call a RESTful API (i.e. opswat.py)

I think there should be at least guidance for how to setup each of the above types of workers so that they can run asynchronously

mlaferrera commented 4 years ago

You are correct, very few plugins have asyncio support right now. This is a work in progress, and I am more than happy to accept PRs to help with that, along with documentation improvements.

ytreister commented 4 years ago

I think at the minimum the documentation should explain this for WorkerPlugins and it should maybe make it clear for the section where it talks about converting worker plugins from v2-> v3.

ytreister commented 4 years ago

I created two PRs exif and opswat...