cuckoosandbox / cuckoo

Cuckoo Sandbox is an automated dynamic malware analysis system
http://www.cuckoosandbox.org

Cuckoo distributed not working #1820

Open ghost opened 7 years ago

ghost commented 7 years ago

Hi,

I have issues running Cuckoo in distributed mode. I already went through the documentation [1] and have been debugging this for two days. I need help :(

I am running a "master" node for storage and web services and a "slave" node running cuckoo daemon and my analysis VMs. My master node is not submitting any tasks to my slave node when running distributed.

My steps:

  1. Created a new postgresql db for distributed services on my master node.
  2. Executed "cuckoo distributed server" on my master node.
  3. Submitted the REST API of my master node and my slave node to the distributed API:

    master node: curl http://10.0.0.1:9000/api/node -F name=cuckoo1 -F url=http://10.0.0.1:8090/

    slave node: curl http://10.0.0.1:9000/api/node -F name=cuckoo2 -F url=http://10.0.0.2:8090/
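
The two registration calls above can also be scripted. This is only a minimal sketch: it assumes the distributed API accepts URL-encoded form fields for `name` and `url` the same way it accepts the multipart form that `curl -F` sends, which is not verified here. `build_node_request` and `register_node` are hypothetical helper names, not part of Cuckoo.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_node_request(dist_api, name, node_url):
    """Build (but do not send) a POST to <dist_api>/api/node.

    Assumption: the endpoint reads `name` and `url` from form data,
    as the curl -F calls in the post suggest.
    """
    body = urlencode({"name": name, "url": node_url}).encode("ascii")
    endpoint = dist_api.rstrip("/") + "/api/node"
    return Request(endpoint, data=body, method="POST")


def register_node(dist_api, name, node_url, timeout=10):
    """Send the registration request and return the HTTP status code."""
    request = build_node_request(dist_api, name, node_url)
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

With the addresses from the post, `register_node("http://10.0.0.1:9000", "cuckoo2", "http://10.0.0.2:8090/")` would correspond to the second curl call.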

When I submit files to the REST API of my master node, I receive a task ID, but nothing happens. When I submit files to the distributed API of my master node, I also receive a task ID, and again nothing happens. The servers are able to reach each other, so there are no networking issues.

Are there any further steps necessary to make things happen? Do I also need to start cuckoo distributed on my slave? What is the command "cuckoo distributed instance" for? It is undocumented. Thank you very much in advance.

ghost commented 7 years ago

I think I managed to get it working by starting the distributed worker via supervisord. When I submit my file to the distributed REST API (cuckoo distributed server), it now gets forwarded to my cuckoo node where the cuckoo daemon is running. Is this the right workflow though or do I need to submit my file to the "normal" cuckoo REST API of my master node?

Previously I used cuckoo-modified (https://github.com/spender-sandbox/cuckoo-modified). There I had to submit my files to the cuckoo REST API. Analysis reports would appear in the web frontend once the analysis had finished and been fetched by cuckoo distributed. Does Cuckoo Sandbox's web frontend also support analysis reports fetched by distributed cuckoo?

Thank you for any hints.

daanfs commented 7 years ago

Running the worker via supervisord and submitting your samples to the distributed API is indeed the right workflow.

The instance command is no longer needed.

doomedraven commented 7 years ago

You can easily backport my dist.py, which fetches the results and stores them in MongoDB, to v2: https://github.com/doomedraven/cuckoo-modified/blob/master/utils/dist.py https://github.com/doomedraven/cuckoo-modified/blob/master/docs/book/src/usage/dist.rst

BUGSHIDO commented 6 years ago

@Ciriio I'm having the same problem as you. The thing is that supervisord is trying to start a cuckoo daemon, but since I'm using that machine only for storage and management, I don't have one set up.

Did you go through any special setup?

najashark commented 6 years ago

@Ciriio I had the same initial problem as you. I followed your steps above exactly, and all my tasks are pending.

I tried to run it using supervisord as you mention, but how do you do that? When I try to start cuckoo distributed using `supervisorctl start distributed` from $CWD, I get this error: unix:///home/cuckoo/.cuckoo/supervisord/unix.sock no such file

When I tried to create the unix.sock file manually, based on the discussion here, I got: unix:///home/cuckoo/.cuckoo/supervisord/unix.sock refused connection
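
For context on the two errors above: as far as I know, supervisord creates this socket itself when it starts, so a missing file usually means supervisord is not running, and a manually created regular file cannot accept connections. A small sketch to tell these cases apart follows; `probe_unix_socket` is a hypothetical helper, and the path you pass in is whatever your supervisord.conf points at.

```python
import os
import socket
import stat


def probe_unix_socket(path):
    """Return 'missing', 'not a socket', 'refused' or 'ok' for a socket path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"  # nothing has created the socket yet
    if not stat.S_ISSOCK(mode):
        return "not a socket"  # e.g. a file created by hand
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except ConnectionRefusedError:
        return "refused"  # socket file exists but no process is listening
    finally:
        client.close()
    return "ok"
```

Calling `probe_unix_socket("/home/cuckoo/.cuckoo/supervisord/unix.sock")` on the machine in question shows which of the cases applies.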

reox commented 6 years ago

I have a similar problem here: I set up two machines. Both run cuckoo and the REST API. I start the API via uwsgi and nginx and cuckoo via a systemd unit. I can confirm that the API is up and running on both machines. Both workers are added to the API, but when I call `curl localhost:9003/api/status` I get this:

{
  "dist": {
    "diskspace": {
      "reports": {
        "free": 28842295296, 
        "total": 101876621312, 
        "used": 73034326016
      }, 
      "samples": {
        "free": 28842295296, 
        "total": 101876621312, 
        "used": 73034326016
      }
    }
  }, 
  "nodes": {}, 
  "success": true, 
  "tasks": null, 
  "timestamp": 1530875004
}

Also, when I go to the distributed web interface, both nodes have a red icon. Does this mean that the nodes are not reachable? I tried to refresh the nodes via `curl -XPOST localhost:9003/api/node/localhost/refresh`, but still no nodes show up in the status.

When I now add tasks via the distributed API, they return success for queuing but do not show up on any of the workers. According to the documentation, this should be all that needs to be done. Am I missing something here?
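
To check for the symptom in the status response above (empty `nodes`, `tasks` set to null) from a script, something like the following could work. It is only a sketch: it assumes `nodes` is a JSON object keyed by node name when populated, which the empty response above does not confirm, and `summarize_status` is a hypothetical helper.

```python
import json


def summarize_status(raw):
    """Summarize the JSON body returned by /api/status."""
    status = json.loads(raw)
    nodes = status.get("nodes") or {}  # assumption: a mapping keyed by node name
    tasks = status.get("tasks")
    return {
        "success": bool(status.get("success")),
        "node_names": sorted(nodes),
        "has_task_info": tasks is not None,
    }
```

For the response shown above this yields an empty `node_names` list and `has_task_info` set to False, which matches what the web interface shows: no node is reporting in.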

reox commented 6 years ago

@daanfs you said that the instance command is not required, but it looks like I need to start `cuckoo distributed instance <name_of_node>` for each node I have. If I do this, the node turns green in the interface and also shows up in the status. So do I need to run the instance command for each node on the distributed server machine as well?

But tasks are still not handed to the workers... The task is still pending and no node_id is set.

reox commented 6 years ago

Okay I think I misunderstood the documentation a little bit... The distributed REST API and the distributed worker are actually different things. So you actually have to run 5 (6) different things on the distributed server:

On a client machine, you just need 3 (4) things:

Is this correct? So using the cuckoo.distributed.worker should start the instances automatically? Thanks for clarification!

I drew this ASCII image. If it is correct, someone might want to add this to the documentation. :)

+=========================+          +=========================+
|      Host: cuckoo0      |          |      Host: cuckoo1      |
+-------------------------+          +-------------------------+
| local Databases:        |          | local Databases:        |
|    cuckoo               |          |    cuckoo               |
|    distributed          |          |                         |
+-------------------------+          +-------------------------+
| Services:               |          | Services:               |
|    Cuckoo Daemon        |          |    Cuckoo Daemon        |
|    Cuckoo REST API      |          |    Cuckoo REST API      |
|    Process n times      |          |    Process n times      |
|    Distributed REST API |          |                         |
|    Distributed Worker   |          |                         |
+=========================+          +=========================+

Cuckoo Daemon: `cuckoo`
Cuckoo REST API: `cuckoo api`
Process: `cuckoo process process<n>` (Can be started multiple times)
Distributed REST API: `cuckoo distributed server`
Distributed Worker: `supervisorctl start distributed`

Edit: I have now tested that starting cuckoo.distributed.worker with systemd works fine. The samples are submitted and the reports are fetched.
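
The diagram and legend above can also be written down as data, which makes it easy to see what the distributed server host runs in addition to a plain node. This is just a restatement of the diagram as a sketch; the names `COMMANDS`, `HOSTS`, `commands_for` and `only_on` are made up for illustration.

```python
# Service -> command, copied from the legend above.
COMMANDS = {
    "Cuckoo Daemon": "cuckoo",
    "Cuckoo REST API": "cuckoo api",
    "Process": "cuckoo process process<n>",
    "Distributed REST API": "cuckoo distributed server",
    "Distributed Worker": "supervisorctl start distributed",
}

# Host -> services, copied from the diagram above.
HOSTS = {
    "cuckoo0": [
        "Cuckoo Daemon",
        "Cuckoo REST API",
        "Process",
        "Distributed REST API",
        "Distributed Worker",
    ],
    "cuckoo1": ["Cuckoo Daemon", "Cuckoo REST API", "Process"],
}


def commands_for(host):
    """Return (service, command) pairs for one host."""
    return [(service, COMMANDS[service]) for service in HOSTS[host]]


def only_on(host_a, host_b):
    """Services that host_a runs and host_b does not."""
    return [service for service in HOSTS[host_a] if service not in HOSTS[host_b]]
```

`only_on("cuckoo0", "cuckoo1")` gives the two distributed services, which are the parts that exist only on the distributed server host.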

noobiecoding8 commented 5 years ago

@reox

Regarding the Distributed Worker (/home/cuckoo/cuckoo/bin/python -m cuckoo.distributed.worker): please explain this step, I am unable to start it.

I can't find such an option.

Thanks

reox commented 5 years ago

@noobiecoding8 you need to use the right python executable... /home/cuckoo/cuckoo was an example for the venv I used.
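
One way to find out which Python executable is the "right" one is to ask each candidate whether it can import the `cuckoo` package. A rough sketch, assuming a Python 3 interpreter is available to run the check itself; `can_import` is a hypothetical helper and the candidate paths are whatever exists on your system.

```python
import subprocess
import sys


def can_import(python_exe, module):
    """True if `python_exe -c "import <module>"` exits successfully."""
    try:
        result = subprocess.run(
            [python_exe, "-c", "import " + module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter path does not exist or is not executable.
        return False
    return result.returncode == 0
```

For example, `can_import("/home/cuckoo/cuckoo/bin/python", "cuckoo")` would check the venv path used above, and `can_import(sys.executable, "cuckoo")` checks the interpreter running the script.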

noobiecoding8 commented 5 years ago

> @noobiecoding8 you need to use the right python executable... /home/cuckoo/cuckoo was an example for the venv I used.

Hi @reox, thanks for your kind reply. Everything is working fine: the nodes are registered and their status is active (green), and disk and memory usage are shown. But when I submit files, it only creates task IDs and does not process them; all tasks remain in the pending state. How can I solve this?

    username@user:~/.cuckoo$ supervisorctl -c supervisord.conf
    cuckoo:cuckoo-daemon FATAL Exited too quickly (process log may have details)
    cuckoo:cuckoo-process_0 RUNNING pid 23395, uptime 0:05:22
    cuckoo:cuckoo-process_1 RUNNING pid 23398, uptime 0:05:22
    cuckoo:cuckoo-process_2 RUNNING pid 23397, uptime 0:05:22
    cuckoo:cuckoo-process_3 RUNNING pid 23396, uptime 0:05:22
    distributed STOPPED Not started
    supervisor>

If I start the worker directly like this, I get another error:

    username@user:~/.cuckoo$ supervisorctl start distributed
    distributed: ERROR (spawn error)
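
To pick out the processes that are not running from status output like the above, a small parser is enough. This is a sketch that assumes each status line has the form "name STATE details", as in the output pasted here; `parse_status` and `not_running` are hypothetical helpers.

```python
def parse_status(text):
    """Parse supervisorctl status lines into (name, state, detail) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines, the shell command line and the "supervisor>" prompt:
        # only lines whose second field is an upper-case state are kept.
        if len(parts) < 2 or not parts[1].isupper():
            continue
        detail = parts[2] if len(parts) > 2 else ""
        rows.append((parts[0], parts[1], detail))
    return rows


def not_running(rows):
    """Return (name, state) for every process that is not RUNNING."""
    return [(name, state) for name, state, _ in rows if state != "RUNNING"]
```

Applied to the output above, this flags `cuckoo:cuckoo-daemon` (FATAL) and `distributed` (STOPPED), which are the two entries to look at in the process logs.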

reox commented 5 years ago

@noobiecoding8 I do not use supervisorctl; I use the systemd units from #2025. But it looks like your daemon does not start, so there is no worker to send samples to.

noobiecoding8 commented 5 years ago

@reox I am able to start cuckoo-daemon, but I am unable to start the Distributed Worker. `supervisorctl start distributed` gives the following error: distributed: ERROR (spawn error). Do I have to add some extra files in the CWD? I can't even find the directory /home/cuckoo/cuckoo/bin/

Mrqlxdd commented 5 years ago

@reox Is cuckoo distributed instance necessary?

reox commented 5 years ago

Sorry, I have no clue about supervisorctl; I use systemd now. It works fine, and the unit files are given in another issue, see the link above.