kevoreilly / CAPEv2

Malware Configuration And Payload Extraction
https://capesandbox.com/analysis/

Az.conf missing `machines` key. #2228

Closed ChrisThibodeaux closed 1 month ago

ChrisThibodeaux commented 1 month ago

Expected Behavior

Running the cape service or poetry run python3 cuckoo.py should start CAPE using the Azure machinery.

Current Behavior

Startup fails with a KeyError: 'machines'. The same error is present when running sudo systemctl restart cape or poetry run python3 cuckoo.py as the cape user.

Failure Information (for bugs)

The az.conf file has no key machines. It appears to be the only machinery file without it.
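
The failure can be reproduced outside CAPE with a plain dictionary. This is a minimal sketch, not CAPE's code: `normalize_machines` is a hypothetical stand-in for the check at line 129 of abstracts.py that appears in the failure logs, and the option values are invented for illustration.

```python
def normalize_machines(mmanager_opts: dict) -> dict:
    """Hypothetical stand-in for the check in set_options (abstracts.py)."""
    # Indexing with ["machines"] raises KeyError when the key is absent,
    # which is what happens for a section that has no `machines` entry.
    if not isinstance(mmanager_opts["machines"], list):
        mmanager_opts["machines"] = str(mmanager_opts["machines"]).strip().split(",")
    return mmanager_opts


# A section with a `machines` key is normalised into a list.
print(normalize_machines({"machines": "vm1,vm2"}))

# A section without it (as with az.conf) fails with the same exception type.
try:
    normalize_machines({"scale_sets": "cape-guest-vmss"})
except KeyError as err:
    print(f"KeyError: {err}")
```

Reading the key with `.get("machines")` instead of indexing would avoid the exception, but whether an empty value is acceptable for the Azure machinery is a separate question.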

Steps to Reproduce

Set your machinery in cuckoo.conf to az. Spin CAPE up.

Context

I am deploying via the Azure recommendation in the CAPE docs, with auto-scaling. As a workaround, I have been adding machines = with an empty value, as aws.conf shows for auto-scaling, but with that I have not been able to scale beyond a single instance on the VMSS. If I instead set machines = <my_vmss_name>, the VMSS scales up and then immediately deletes the new instances.

I do not have the VMSS up and running before starting CAPE; I let the machinery instantiate it instead.

What am I missing here? I searched through the repo's issues and double-checked the docs and changelog. I have not seen anyone mention this problem, so it feels like I must be doing something wrong.

Question Answer
Git commit 722aced01eb7fbe976edb3ce372895deb0fb2106
OS version Windows 10

Failure Logs

Jul 14 18:43:04 hostname systemd[1]: Started CAPE.
...
Jul 14 18:43:07 hostname python3[27408]: Traceback (most recent call last):
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/cuckoo.py", line 143, in <module>
Jul 14 18:43:07 hostname python3[27408]:     cuckoo_main(max_analysis_count=args.max_analysis_count)
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/cuckoo.py", line 102, in cuckoo_main
Jul 14 18:43:07 hostname python3[27408]:     sched = Scheduler(max_analysis_count)
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/lib/cuckoo/core/scheduler.py", line 66, in __init__
Jul 14 18:43:07 hostname python3[27408]:     self.machinery_manager = MachineryManager() if categories_need_VM else None
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/lib/cuckoo/core/machinery_manager.py", line 150, in __init__
Jul 14 18:43:07 hostname python3[27408]:     self.machinery: Machinery = self.create_machinery()
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/lib/cuckoo/core/machinery_manager.py", line 204, in create_machinery
Jul 14 18:43:07 hostname python3[27408]:     machinery: Machinery = plugin()
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/lib/cuckoo/common/abstracts.py", line 118, in __init__
Jul 14 18:43:07 hostname python3[27408]:     self.set_options(self.read_config())
Jul 14 18:43:07 hostname python3[27408]:   File "/opt/CAPEv2/lib/cuckoo/common/abstracts.py", line 129, in set_options
Jul 14 18:43:07 hostname python3[27408]:     if not isinstance(mmanager_opts["machines"], list):
Jul 14 18:43:07 hostname python3[27408]: KeyError: 'machines'
Jul 14 18:43:08 hostname systemd[1]: cape.service: Main process exited, code=exited, status=1/FAILURE
Jul 14 18:43:08 hostname systemd[1]: cape.service: Failed with result 'exit-code'.

doomedraven commented 1 month ago

You will have to wait for someone from the community to answer this.

leoiancu21 commented 1 month ago

@ChrisThibodeaux Could you print the Postgres DB content (redact what you need)? This could be similar to something I fixed on my private fork a few weeks ago. If it is the same situation, I think I can help you.

doomedraven commented 1 month ago

Why don't you contribute the fix from your fork upstream, so that it is fixed for everyone?

leoiancu21 commented 1 month ago

I have to fix it in a clean way. I'm not an amazing coder, but once I have fixed it completely I will contribute for sure. Always a pleasure.

doomedraven commented 1 month ago

We can always help to improve the code.

ChrisThibodeaux commented 1 month ago

@leoiancu21 I will have to circle back with that later this evening. Am I correct in assuming that there should not be a machines key in the conf?

leoiancu21 commented 1 month ago

Yep, not in az.conf at least. The machines are added to the DB later by az.py.

ChrisThibodeaux commented 1 month ago

@leoiancu21 Here is the postgres DB printout. I uploaded some files to fill out a few of the tables, but this is an otherwise fresh DB.

Table: machines
(1, 'cape-guest-vmss_0', 'cape-guest-vmss_0', 'x64', '10.2.2.5', 'windows', 'eth1', '/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<img_gallery>/images/<win10_image>', True, datetime.datetime(2024, 7, 15, 18, 42, 6, 989599), 'running', datetime.datetime(2024, 7, 15, 18, 42, 6, 989607), '10.2.2.4', '2042', False)

Table: machines_tags
(1, 1)

Table: tags
(1, 'win10x64')
(2, 'x86')

Table: tasks
(2, '/tmp/cuckoo-tmp/upload_nrxinrc3/8aaf4f7675fcef71e076324c', 'file', '', 200, 1, '', None, 'exe', 'none', '', '', 'windows', False, False, datetime.datetime(2024, 7, 15, 18, 42, 6), datetime.datetime(2024, 7, 15, 18, 42, 6, 617108), None, None, 'pending', None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, False, 2, None, None, None, None, None, None, '', 0, 'false')
(3, '/tmp/cuckoo-tmp/upload_lqdx8txp/d2c8dd75fbcfe40951e5dc19', 'file', '', 200, 1, '', None, 'exe', 'none', '', '', 'windows', False, False, datetime.datetime(2024, 7, 15, 18, 42, 6), datetime.datetime(2024, 7, 15, 18, 42, 6, 630961), None, None, 'pending', None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, False, 3, None, None, None, None, None, None, '', 0, 'false')
(4, '/tmp/cuckoo-tmp/upload_72jsh8tf/3cea42af2b2fa5b2e42516c4', 'file', '', 200, 1, '', None, 'exe', 'none', '', '', 'windows', False, False, datetime.datetime(2024, 7, 15, 18, 42, 6), datetime.datetime(2024, 7, 15, 18, 42, 6, 645081), None, None, 'pending', None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, False, 4, None, None, None, None, None, None, '', 0, 'false')
(5, '/tmp/cuckoo-tmp/upload_iqy2ao7c/fe1d75d4b4de68a9ec944ae1', 'file', '', 200, 1, '', None, 'exe', 'none', '', '', 'windows', False, False, datetime.datetime(2024, 7, 15, 18, 42, 6), datetime.datetime(2024, 7, 15, 18, 42, 6, 658432), None, None, 'pending', None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, False, 5, None, None, None, None, None, None, '', 0, 'false')
(1, '/tmp/cuckoo-tmp/upload_hy8qxfg9/ec6cd559f2966d6db5f4f734', 'file', '', 200, 1, '', 'cape-guest-vmss_0', 'exe', 'none', '', '', 'windows', False, False, datetime.datetime(2024, 7, 15, 18, 42, 6), datetime.datetime(2024, 7, 15, 18, 42, 6, 597768), datetime.datetime(2024, 7, 15, 18, 42, 6, 989458), None, 'running', None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, False, 1, 1, None, None, None, None, None, '', 0, 'false')

Table: tasks_tags
(1, 2)
(1, 1)
(2, 2)
(2, 1)
(3, 2)
(3, 1)
(4, 2)
(4, 1)
(5, 2)
(5, 1)

Table: guests
(1, 'running', 'cape-guest-vmss_0', 'cape-guest-vmss_0', 'windows', 'Azure', datetime.datetime(2024, 7, 15, 18, 42, 6, 993422), None, 1)

Table: errors

And in case it matters, the outputs for machines/machine tags when the VMSS brings up more instances:

Table: machines
(1, 'cape-guest-vmss_0', 'cape-guest-vmss_0', 'x64', '10.2.2.5', 'windows', 'eth1', '/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<img_gallery>/images/<win10_image>', True, datetime.datetime(2024, 7, 15, 18, 42, 6, 989599), 'running', datetime.datetime(2024, 7, 15, 18, 42, 6, 989607), '10.2.2.4', '2042', False)
(4, 'cape-guest-vmss_3', 'cape-guest-vmss_3', 'x64', '10.2.2.8', 'windows', 'eth1', '/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<img_gallery>/images/<win10_image>', True, datetime.datetime(2024, 7, 15, 18, 43, 23, 250052), 'running', datetime.datetime(2024, 7, 15, 18, 43, 23, 250059), '10.2.2.4', '2042', False)
(3, 'cape-guest-vmss_2', 'cape-guest-vmss_2', 'x64', '10.2.2.7', 'windows', 'eth1', '/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<img_gallery>/images/<win10_image>', True, datetime.datetime(2024, 7, 15, 18, 43, 23, 262671), 'running', datetime.datetime(2024, 7, 15, 18, 43, 23, 262678), '10.2.2.4', '2042', False)
(2, 'cape-guest-vmss_1', 'cape-guest-vmss_1', 'x64', '10.2.2.6', 'windows', 'eth1', '/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<img_gallery>/images/<win10_image>', True, datetime.datetime(2024, 7, 15, 18, 43, 23, 279389), 'running', datetime.datetime(2024, 7, 15, 18, 43, 23, 279397), '10.2.2.4', '2042', False)

Table: machines_tags
(1, 1)
(2, 1)
(3, 1)
(4, 1)

Current config in az.conf:

...
scale_set_limit = 5
total_machines_limit = 5
...
scale_sets = cape-guest-vmss
...
machines = cape-guest-vmss

[cape-guest-vmss]
gallery_image_name = <win10_image>
platform = windows
arch = x64
pool_tag = win10x64
initial_pool_size = 1

Is there something in particular that I can look into with the DB? If my print format is not ideal, I can add in the column names before the values (learning as I go).
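
For the column names, one option is to read them from the cursor itself. The following is a rough sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for the Postgres instance, `dump_table` is a hypothetical helper, and the three columns are illustrative rather than CAPE's actual schema. It relies only on `cursor.description`, which is part of the Python DB-API, so the same helper should carry over to a Postgres driver that follows that API.

```python
import sqlite3


def dump_table(conn, table: str) -> list[dict]:
    """Return every row of `table` as a dict keyed by column name."""
    cur = conn.cursor()
    # Table names cannot be bound as query parameters; pass trusted names only.
    cur.execute(f"SELECT * FROM {table}")
    # DB-API cursors expose column metadata; the name is the first field.
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# In-memory stand-in for the real database; the columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE machines (id INTEGER, name TEXT, status TEXT)")
conn.execute("INSERT INTO machines VALUES (1, 'cape-guest-vmss_0', 'running')")

for row in dump_table(conn, "machines"):
    print(row)
```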

ChrisThibodeaux commented 1 month ago

I can also confirm that the instances cape-guest-vmss_1, cape-guest-vmss_2, and cape-guest-vmss_3 are not removed from the DB when Azure destroys them. Those machines and the tasks they were given remain in the running state, and the tasks never time out. Any submissions made after this point stay pending. To get further submissions to run, I have to use the cleaners.py util and restart CAPE.
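
The symptom amounts to a set difference between what the DB records and what the scale set actually contains. The sketch below only illustrates how to spot the orphaned entries; it is not how az.py works. `find_stale_machines` is a hypothetical helper, and the list of live instances is typed in by hand here where a real check would query Azure.

```python
def find_stale_machines(db_names: set[str], live_names: set[str]) -> set[str]:
    """Names recorded in the database that no longer exist in the scale set."""
    return db_names - live_names


# Names as they appear in the machines table earlier in this thread.
db_names = {f"cape-guest-vmss_{i}" for i in range(4)}
# Hypothetical listing of the VMSS after Azure deleted instances 1 to 3.
live_names = {"cape-guest-vmss_0"}

for name in sorted(find_stale_machines(db_names, live_names)):
    print(f"stale: {name}")
```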

leoiancu21 commented 1 month ago

Could you try changing this function in abstracts.py:

    def set_options(self, options: dict) -> None:
        """Set machine manager options.
        @param options: machine manager options dict.
        """
        self.options = options
        mmanager_opts = self.options.get(self.module_name)
        log.debug("SFSG : mmanager_opts :")
        log.debug(mmanager_opts["scale_sets"])
        #if not isinstance(mmanager_opts["machines"], list):
        #    mmanager_opts["machines"] = str(mmanager_opts["machines"]).strip().split(",")
        log.debug("SFSG : isinstance mmanager_opts r133:")
        log.debug(isinstance(mmanager_opts["scale_sets"], str))
        if not isinstance(mmanager_opts["scale_sets"], str):
            mmanager_opts["scale_sets"] = str(mmanager_opts["scale_sets"]).strip().split(",")

Check the logging statements when launching cuckoo.py -d, and clean the DB properly to remove broken relations.

ChrisThibodeaux commented 1 month ago

I added the debug and commented-out lines and removed the machines key from my az.conf. With those changes, here is the output when running with -d:

2024-07-15 22:58:50,299 [root] DEBUG: Checking for locked tasks...
/usr/bin/tcpdump
2024-07-15 22:58:50,331 [lib.cuckoo.common.abstracts] DEBUG: SFSG : mmanager_opts :
2024-07-15 22:58:50,332 [lib.cuckoo.common.abstracts] DEBUG: cape-guest-vmss
2024-07-15 22:58:50,332 [lib.cuckoo.common.abstracts] DEBUG: SFSG : isinstance mmanager_opts r133:
2024-07-15 22:58:50,332 [lib.cuckoo.common.abstracts] DEBUG: True
2024-07-15 22:58:50,332 [lib.cuckoo.core.machinery_manager] INFO: Using MachineryManager[az] with max_machines_count=0
2024-07-15 22:58:50,332 [lib.cuckoo.core.scheduler] INFO: Creating scheduler with max_analysis_count=unlimited
...
2024-07-15 22:59:09,172 [modules.machinery.az] DEBUG: Adding machines to database for cape-guest-vmss.
...
2024-07-15 22:59:10,521 [modules.machinery.az] DEBUG: cape-guest-vmss_0: Initializing...
2024-07-15 22:59:20,533 [modules.machinery.az] DEBUG: cape-guest-vmss_0: Initializing...
2024-07-15 22:59:30,545 [modules.machinery.az] DEBUG: cape-guest-vmss_0: Initializing...
2024-07-15 22:59:40,557 [modules.machinery.az] DEBUG: Machine cape-guest-vmss_0 was created and available in 31s
2024-07-15 22:59:40,569 [lib.cuckoo.core.machinery_manager] INFO: Loaded 1 machine
2024-07-15 22:59:40,596 [lib.cuckoo.core.machinery_manager] INFO: upper limit for ScalingBoundedSemaphore = 5
2024-07-15 22:59:40,599 [lib.cuckoo.core.scheduler] INFO: Waiting for analysis tasks

Edit: Probably obvious, but when I remove the machines key, the submission page on the web GUI breaks. Offending line is:

else:
    # Get VM names for machinery config elements
    vms = [x.strip() for x in str(getattr(Config(machinery), machinery).get("machines")).split(",") if x.strip()]

leoiancu21 commented 1 month ago

Sure, I forgot about that. Use scale_sets instead of machines here too:

machinery).get("scale_sets")).split(",")

Since you have scale sets, they will be treated differently from normal machines. I will work on something that detects the az machinery and switches to this logic automatically, so that I don't break anything else.

Tell me if this works for you
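
One way the automatic switch could look is a lookup that prefers `machines` and falls back to `scale_sets`. This is only a sketch of the idea: `vm_names` is a hypothetical helper that takes a plain dict standing in for the machinery config section, and it is not the change that was tested in this thread or the one in the pull request.

```python
def vm_names(section: dict) -> list[str]:
    """Read VM names from `machines`, falling back to `scale_sets`."""
    # An absent or empty `machines` value falls through to `scale_sets`.
    raw = section.get("machines") or section.get("scale_sets") or ""
    return [name.strip() for name in str(raw).split(",") if name.strip()]


# Conventional machinery section with a static machine list.
print(vm_names({"machines": "vm1, vm2"}))
# Azure-style section that only defines scale sets.
print(vm_names({"scale_sets": "cape-guest-vmss"}))
```

Using `or ""` before the `str()` call matters here: converting a missing value directly would produce the literal string "None" as a machine name.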

ChrisThibodeaux commented 1 month ago

@leoiancu21 Working for me. I can now remove the machines key in the az.conf. Thank you very much for the help on that, it was driving me nuts!

leoiancu21 commented 1 month ago

Always a pleasure. I will work on a fix and push it as soon as I can. I still have to fix a module that is not working, so I'll be a bit busy these days.

leoiancu21 commented 1 month ago

@ChrisThibodeaux If this issue is now fixed, could you close it? We already have the pull request with the edited az.py to work with, so we can discuss and fix things there.