cuckoosandbox / cuckoo

Cuckoo Sandbox is an automated dynamic malware analysis system
http://www.cuckoosandbox.org

String extraction does not work on Windows host #2425

Open in2etv opened 6 years ago

in2etv commented 6 years ago

My issue is:

String extraction does not work on a Windows host, and no errors are shown in the log.

I copied the processing routine from processing/strings.py into a small test script:

import re
import sys

MAX_FILESIZE = 16 * 1024 * 1024

def extractString(sFilePath):
    data = open(sFilePath, 'r').read(MAX_FILESIZE)

    strings = re.findall('[\x1f-\x7e]{6,}', data)
    for s in re.findall('(?:[\x1f-\x7e][\x00]){6,}', data):
        strings.append(s.decode('utf-16le'))
    return strings

if __name__ == '__main__':
    print('target : {0}'.format(sys.argv[1]))
    print(len(extractString(sys.argv[1])))

Windows :

Linux :

If other people have not experienced this problem, I think it is probably a code page issue tied to the Windows language settings. (I use the Korean version of Windows, code page cp949.)
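
One way to narrow this down is to check whether the sample contains a 0x1A (Ctrl-Z) byte near its start. This is only a hedged diagnostic sketch, not something confirmed in this thread: on Windows, Python 2 opens files with mode 'r' in the C runtime's text mode, which translates line endings and treats 0x1A as end of file, so a text-mode read may stop early regardless of the code page. The function names below are hypothetical helpers, not part of Cuckoo.

```python
MAX_FILESIZE = 16 * 1024 * 1024


def first_ctrl_z_offset(data):
    """Return the offset of the first 0x1A byte in data, or -1 if there is none."""
    return data.find(b"\x1a")


def first_ctrl_z_offset_in_file(path, limit=MAX_FILESIZE):
    """Read the file in binary mode (identical on every platform) and scan it."""
    with open(path, "rb") as f:
        return first_ctrl_z_offset(f.read(limit))
```

If the reported offset is small compared with the file size, a text-mode read on the Windows host would likely see only that leading part of the sample, which would explain a much lower string count than on Linux.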

My Cuckoo version and operating system are:

Cuckoo Version : 2.0.6.2
Python Version : 2.7.15
Host : Windows 10 Pro (build 17314)
Guest : Ubuntu 17.04 Desktop
Machinery : VirtualBox 5.2.16

This can be reproduced by:
The log, error, files etc can be found at:
2018-08-04 20:59:26,844 [cuckoo.core.scheduler] INFO: Task #3: acquired machine cuckoo1 (label=cuckoo1)
2018-08-04 20:59:26,849 [cuckoo.auxiliary.sniffer] INFO: Started sniffer with PID 11964 (interface=\Device\NPF_{8F76A6F3-3C80-48B2-9514-2C1B3067D79B}, host=192.168.56.101)
2018-08-04 20:59:26,851 [cuckoo.core.plugins] DEBUG: Started auxiliary module: Sniffer
2018-08-04 20:59:26,901 [cuckoo.machinery.virtualbox] DEBUG: Starting vm cuckoo1
2018-08-04 20:59:27,391 [cuckoo.machinery.virtualbox] DEBUG: Restoring virtual machine cuckoo1 to snapshot1
2018-08-04 20:59:37,255 [cuckoo.core.guest] INFO: Starting analysis on guest (id=cuckoo1, ip=192.168.56.101)
2018-08-04 20:59:38,259 [cuckoo.core.guest] DEBUG: cuckoo1: not ready yet
2018-08-04 20:59:39,265 [cuckoo.core.guest] DEBUG: cuckoo1: not ready yet
2018-08-04 20:59:40,269 [cuckoo.core.guest] DEBUG: cuckoo1: not ready yet
2018-08-04 20:59:40,326 [cuckoo.core.guest] INFO: Guest is running Cuckoo Agent 0.8 (id=cuckoo1, ip=192.168.56.101)
2018-08-04 20:59:40,345 [cuckoo.core.guest] DEBUG: Uploading analyzer to guest (id=cuckoo1, ip=192.168.56.101, monitor=latest, size=30538)
2018-08-04 20:59:40,421 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:40,546 [cuckoo.core.resultserver] DEBUG: LogHandler for live analysis.log initialized.
2018-08-04 20:59:41,428 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:42,436 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:43,446 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:44,453 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:45,463 [cuckoo.core.guest] DEBUG: cuckoo1: analysis still processing
2018-08-04 20:59:45,713 [cuckoo.core.resultserver] DEBUG: File upload request for logs/all.stap
2018-08-04 20:59:45,720 [cuckoo.core.resultserver] DEBUG: Uploaded file length: 8441
2018-08-04 20:59:46,470 [cuckoo.core.guest] INFO: cuckoo1: analysis completed successfully
2018-08-04 20:59:46,499 [cuckoo.core.plugins] DEBUG: Stopped auxiliary module: Sniffer
2018-08-04 20:59:46,500 [cuckoo.machinery.virtualbox] DEBUG: Stopping vm cuckoo1
2018-08-04 20:59:47,724 [cuckoo.core.scheduler] DEBUG: Released database task #3
2018-08-04 20:59:47,767 [cuckoo.core.plugins] DEBUG: Executed processing module "AnalysisInfo" for task #3
2018-08-04 20:59:47,779 [cuckoo.core.plugins] DEBUG: Executed processing module "BehaviorAnalysis" for task #3
2018-08-04 20:59:47,780 [cuckoo.core.plugins] DEBUG: Executed processing module "Dropped" for task #3
2018-08-04 20:59:47,782 [cuckoo.core.plugins] DEBUG: Executed processing module "DroppedBuffer" for task #3
2018-08-04 20:59:47,793 [cuckoo.core.plugins] DEBUG: Executed processing module "MetaInfo" for task #3
2018-08-04 20:59:47,796 [cuckoo.core.plugins] DEBUG: Executed processing module "ProcessMemory" for task #3
2018-08-04 20:59:47,796 [cuckoo.core.plugins] DEBUG: Executed processing module "Procmon" for task #3
2018-08-04 20:59:47,798 [cuckoo.core.plugins] DEBUG: Executed processing module "Screenshots" for task #3
2018-08-04 20:59:47,812 [cuckoo.core.plugins] DEBUG: Executed processing module "Static" for task #3
2018-08-04 20:59:47,819 [cuckoo.core.plugins] DEBUG: Executed processing module "Strings" for task #3
2018-08-04 20:59:47,838 [cuckoo.core.plugins] DEBUG: Executed processing module "TargetInfo" for task #3
2018-08-04 20:59:47,862 [cuckoo.core.plugins] DEBUG: Executed processing module "NetworkAnalysis" for task #3
2018-08-04 20:59:47,865 [cuckoo.core.plugins] DEBUG: Executed processing module "Extracted" for task #3
2018-08-04 20:59:47,865 [cuckoo.core.plugins] DEBUG: Executed processing module "TLSMasterSecrets" for task #3
2018-08-04 20:59:47,871 [cuckoo.core.plugins] DEBUG: Executed processing module "Debug" for task #3
2018-08-04 20:59:47,871 [cuckoo.core.plugins] DEBUG: Running 0 signatures
2018-08-04 20:59:47,895 [cuckoo.core.plugins] DEBUG: Executed reporting module "JsonDump"

in2etv commented 6 years ago

I modified strings.py slightly, and it now works.

I opened the file in binary mode and wrote the regex patterns as bytes literals. With this approach it works on both Python 2 and Python 3. (Tested on both Windows and Linux.)
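
As a standalone illustration of that approach, here is a minimal sketch of the earlier test script with the same two changes applied (binary read, bytes patterns). The helper names `extract_strings_from_bytes` and `extract_strings` and the constants `ASCII_RE` and `UTF16_RE` are made up for this example and do not exist in Cuckoo.

```python
import re
import sys

MAX_FILESIZE = 16 * 1024 * 1024

# Bytes patterns: runs of 6+ bytes in 0x1f-0x7e, either plain or each followed by a NUL.
ASCII_RE = re.compile(b"[\x1f-\x7e]{6,}")
UTF16_RE = re.compile(b"(?:[\x1f-\x7e][\x00]){6,}")


def extract_strings_from_bytes(data):
    """Return printable strings found in a bytes buffer, decoded to text."""
    strings = [s.decode("utf-8") for s in ASCII_RE.findall(data)]
    strings.extend(s.decode("utf-16le") for s in UTF16_RE.findall(data))
    return strings


def extract_strings(path):
    """Read at most MAX_FILESIZE bytes in binary mode and extract strings."""
    with open(path, "rb") as f:
        return extract_strings_from_bytes(f.read(MAX_FILESIZE))


if __name__ == "__main__" and len(sys.argv) > 1:
    print("target : {0}".format(sys.argv[1]))
    print(len(extract_strings(sys.argv[1])))
```

Splitting out the bytes-only helper makes it possible to check the patterns against a known buffer without touching the file system.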

Here is the modified strings.py :

# Copyright (C) 2012-2013 Claudio Guarnieri.
# Copyright (C) 2014-2017 Cuckoo Foundation.
# This file is part of Cuckoo Sandbox - http://www.cuckoosandbox.org
# See the file 'docs/LICENSE' for copying permission.

import os.path
import re

from cuckoo.common.abstracts import Processing
from cuckoo.common.exceptions import CuckooProcessingError

class Strings(Processing):
    """Extract strings from analyzed file."""
    MAX_FILESIZE = 16*1024*1024
    MAX_STRINGCNT = 2048
    MAX_STRINGLEN = 1024

    def run(self):
        """Run extract of printable strings.
        @return: list of printable strings.
        """
        self.key = "strings"
        strings = []

        if self.task["category"] == "file":
            if not os.path.exists(self.file_path):
                raise CuckooProcessingError(
                    "Sample file doesn't exist: \"%s\"" % self.file_path
                )

            try:
                # Modified: read the sample in binary mode and close the file afterwards.
                with open(self.file_path, "rb") as f:
                    data = f.read(self.MAX_FILESIZE)
            except (IOError, OSError) as e:
                raise CuckooProcessingError("Error opening file %s" % e)

            # Modified: match with bytes patterns, then decode each hit to text.
            for s in re.findall(b"[\x1f-\x7e]{6,}", data):
                strings.append(s.decode("utf-8"))
            for s in re.findall(b"(?:[\x1f-\x7e][\x00]){6,}", data):
                strings.append(s.decode("utf-16le"))

        # Now limit the amount & length of the strings.
        strings = strings[:self.MAX_STRINGCNT]
        for idx, s in enumerate(strings):
            strings[idx] = s[:self.MAX_STRINGLEN]

        return strings

doomedraven commented 6 years ago

Create a pull request with the update, then ;)