ocsf / ocsf-schema

OCSF Schema
Apache License 2.0

Activity class required to represent script execution events #1156

Open davemcatcisco opened 3 months ago

davemcatcisco commented 3 months ago

TL;DR - The schema needs a new activity class to represent script execution events.

Most Windows EDR products provide visibility into the execution of PowerShell, Python, VBScript, JavaScript, Office macros, etc. Windows 10 and later provide an API that security products can use to obtain synchronous notification when content is executed by a supported scripting engine. In the case of PowerShell, asynchronous notification via ETW is also supported. I'm not a macOS guy, but my understanding is that EDRs on that platform also provide telemetry on the execution of shell scripts, Python, Node, etc.

In the interests of clarity, I'd like to head off possible confusion that may arise from a narrower understanding of the term "script execution". That narrower understanding is based on the plain vanilla case where an interpreter (powershell, python, bash, wscript, cmd, etc.) starts, executes a script in a file, and then exits. Here script execution aligns with the lifetime of the interpreter process, and one might therefore think that script execution could be represented by adding attributes to the Process Activity class. However, script execution in general doesn't align with process lifetime.

In much the same way that a process might create thousands of files, modify thousands of registry keys, or make thousands of network connections during its lifetime, so might a process execute thousands of scripts. So just as we have unique activity classes to represent all of these other things that a process can do, so should we have an activity class for when it executes a script.

Let me give a few examples to clarify what I mean:

All of these script execution cases are potentially observed by an EDR, and should therefore be representable by an activity event.

I propose to address this issue in a forthcoming PR which will add a Script Activity class.

jonrau-at-queryai commented 3 months ago

Would you also be adding a script object with this?

I definitely support this idea. The Microsoft DeviceEvents hunting table and the Sentinel/Defender XDR table in Log Analytics capture this sort of data with ActionType = ScriptContent. Similar data can be gleaned from CrowdStrike Falcon Data Replicator and LogScale, as well as DeepVisibility (or whatever SentinelOne calls it now) in S1 Singularity.

And for some color for others, here's an example of that from a Windows honeypot running on AWS:

[
  {
    "TenantId": "273b5c16----2997e05fc9fb",
    "AccountDomain": "",
    "AccountName": "",
    "AccountSid": "",
    "ActionType": "ScriptContent",
    "AdditionalFields": {
      "ScriptContent": "# sudo python3 open_files.py --ScriptName open_files.py --id log4j_handlersV2 --filter-env LOG4J_FORMAT_MSG_NO_LOOKUPS=true --filter-name \"log4j,LOG4J,spring-core\" --filter-command \"java,javaw\" --manifest-path \"META-INF/maven/org.apache.logging.log4j/log4j-core/pom.properties\" --marker-path /var/opt/microsoft/mdatp/wdavedr/log4jMitigationApplied --collect-dirlist /log4j/core/lookup/JndiLookup.class,log4j-,spring-core-\n# sudo python2 open_files.py --ScriptName open_files.py --id log4j_handlersV2 --filter-env LOG4J_FORMAT_MSG_NO_LOOKUPS=true --filter-name \"log4j,LOG4J,spring-core\" --filter-command \"java,javaw\" --manifest-path \"META-INF/maven/org.apache.logging.log4j/log4j-core/pom.properties\" --marker-path /var/opt/microsoft/mdatp/wdavedr/log4jMitigationApplied --collect-dirlist /log4j/core/lookup/JndiLookup.class,log4j-,spring-core-\n# sudo rm /opt/microsoft/mdatp/resources/cache/log4j_handlersV2.json \n\nfrom genericpath import isdir\nimport os\nimport re\nimport sys\nimport json\nfrom datetime import datetime as dt\nimport zipfile\nimport string\nimport argparse\nimport traceback\nimport functools\nimport itertools\nimport subprocess as sb\n\nMAX_FILE_SIZE = 1024 * 1024  # 1MB\nMANIFEST_OLD_PATH = \"META-INF/MANIFEST.MF\"\n\ndef take(n, l):\n    for i, item in enumerate(l):\n        if i > n:\n            break\n        yield item\n\nclass Jar:\n    def __init__(self, path):\n        self.path = path\n        self._manifest = {}\n        self._dirlist = []\n\n    def _parse_manifest(self, lines):\n        version_indication = \"version=\"\n        version_lines = [line for line in lines if line.startswith(version_indication)]\n\n        if len(version_lines) > 0:\n            version = version_lines[0][len(version_indication):]\n            yield 'Version', version.strip()\n\n        field_names = ['Specification-Version', 'Specification-Title', 'Specification-Vendor', 'Implementation-Version', 'Implementation-Title', 
'Implementation-Vendor']\n        for line in lines:\n            if any(line.startswith(field_name) for field_name in field_names):\n                    key, value = line.split(':')\n                    yield key.strip(), value.strip()\n\n    def _open(self):\n        if not zipfile.is_zipfile(self.path):\n            raise ValueError(\"path is not a zip file: {}\".format(self.path))\n        return zipfile.ZipFile(self.path)\n\n    def _read_dirlist(self):\n        with self._open() as zf:\n            filenames = dict(p for p in zf.namelist())\n            return [f for f in filenames if any(r.search(f.lower()) for r in args.dirlist)]\n\n\n\n    def _get_manifest_path(self, zf):\n        for path in [args.manifest_path, MANIFEST_OLD_PATH]:\n            if path in zf.namelist():\n                return path\n\n    def _read_manifest(self, throw_on_error=False):\n        try:\n            with self._open() as zf:\n                manifest_path = self._get_manifest_path(zf)\n                if not manifest_path:\n                    # Not found manifest file\n                    return {}\n\n                manifest_info = zf.getinfo(manifest_path)\n                if manifest_info.file_size > MAX_FILE_SIZE:\n                    raise IOError(\"manifest file is too big\")\n\n                with zf.open(manifest_path) as f:\n                    readline_f = functools.partial(f.readline, MAX_FILE_SIZE)\n                    manifest_lines = list(x.decode().strip() for x in iter(readline_f, b''))\n                    manifest = self._parse_manifest(manifest_lines)\n                    return dict((k, v) for k, v in manifest\n                            if not args.manifest_keys or any(m.search(k.lower()) for m in args.manifest_keys))\n        except:\n            sys.stderr.write(\"error while reading manifest of '{}': {}\\n\".format(self.path, traceback.format_exc()))\n\n            if throw_on_error:\n                raise\n\n            return {}\n\n    def 
manifest(self, throw_on_error=False):\n        if not self._manifest:\n            self._manifest = self._read_manifest(throw_on_error)\n        return self._"
    },
    "AppGuardContainerId": "",
    "DeviceId": "aaaaa",
    "DeviceName": "ip-172-31-5-30.us-east-2.compute.internal",
    "FileName": "",
    "FileOriginIP": "",
    "FileOriginUrl": "",
    "FolderPath": "",
    "InitiatingProcessAccountDomain": "",
    "InitiatingProcessAccountName": "",
    "InitiatingProcessAccountObjectId": "",
    "InitiatingProcessAccountSid": "",
    "InitiatingProcessAccountUpn": "",
    "InitiatingProcessCommandLine": "",
    "InitiatingProcessFileName": "",
    "InitiatingProcessFolderPath": "",
    "InitiatingProcessId": 629100,
    "InitiatingProcessLogonId": 0,
    "InitiatingProcessMD5": "",
    "InitiatingProcessParentFileName": "",
    "InitiatingProcessParentId": 0,
    "InitiatingProcessSHA1": "",
    "InitiatingProcessSHA256": "",
    "LocalIP": "",
    "LocalPort": "",
    "LogonId": "",
    "MD5": "",
    "MachineGroup": "",
    "ProcessCommandLine": "",
    "ProcessId": "",
    "ProcessTokenElevation": "",
    "RegistryKey": "",
    "RegistryValueData": "",
    "RegistryValueName": "",
    "RemoteDeviceName": "",
    "RemoteIP": "",
    "RemotePort": "",
    "RemoteUrl": "",
    "ReportId": 828379,
    "SHA1": "",
    "SHA256": "df4742a00d9f68ad9e665357ef9bb5a8c37cc21975368270cceb7fbd6dc27ec5",
    "Timestamp [UTC]": "8/13/2024, 9:08:12.257 PM",
    "TimeGenerated [UTC]": "8/13/2024, 9:08:12.257 PM",
    "FileSize": "",
    "InitiatingProcessCreationTime [UTC]": "8/13/2024, 9:08:12.204 PM",
    "InitiatingProcessFileSize": "",
    "InitiatingProcessParentCreationTime [UTC]": "",
    "InitiatingProcessVersionInfoCompanyName": "",
    "InitiatingProcessVersionInfoFileDescription": "",
    "InitiatingProcessVersionInfoInternalFileName": "",
    "InitiatingProcessVersionInfoOriginalFileName": "",
    "InitiatingProcessVersionInfoProductName": "",
    "InitiatingProcessVersionInfoProductVersion": "",
    "ProcessCreationTime [UTC]": "",
    "SourceSystem": "",
    "Type": "DeviceEvents"
  },
  {
    "TenantId": "273b5c16----2997e05fc9fb",
    "AccountDomain": "",
    "AccountName": "",
    "AccountSid": "",
    "ActionType": "ScriptContent",
    "AdditionalFields": {
      "ScriptContent": "#!/usr/bin/bash\n#\n# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\"). You may\n# not use this file except in compliance with the License. A copy of the\n# License is located at\n#\n#      http://aws.amazon.com/apache2.0/\n#\n# or in the \"license\" file accompanying this file. This file is distributed\n# on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either\n# express or implied. See the License for the specific language governing\n# permissions and limitations under the License.\n\nset -eo pipefail -o noclobber -o nounset\n\ndeclare -r runtimeroot=\"/run/amazon-ec2-net-utils\"\ndeclare -r lockdir=\"${runtimeroot}/setup-policy-routes\"\ndeclare -r unitdir=\"/run/systemd/network\"\ndeclare -r reload_flag=\"${runtimeroot}/.policy-routes-reload-networkd\"\n\nlibdir=${LIBDIR_OVERRIDE:-/usr/share/amazon-ec2-net-utils}\n# shellcheck source=../lib/lib.sh\n. \"${libdir}/lib.sh\"\n\niface=\"$1\"\n[ -n \"$iface\" ] || { error \"Invocation error\"; exit 1; }\n\nmkdir -p \"**********\n\ncase \"$2\" in\nstop)\n    register_networkd_reloader\n    info \"Stopping $iface.\"\n    rm -rf \"/run/network/$iface\" \\\n       \"${unitdir}/70-${iface}.network\" \\\n       \"${unitdir}/70-${iface}.network.d\" || true\n    touch \"$reload_flag\"\n    ;;\nstart)\n    register_networkd_reloader\n    while [ ! 
-e \"/sys/class/net/${iface}\" ]; do\n        debug  \"Waiting for sysfs node to exist\"\n        sleep 0.1\n    done\n    info \"Starting configuration for $iface\"\n    debug /lib/systemd/systemd-networkd-wait-online -i \"$iface\"\n    /lib/systemd/systemd-networkd-wait-online -i \"$iface\"\n    ether=$(cat /sys/class/net/${iface}/address)\n\n    declare -i changes=0\n    changes+=$(setup_interface $iface $ether)\n    if [ $changes -gt 0 ]; then\n        touch \"$reload_flag\"\n    fi\n    ;;\ncleanup)\n    if [ -e \"${lockdir}/${iface}\" ]; then\n        info \"WARNING: Cleaning up leaked lock ${lockdir}/${iface}\"\n        rm -f \"${lockdir}/${iface}\"\n    fi\n    ;;\n*)\n    echo \"USAGE: $0: start|stop\"\n    echo \"  This tool is normally invoked via udev rules.\"\n    echo \"  See https://github.com/amazonlinux/amazon-ec2-net-utils\"\n    ;;\nesac\n\nexit 0\n"
    },
    "AppGuardContainerId": "",
    "DeviceId": "aaaaaa",
    "DeviceName": "ip-172-31-5-30.us-east-2.compute.internal",
    "FileName": "",
    "FileOriginIP": "",
    "FileOriginUrl": "",
    "FolderPath": "",
    "InitiatingProcessAccountDomain": "",
    "InitiatingProcessAccountName": "",
    "InitiatingProcessAccountObjectId": "",
    "InitiatingProcessAccountSid": "",
    "InitiatingProcessAccountUpn": "",
    "InitiatingProcessCommandLine": "",
    "InitiatingProcessFileName": "",
    "InitiatingProcessFolderPath": "",
    "InitiatingProcessId": 584578,
    "InitiatingProcessLogonId": 0,
    "InitiatingProcessMD5": "",
    "InitiatingProcessParentFileName": "",
    "InitiatingProcessParentId": 0,
    "InitiatingProcessSHA1": "",
    "InitiatingProcessSHA256": "",
    "LocalIP": "",
    "LocalPort": "",
    "LogonId": "",
    "MD5": "",
    "MachineGroup": "",
    "ProcessCommandLine": "",
    "ProcessId": "",
    "ProcessTokenElevation": "",
    "RegistryKey": "",
    "RegistryValueData": "",
    "RegistryValueName": "",
    "RemoteDeviceName": "",
    "RemoteIP": "",
    "RemotePort": "",
    "RemoteUrl": "",
    "ReportId": 820161,
    "SHA1": "",
    "SHA256": "98e4e3ec0ff5702d7cc65d798abb4e14cb419a58968a5693746bc00de9df34e3",
    "Timestamp [UTC]": "8/12/2024, 9:35:57.863 PM",
    "TimeGenerated [UTC]": "8/12/2024, 9:35:57.863 PM",
    "FileSize": "",
    "InitiatingProcessCreationTime [UTC]": "8/12/2024, 9:35:57.809 PM",
    "InitiatingProcessFileSize": "",
    "InitiatingProcessParentCreationTime [UTC]": "",
    "InitiatingProcessVersionInfoCompanyName": "",
    "InitiatingProcessVersionInfoFileDescription": "",
    "InitiatingProcessVersionInfoInternalFileName": "",
    "InitiatingProcessVersionInfoOriginalFileName": "",
    "InitiatingProcessVersionInfoProductName": "",
    "InitiatingProcessVersionInfoProductVersion": "",
    "ProcessCreationTime [UTC]": "",
    "SourceSystem": "",
    "Type": "DeviceEvents"
  }
]
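
As a rough sketch of how rows like the ones above might be consumed, the Python below filters on `ActionType == "ScriptContent"` and pulls out the fields that are actually populated in the sample. The function name `extract_script_events` and the output keys are made up for illustration, and the handling of `AdditionalFields` arriving as a JSON string is an assumption about some export paths, not something shown in the sample.

```python
import json
from datetime import datetime, timezone


def extract_script_events(rows):
    """Yield a compact summary for each DeviceEvents row that carries script content.

    Hypothetical helper: the output keys are illustrative, not an OCSF mapping.
    """
    for row in rows:
        if row.get("ActionType") != "ScriptContent":
            continue
        extra = row.get("AdditionalFields") or {}
        # Assumption: some exports may deliver AdditionalFields as a JSON string.
        if isinstance(extra, str):
            extra = json.loads(extra)
        content = extra.get("ScriptContent", "")
        # The sample timestamps look like "8/13/2024, 9:08:12.257 PM" and are UTC.
        observed = datetime.strptime(
            row["Timestamp [UTC]"], "%m/%d/%Y, %I:%M:%S.%f %p"
        ).replace(tzinfo=timezone.utc)
        yield {
            "device_name": row.get("DeviceName"),
            "pid": row.get("InitiatingProcessId"),
            "sha256": row.get("SHA256"),
            "observed": observed,
            # The first line is often a shebang or comment hinting at the interpreter.
            "first_line": content.splitlines()[0] if content else "",
            "content": content,
        }
```

Note that in both sample rows almost every process field is empty apart from `InitiatingProcessId`, so the process ID and timestamp are about all there is to correlate the script with its host process.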

davemcatcisco commented 3 months ago

@jonrau-at-queryai - Yes, I will create a script object to hold all the script-related stuff. This will enable scripts to be referenced in other parts of the schema too, e.g. in the evidences array of a Detection Finding event. The Script Activity that I propose above will essentially just extend System Activity by adding a script attribute.
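
To make the shape of that idea concrete, here is a minimal Python sketch. It is not the schema definition from the forthcoming PR: the class names `Script`, `ScriptActivity` and `Evidence`, and every attribute name in them, are hypothetical placeholders chosen only to show one reusable script object referenced from both an activity event and a finding's evidence.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Script:
    """Hypothetical script object; attribute names are illustrative only."""
    content: str
    type: str = "Unknown"            # e.g. "PowerShell", "Python", "Shell"
    sha256: Optional[str] = None
    file_path: Optional[str] = None  # None when the script never existed as a file


@dataclass
class ScriptActivity:
    """Hypothetical event: a System Activity style event plus a script attribute."""
    time: int       # event time in milliseconds since the epoch
    pid: int        # process that executed the script
    script: Script


@dataclass
class Evidence:
    """Hypothetical evidence entry that can reference the same script object."""
    script: Optional[Script] = None


# One long-lived process can emit many script events during its lifetime.
host_pid = 4242
events = [
    ScriptActivity(
        time=1723583292257 + i,
        pid=host_pid,
        script=Script(content=f"Write-Output {i}", type="PowerShell"),
    )
    for i in range(3)
]

# The same script object can be reused elsewhere, e.g. as detection evidence.
evidence = Evidence(script=events[0].script)
```

Keeping the script in its own object is what allows the reuse: `asdict(events[0])["script"]` and `asdict(evidence)["script"]` serialize to the same structure.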

mikeradka commented 3 months ago

For what it is worth, there is some script activity that we've successfully been able to translate to 'process activity', since a script runs as a process: namely, PowerShell 4104 and 4103 events. We've achieved this by mapping these as process start events with the script information in the process.cmd_line field. If need be, I can share some examples of how this is done.
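
A minimal sketch of what such a mapping might look like, in Python. This is a guess at the general approach rather than the actual mapping referred to above. The input layout (`TimeCreated`, `ProcessID`, `EventData`) is an assumed shape for an already parsed 4104 record, the function name is hypothetical, and the numeric IDs reflect my understanding of Process Activity in the current schema and should be checked against it.

```python
PROCESS_ACTIVITY_CLASS_UID = 1007  # Process Activity (verify against the schema)
ACTIVITY_LAUNCH = 1                # Launch (verify against the schema)


def ps4104_to_process_activity(event):
    """Map a parsed PowerShell 4104 record to a Process Activity style dict.

    Assumed input: {"TimeCreated": ..., "ProcessID": ..., "EventData": {...}}.
    """
    data = event["EventData"]
    return {
        "category_uid": 1,
        "class_uid": PROCESS_ACTIVITY_CLASS_UID,
        "activity_id": ACTIVITY_LAUNCH,
        # type_uid is derived as class_uid * 100 + activity_id.
        "type_uid": PROCESS_ACTIVITY_CLASS_UID * 100 + ACTIVITY_LAUNCH,
        "time": event["TimeCreated"],
        "process": {
            "pid": event["ProcessID"],
            # The script block text is carried in the command line field.
            "cmd_line": data["ScriptBlockText"],
        },
        # Script-specific details have no natural home on a process object.
        "unmapped": {
            "script_block_id": data.get("ScriptBlockId"),
            "script_path": data.get("Path") or None,
        },
    }
```

The `unmapped` bucket shows the limitation of this approach: the script block ID and script path have to be parked outside the process object, and every script block looks like a separate process start even when one PowerShell process ran them all.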