fox-it / dissect.target

The Dissect module tying all other Dissect modules together. It provides a programming API and command line tools which allow easy access to various data sources inside disk images or file collections (a.k.a. targets).
GNU Affero General Public License v3.0
41 stars 44 forks source link

Auth Log format update on rfc5424 #286

Open damasch opened 1 year ago

damasch commented 1 year ago
  1. Preamble

    The basic format for syslog messages in the "/var/log/auth.log" file follows the standard syslog format.

    <Timestamp> <Hostname> <Service>: <Message>

    The following pattern applies specifically to the auth.log file.

    <Timestamp> <Hostname> <Service>[<process_id>]: <Message>

    See: https://datatracker.ietf.org/doc/html/rfc5424#section-6

    1. Test data

      Mar 29 10:43:01 ip-10-77-20-248 sshd[1193]: Accepted password for test_user from REDACTED IP port 52942 ssh2
      Mar 27 13:06:56 ip-10-77-20-248 sshd[1291]: Server listening on REDACTED IP port 22.
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: Watching system buttons on /dev/input/event0 (Power Button)
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: Watching system buttons on /dev/input/event1 (Sleep Button)
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: New seat seat0.
      Mar 27 13:08:09 ip-10-77-20-248 sshd[1361]: Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
      Mar 27 13:08:09 ip-10-77-20-248 sshd[1361]: pam_unix(sshd:session): session opened for user test_user by (uid=0)
      Mar 27 13:08:09 ip-10-77-20-248 systemd: pam_unix(systemd-user:session): session opened for user test_user by (uid=0)
      Mar 27 13:10:08 ip-10-77-20-248 sudo:   ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -
      Mar 27 13:17:01 ip-10-77-20-248 CRON[2623]: pam_unix(cron:session): session closed for user root
      Mar 27 13:23:11 ip-10-77-20-248 sudo: pam_unix(sudo:session): session opened for user root by test_user(uid=0)
  2. Problem with the current config

    The config of the regex pattern is declared in:

    https://github.com/fox-it/dissect.target/blob/main/dissect/target/plugins/os/unix/log/auth.py#L18

    But if acquire or target runs. The jsonl export deliver a pattern/object with:

  3. Possible Solution

    Try this pattern to the auth.log file and you will get the service and the process seperated from the message.

    ( w{3} s+ d+ s+ d{2}: d{2}: d{2}) s+( S+) s+([0-9A-Za-z -])? [?( w)? ]?: s+(.+)

    devided by groups

    To access the data use:

    _AUTH_LOG_REGEX = r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([0-9A-Za-z\-]**)?\[?(\w**)?\]?:\s+(.+)"
    
    RE_ALRE = re.compile(_AUTH_LOG_REGEX)
    ...
       auth_log_object = re.search(RE_ALRE, line)
       if auth_log_object:
         print(auth_log_object[1])
         print(auth_log_object[2])
         print(auth_log_object[3])
         print(auth_log_object[4])
  4. Upgrade for the code

    Try this example and use named subgroups for the regex.

    
    import re
    
    auth_log_content = """
    Mar 29 10:43:01 my_unix_host sshd[1193]: Accepted password for test_user from REDACTED IP port 52942 ssh2
    Mar 27 13:06:56 my_unix_host sshd[1291]: Server listening on REDACTED IP port 22.
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: Watching system buttons on /dev/input/event0 (Power Button)
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: Watching system buttons on /dev/input/event1 (Sleep Button)
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: New seat seat0.
    Mar 27 13:08:09 my_unix_host sshd[1361]: Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
    Mar 27 13:08:09 my_unix_host sshd[1361]: pam_unix(sshd:session): session opened for user test_user by (uid=0)
    Mar 27 13:08:09 my_unix_host systemd: pam_unix(systemd-user:session): session opened for user test_user by (uid=0)
    Mar 27 13:10:08 my_unix_host sudo:   ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -
    Mar 27 13:17:01 my_unix_host CRON[2623]: pam_unix(cron:session): session closed for user root
    Mar 27 13:23:11 my_unix_host sudo: pam_unix(sudo:session): session opened for user root by test_user(uid=0)
    """
    
    _TS_REGEX = r"^[A-Za-z]{3}\s\*[0-9]{1,2}\s[0-9]{1,2}:[0-9]{2}:[0-9]{2}"
    RE_TS = re.compile(_TS_REGEX)
    RE_TS_AND_HOSTNAME = re.compile(_TS_REGEX + r"\s\S+\s")
    
    for line in auth_log_content.split("\n"):
     ts_and_hostname = re.search(RE_TS_AND_HOSTNAME, line)
     if ts_and_hostname:
       message = line.replace(ts_and_hostname.group(0), "").strip()
       print(ts_and_hostname.group(0), message)
    
    _AUTH_LOG_REGEX = r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([0-9A-Za-z\-]**)?\[?(\w**)?\]?:\s+(.+)"
    
    RE_ALRE = re.compile(_AUTH_LOG_REGEX)
    
    for line in auth_log_content.split("\n"):
    
     auth_log_object = re.search(RE_ALRE, line)
    
     if auth_log_object:
       print()
       print(line)
       print(auth_log_object[1])
       print(auth_log_object[2])
       print(auth_log_object[3])
       print(auth_log_object[4])
       print(auth_log_object[5])
    
    _AUTH_LOG_NAMED_REGEX = r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<hostname>\S+)\s+(?P<service>[0-9A-Za-z\-]**)?\[?(?P<processid>\w**)?\]?:\s+(?P<message>.+)"
    
    RE_ALNRE = re.compile(_AUTH_LOG_NAMED_REGEX)
    
    for line in auth_log_content.split("\n"):
    
     auth_log_object = re.match(RE_ALNRE, line)
    
     if auth_log_object:
       details = auth_log_object.groupdict()
       print(details)

    Output

    {'timestamp': 'Mar 29 10:43:01', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1193', 'message': 'Accepted password for test_user from REDACTED IP port 52942 ssh2'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1291', 'message': 'Server listening on REDACTED IP port 22.'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'Watching system buttons on /dev/input/event0 (Power Button)'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'Watching system buttons on /dev/input/event1 (Sleep Button)'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'New seat seat0.'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1361', 'message': 'Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1361', 'message': 'pam_unix(sshd:session): session opened for user test_user by (uid=0)'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'systemd', 'processid': '', 'message': 'pam_unix(systemd-user:session): session opened for user test_user by (uid=0)'}
    {'timestamp': 'Mar 27 13:10:08', 'hostname': 'my_unix_host', 'service': 'sudo', 'processid': '', 'message': 'ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -'}
    {'timestamp': 'Mar 27 13:17:01', 'hostname': 'my_unix_host', 'service': 'CRON', 'processid': '2623', 'message': 'pam_unix(cron:session): session closed for user root'}
    {'timestamp': 'Mar 27 13:23:11', 'hostname': 'my_unix_host', 'service': 'sudo', 'processid': '', 'message': 'pam_unix(sudo:session): session opened for user root by test_user(uid=0)'}

    So I prefer the named subgroups. But this is maybe not necessary for the mapping in the yield part.

Schamper commented 1 year ago

Thank you for the extensive description and suggestion! Sounds sensible to me.

@JSCU-CNI since you initially wrote the auth.log parser, perhaps you can comment on if you agree?

JSCU-CNI commented 1 year ago

This seems like a good improvement of the auth.log parser indeed :+1:

Schamper commented 1 year ago

Great! @damasch would you like to create a PR for this yourself?

damasch commented 1 year ago

@Schamper Yes but how? I'm not a contributer to create a branch? Or should I fork this project?

I write a patch for this plugin. I'ts a suggestion:

import re
from itertools import chain
from typing import Iterator

from dissect.target.helpers.record import TargetRecordDescriptor
from dissect.target.helpers.utils import year_rollover_helper
from dissect.target.plugin import Plugin, export

AuthLogRecord = TargetRecordDescriptor(
    "linux/log/auth",
    [
        ("datetime", "ts"),
        ("string", "service"),
        ("string", "processid"),
        ("string", "message"),
        ("string", "user"),
        ("string", "remoteip"),
        ("string", "port"),
        ("string", "authservice"),
        ("string", "protocol"),
        ("string", "encryption"),
        ("string", "method"),
        ("string", "key"),
        ("string", "misc"),
        ("string", "sessionmode"),
        ("string", "userid"),
        ("string", "useridassociate"),
        ("string", "tty"),
        ("string", "pwd"),
        ("string", "usereffective"),
        ("string", "command"),
        ("string", "cron"),
        ("string", "conectionmode"),
        ("path", "source"),
    ],
)

# Timestamp Regex
_TS_REGEX = r"^[A-Za-z]{3}\s*[0-9]{1,2}\s[0-9]{1,2}:[0-9]{2}:[0-9]{2}"
RE_TS = re.compile(_TS_REGEX)
RE_TS_AND_HOSTNAME = re.compile(_TS_REGEX + r"\s\S+\s")

# New regex pattern for getting the log entries
# Mar 29 10:43:01 my_unix_host ...
RE_ENTRY = re.compile(r'^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+(?P<service>[0-9A-Za-z\-\-\(\)=]*)?\[?(?P<processid>\w*)?\]?:\s+(?P<message>.+)')

# New regex pattern for interpreting the SSH message "Accepted password"
# Mar 29 10:43:01 my_unix_host sshd[1193]: Accepted password for test_user from 127.0.0.1 port 52942 ssh2
RE_SSH_ACCEPTED_PASSWORD = re.compile(r'^Accepted\spassword\sfor\s(?P<user>[\S\_]+)\sfrom\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})\s(?P<authservice>\w+)$')

# New regex pattern for interpreting the SSH message "Accepted publickey"
# Accepted publickey for test_user from 123.123.123.123 port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
RE_SSH_ACCEPTED_PUBLICKEY = re.compile(r'^Accepted\spublickey\sfor\s(?P<user>\S+)\sfrom\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})\s(?P<protocol>\w+):\s(?P<encryption>[\w\-]+)\s(?P<method>[\w]+):(?P<key>[\w:]+)(?P<misc>.*)$')

# New regex pattern for interpreting the PAM UNIX message "ssh"
# Jul  5 13:20:15 test-VirtualBox sudo: pam_unix(sudo:session): session opened for user root(uid=0) by test(uid=0)
RE_SSH_PAM_UNIX = re.compile(r'pam_unix\((?P<authservice>sshd):.*\): +session +(?P<sessionmode>closed|opened)\sfor\suser\s(?P<user>\w+)(?:\(uid=(?P<useridassociate>\w+)\))?(?:\sby\s)?(?:\(uid=(?P<userid>\w+)\))?$')

# New regex pattern for interpreting the connection mode on close message "ssh"
# Mar 29 17:07:19 my_unix_host sshd[4649]: Connection closed by 85.245.107.41 port 54790 [preauth]
RE_SSH_CONNECTION = re.compile(r'^Connection\s(?P<conectionmode>closed)\sby\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})$(?P<misc>.*)$')

# New regex pattern for interpreting the command from message "sudo"
# Jul  4 17:10:18 my_unix_host sudo:     test : TTY=pts/0 ; PWD=/home/test ; USER=root ; COMMAND=/usr/bin/apt update
RE_SUDO_COMMAND = re.compile(r'^(?P<user>\w+)\s:\sTTY=(?P<tty>\w+\/\w+)\s;\sPWD=(?P<pwd>[\/\w]+)\s;\sUSER=(?P<usereffective>\w+)\s;\sCOMMAND=(?P<command>.+)$')

# New regex pattern for interpreting the session message "cron"
# Mar 29 17:07:25 my_unix_host sshd[4651]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
RE_CRON_PAM_UNIX = re.compile(r'^pam_unix\(cron:(?P<cron>.*)\): +session +(?P<sessionmode>closed|opened)\sfor\suser\s(?P<user>\w+)(?:\(uid=(?P<useridassociate>\w+)\))?(?:\sby\s)?(?:\(uid=(?P<userid>\w+)\))?$')

class AuthPlugin(Plugin):
    def check_compatible(self) -> bool:
        var_log = self.target.fs.path("/var/log")
        return any(var_log.glob("auth.log*")) or any(var_log.glob("secure*"))

    def apply_regex_on_message(self, pattern, messsage):
        """Return a data object with the data from the group dict regex"""
        data = {}

        message_data = re.match(pattern, messsage)
        if message_data:
            data.update(message_data.groupdict())

        return data

    @export(record=[AuthLogRecord])
    def securelog(self) -> Iterator[AuthLogRecord]:
        """Return contents of /var/log/auth.log* and /var/log/secure*."""
        return self.authlog()

    @export(record=[AuthLogRecord])
    def authlog(self) -> Iterator[AuthLogRecord]:
        """Return contents of /var/log/auth.log* and /var/log/secure*."""

        # Assuming no custom date_format template is set in syslog-ng or systemd (M d H:M:S)
        # CentOS format: Jan 12 13:37:00 hostname daemon: message
        # Debian format: Jan 12 13:37:00 hostname daemon[pid]: pam_unix(daemon:session): message

        tzinfo = self.target.datetime.tzinfo

        var_log = self.target.fs.path("/var/log")
        for auth_file in chain(var_log.glob("auth.log*"), var_log.glob("secure*")):
            for ts, line in year_rollover_helper(auth_file, RE_TS, "%b %d %H:%M:%S", tzinfo):
                data = {
                    'ts': ts,
                }

                log_entry = re.match(RE_ENTRY, line)
                if not log_entry:
                    self.target.log.warning("Log entry does not match with pattern %s.", auth_file)
                    self.target.log.warning("Skipping this line: %s", line)
                    continue

                data.update(log_entry.groupdict())

                data.update(self.apply_regex_on_message(RE_SSH_ACCEPTED_PASSWORD, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_ACCEPTED_PUBLICKEY, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_PAM_UNIX, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_CONNECTION, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SUDO_COMMAND, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_CRON_PAM_UNIX, log_entry["message"]))

                data.update({
                    'source': auth_file,
                    '_target': self.target
                })
                yield AuthLogRecord(
                    **data
                    )
Schamper commented 1 year ago

You can fork the project, commit/push to your own fork and create a PR to this project. Doing it that way also makes sure you get properly credited for your contribution.

If you would like us to make these changes instead, just say the word :smile:

JSCU-CNI commented 1 week ago

@Schamper is your team (planning on) working on this?

EinatFox commented 1 week ago

Hi @JSCU-CNI yes we do. As of now, it is planned to go in Dissect release 3.17.

JSCU-CNI commented 1 week ago

Thanks for the update :)