Support for regex named capture groups

joshnck commented 5 months ago

There are times when you need to compare two substrings that you pull apart with a regex, and in Splunk you can easily do this with named capture groups. Consider the following query:

index IN (your, ad, logs) EventCode=39
| rex field=Subject "@@@CN=(?<subj_user>[^\.]+)"
| rex field=AccountName "(?<name>[^$]+)"
| eval subj_user=upper(subj_user)
| where subj_user!=name
| table _time AccountName Subject subj_user name

If this were doable in Sigma, the rule should look like:

title: Certificate Request on Behalf of Another User
id: ec8633a2-d0cf-49c7-92a5-410c0528a6dc
status: test
description: |
    Detects ESC1 escalation path in ADCS environment where an attacker has a compromised account
    and requests a certificate on behalf of another account, likely a domain admin, via a vulnerable
    certificate template
author: Josh Nickels, Tomasz Dyduch, Marius Rothenbuecher, Balazs Lendvay
date: 2024/04/29
logsource:
    product: windows
    service: system
detection:
    selection_provider:
        Provider_Name: 'Microsoft-Windows-Kerberos-Key-Distribution-Center'
    selection_event:
        EventID: 39
    selection_subject:
        Subject|re: '@@@CN=(?<subj_user>[^\.]+)'
    selection_name:
        AccountName|re: '(?<name>[^$]+)'
    selection_match:
        subj_user|fieldref: name
    condition: selection_provider and selection_event and selection_match
falsepositives:
    - Unknown
level: low

Unfortunately, pySigma does not handle named capture groups as a way to create a new field, the way Splunk does.
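
To make the intended behaviour concrete, here is a minimal Python sketch of what the Splunk query above does: extract two named groups into new fields, then compare them. It is only an illustration, not pySigma code. The function name `extract_and_compare` and the sample event values are hypothetical, and Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`.

```python
import re
from typing import Optional

# Python's re module uses (?P<name>...) for named groups,
# whereas Splunk's rex accepts (?<name>...).
SUBJECT_RE = re.compile(r"@@@CN=(?P<subj_user>[^.]+)")
ACCOUNT_RE = re.compile(r"(?P<name>[^$]+)")


def extract_and_compare(event: dict) -> Optional[dict]:
    """Hypothetical helper mirroring the SPL pipeline.

    Returns the event enriched with the extracted fields when the two
    names differ, and None when they match or an extraction fails.
    """
    subj = SUBJECT_RE.search(event.get("Subject", ""))
    acct = ACCOUNT_RE.search(event.get("AccountName", ""))
    if subj is None or acct is None:
        return None
    subj_user = subj.group("subj_user").upper()  # eval subj_user=upper(subj_user)
    name = acct.group("name")
    if subj_user != name:  # where subj_user!=name
        return {**event, "subj_user": subj_user, "name": name}
    return None


# Made-up sample events, for illustration only.
mismatch = extract_and_compare({"Subject": "@@@CN=alice.corp", "AccountName": "BOB$"})
match = extract_and_compare({"Subject": "@@@CN=bob", "AccountName": "BOB$"})
```

The key point is that the comparison needs the captured values as fields of their own, which a plain regex match does not provide.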

joshnck commented 5 months ago

The logic in this Sigma rule is flawed, but I hope my point comes across. If anything is unclear, let me know and I can refactor the rule so it makes more sense.

thomaspatzke commented 4 months ago

The logic and use case are clear, but there are some reasons I'm a bit hesitant about this. It could be implemented by additional field extractions configured in the SIEM. Defining fields in Sigma has been discussed in the past, and the outcome was to keep it out of Sigma: it adds a lot of complexity, and while the two topics are coupled, Sigma's main focus is detection rather than field extraction.

The syntax you've proposed has the issue that regex matching and field extraction are different things. In Splunk, for example, these are different commands, regex and rex. Until now we have mostly avoided having to "understand" the regular expression during conversion; it is basically passed through to the query with some minor escaping. Adding this to Sigma would require determining whether the regular expression contains field extractions and adapting the query accordingly. It is also challenging for portability reasons.
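
As a rough illustration of the "does this regex contain field extractions" check, the following Python sketch scans a pattern string for named group openers. It is a heuristic and not part of pySigma. `named_groups` is a hypothetical helper, and it does not handle every edge case (for example a `(?<x>` sequence inside a character class, or a group preceded by an escaped backslash).

```python
import re
from typing import List

# Heuristic: find the opening of a named group written as (?<name>...) or
# (?P<name>...). Lookbehinds such as (?<=...) and (?<!...) are not matched,
# because a group name must start with a letter or underscore. A "(" that is
# directly preceded by a backslash is treated as an escaped literal.
NAMED_GROUP_OPEN = re.compile(r"(?<!\\)\(\?P?<([A-Za-z_][A-Za-z0-9_]*)>")


def named_groups(pattern: str) -> List[str]:
    """Return the names of named capture groups found in a regex string."""
    return NAMED_GROUP_OPEN.findall(pattern)


def is_field_extraction(pattern: str) -> bool:
    """True if the regex would create fields, i.e. needs rex rather than regex."""
    return bool(named_groups(pattern))
```

A converter could use a check like this to decide between a pure match and an extraction, although a robust implementation would need a real regex parser for each target dialect.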

I'm unsure if this is the right way to add this logic to the detection rules or if a separate rule type could be more appropriate. This would separate detection logic from log semantics and enable the conversion process to generate extraction configs or at least give the user a choice for where the extraction should be done.
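
One way to picture that separation is a small extraction description that lives apart from the detection and can be rendered per backend. The sketch below is purely hypothetical: `FieldExtraction` and `to_splunk_rex` are illustrative names, not pySigma classes or functions, and the quoting is simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldExtraction:
    """Hypothetical extraction spec kept separate from detection logic."""

    source_field: str  # field the regex is applied to, e.g. "Subject"
    pattern: str       # regex whose named groups become new fields


def to_splunk_rex(extraction: FieldExtraction) -> str:
    """Render the spec as a Splunk rex command (simplified quoting)."""
    escaped = extraction.pattern.replace('"', '\\"')
    return f'| rex field={extraction.source_field} "{escaped}"'


extractions = [
    FieldExtraction("Subject", r"@@@CN=(?<subj_user>[^\.]+)"),
    FieldExtraction("AccountName", r"(?<name>[^$]+)"),
]
rex_lines = [to_splunk_rex(e) for e in extractions]
```

With a split like this, the detection itself could stay a plain field comparison, and the user could choose whether the extraction is emitted inline in the query or configured once in the SIEM.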

SigmaHQ / pySigma

Support for regex named capture groups #213