SigmaHQ / sigma

Main Sigma Rule Repository

[es-dsl] Case-insensitive search query #844

Closed tatsuiman closed 2 years ago

tatsuiman commented 4 years ago

How can I create a search rule that ignores the case when searching for a string such as the following?

Does Elasticsearch need any special settings?

detection:
    keywords:
        Message:
            - "*Invoke-DllInjection*"
            - "*Invoke-Shellcode*"
            - "*Invoke-WmiCommand*"
neu5ron commented 4 years ago

Can you see if this helps? https://github.com/Neo23x0/sigma/blob/master/tools/README.md#choosing-the-right-sigmac

What version of Elastic are you using? Are you using Beats (Winlogbeat in this scenario), and which modules are enabled or index templates loaded? The doc above may be better to follow. If it isn't, or if you have any feedback, please let me know, as we would love to update it with your feedback in mind.

tatsuiman commented 4 years ago

Elasticsearch is v7.4.2. The mapping uses our own definition, and sigmac converts the rule to the following query:

{
    "query": {
        "constant_score": {
            "filter": {
                "bool": {
                    "should": [
                        {
                            "wildcard": {
                                "evtx.eventdata.data.message.keyword": "*Invoke-DllInjection*"
                            }
                        },
                        {
                            "wildcard": {
                                "evtx.eventdata.data.message.keyword": "*Invoke-Shellcode*"
                            }
                        },
                        {
                            "wildcard": {
                                "evtx.eventdata.data.message.keyword": "*Invoke-WmiCommand*"
                            }
                        }
                    ]
                }
            }
        }
    }
}

The following document is stored in Elasticsearch:

{
  "_index" : "target-index",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "evtx" : {
      "eventdata" : {
        "data" : {
          "message" : "invoke-dllInjection -ProcessID 4274 -Dll evil.dll"
        }
      }
    }
  }
}
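
To make the mismatch concrete, here is a rough sketch in Python. It only emulates case-sensitive wildcard matching against the raw stored string (which is how a keyword field behaves) using the standard-library fnmatchcase; it is not Elasticsearch itself, and the helper name keyword_wildcard_match is made up for illustration.

```python
from fnmatch import fnmatchcase

# Stored value and one of the generated wildcard patterns from above.
stored = "invoke-dllInjection -ProcessID 4274 -Dll evil.dll"
pattern = "*Invoke-DllInjection*"


def keyword_wildcard_match(value: str, pattern: str) -> bool:
    """Hypothetical stand-in for a wildcard query on a keyword field.

    A keyword field keeps the whole string as one term, unchanged, so the
    comparison is case-sensitive. fnmatchcase also understands [..] sets,
    which Elasticsearch wildcards do not, but the pattern here has none.
    """
    return fnmatchcase(value, pattern)


# Case-sensitive comparison: "invoke-dll..." does not match "Invoke-Dll...".
case_sensitive_hit = keyword_wildcard_match(stored, pattern)

# Folding both sides to lower case is what a case-insensitive search needs.
case_folded_hit = keyword_wildcard_match(stored.lower(), pattern.lower())

print(case_sensitive_hit, case_folded_hit)  # prints: False True
```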
neu5ron commented 4 years ago

OK, last question I think I need answered: what is the mapping for this field? Is evtx.eventdata.data.message a keyword? Text? One of those, but with a subfield?

You can run the command below in Kibana Dev Tools or from the CLI. Just replace target-index with one of your indices that contains the field in question, then send the mapping for that field: curl -XGET "http://127.0.0.1:9200/target-index/_mapping"

For example, if I run this against a Winlogbeat index and look for winlog.event_data.TargetImage (a random field chosen as an example), I would do: curl -XGET http://127.0.0.1:9200/winlogbeat-7.2.0-2019.06.25-000001/_mapping and then look for that field. My results are:

"TargetImage" : {
  "type" : "keyword"
},
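
If scrolling through the full mapping output is tedious, something like the following Python sketch could pull out a single field. It assumes the Elasticsearch 7.x response layout (index name, then "mappings", then nested "properties"); field_mapping is a hypothetical helper, and the sample dict is hand-written to mirror the TargetImage example rather than real output.

```python
from typing import Optional


def field_mapping(mapping_response: dict, index: str, dotted_field: str) -> Optional[dict]:
    """Walk the nested "properties" objects down to one dotted field name.

    Returns the mapping dict for the field, or None if any path segment
    is missing. Assumes the 7.x layout: {index: {"mappings": {...}}}.
    """
    node = mapping_response[index]["mappings"]
    for part in dotted_field.split("."):
        props = node.get("properties", {})
        if part not in props:
            return None
        node = props[part]
    return node


# Hand-written sample shaped like a _mapping response (assumption, not real output).
index_name = "winlogbeat-7.2.0-2019.06.25-000001"
sample = {
    index_name: {
        "mappings": {
            "properties": {
                "winlog": {
                    "properties": {
                        "event_data": {
                            "properties": {"TargetImage": {"type": "keyword"}}
                        }
                    }
                }
            }
        }
    }
}

print(field_mapping(sample, index_name, "winlog.event_data.TargetImage"))
# prints: {'type': 'keyword'}
```

In practice the dict would come from parsing the JSON body returned by the curl command above.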
tatsuiman commented 4 years ago

The mapping for evtx.eventdata.data.message looks like this

"message" : {
  "type" : "text",
  "fields" : {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
0xballistics commented 4 years ago

I am struggling with the same problem myself.

I can confirm that with Winlogbeat 7.8.1 (default configuration) and Elasticsearch 7.8, the message field has the same mapping.

To deal with case sensitivity, the rule can be compiled with --backend-option case_insensitive_whitelist="*", but that does not work in this case. Your query targets the "keyword" subfield, and because of "ignore_above": 256, any message longer than 256 characters is not indexed in that subfield at all.
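
As a minimal illustration of the 256-character limit, the sketch below models only the length check that "ignore_above" implies. The function name is invented, and the long message is synthetic padding, not a real log line.

```python
def indexed_in_keyword_subfield(value: str, ignore_above: int = 256) -> bool:
    """Hypothetical model of ignore_above: longer strings are skipped.

    Values longer than ignore_above are left out of the keyword subfield,
    so a query against message.keyword can never return them.
    """
    return len(value) <= ignore_above


short_message = "invoke-dllInjection -ProcessID 4274 -Dll evil.dll"
# Synthetic long message: same start, padded well past 256 characters.
long_message = short_message + " " + "A" * 300

print(indexed_in_keyword_subfield(short_message))  # prints: True
print(indexed_in_keyword_subfield(long_message))   # prints: False
```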

You can add --backend-option keyword_field="" to stop the generated query from using the "keyword" subfield, but this won't work either: your query contains hyphens, and the default analyzer for "text" fields splits on them, so no indexed term contains a hyphen.

AFAIK, when you install Winlogbeat, it loads the index template if it does not already exist, and by default it takes the field mappings from the "fields.yml" file in the installation directory.
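
Here is a rough Python approximation of why the hyphenated pattern finds nothing on a "text" field. The regex tokenizer below is a simplification I am assuming for illustration (the real standard analyzer uses Unicode text segmentation and treats some punctuation differently), and the message is shortened to the part that matters.

```python
import re
from fnmatch import fnmatchcase


def approx_standard_tokens(text: str) -> list:
    """Crude stand-in for the default analyzer: split on anything that is
    not an ASCII letter or digit, then lowercase each piece."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


# Shortened form of the stored message from earlier in the thread.
message = "invoke-dllInjection -ProcessID 4274"
tokens = approx_standard_tokens(message)
print(tokens)  # prints: ['invoke', 'dllinjection', 'processid', '4274']

# A wildcard on a text field is compared against individual terms, and no
# single term contains a hyphen, so even a lowercased pattern cannot match.
hyphen_hit = any(fnmatchcase(t, "*invoke-dllinjection*") for t in tokens)

# A pattern that fits inside one term does match.
single_word_hit = any(fnmatchcase(t, "*dllinjection*") for t in tokens)

print(hyphen_hit, single_word_hit)  # prints: False True
```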

Message field mapping in "fields.yml":

- name: message
  level: core
  type: text
  description: 'For log events the message field contains the log message, optimized

So maybe changing fields.yml to map the message field as type "keyword" would solve the problem, but I am not sure whether that is the best solution or an ugly workaround. Perhaps there is a way to change the default "ignore_above" value of the field mapping, because I know it is not enforced by "fields.yml".

If I am correct, many of the rules are currently not working with the default Winlogbeat installation because of this indexing problem.

0xballistics commented 4 years ago

Hi,

Can anybody else validate the issue? Perhaps there is a simple solution I am unable to see here.

neu5ron commented 4 years ago

@0xballistics & @tatsu-i

There are a few things going on here.

These issues have been documented and submitted repeatedly by myself and a few others over the past 3+ years. They have even been discussed in ECS: https://github.com/elastic/ecs/issues/105 I also went into depth with a co-worker on the shortcomings of Elastic strings: https://socprime.com/blog/elastic-for-security-analysts-part-1-searching-strings/

there are a couple things:

frack113 commented 2 years ago

Sorry, this issue was closed automatically because it is no longer active.