CERT-Polska / mquery

YARA malware query accelerator (web frontend)
GNU Affero General Public License v3.0

Advise the user that the Yara query they're about to execute is inefficient #308

Closed. ITAYC0HEN closed this issue 1 year ago.

ITAYC0HEN commented 1 year ago

Feature Category

Describe the problem

Currently, people who don't understand how mquery and mwdb work behind the scenes can execute very bad YARA queries, which are (a) expensive, (b) inefficient, and (c) take forever to run.

Describe the solution you'd like

Advise the user that the YARA query they're about to execute is inefficient. We can warn them and ask "Are you sure you want to proceed?", or we can decide to reject inefficient YARA rules altogether.

VirusTotal (VT) takes a similar approach in its Retrohunt.

msm-code commented 1 year ago

100% agree and I was thinking of something like this (slightly related issue: #297)

The question, of course, is: how do we decide whether a query is inefficient? For example, is this query:

rule msm {
    strings: $msm = "msm"
    condition: all of them
}

inefficient? It will return many results, but all of them will actually match the rule.

The common pitfall with mquery is, I think, running a query that (accidentally or not) returns all files in the dataset. For example (and this one is a bit tricky):

import "pe"

rule APT17_Malware_Oct17_Gen {
   meta:
      description = "Detects APT17 malware"
      license = "Detection Rule License 1.1 https://github.com/Neo23x0/signature-base/blob/master/LICENSE"
      author = "Florian Roth"
      reference = "https://goo.gl/puVc9q"
      date = "2017-10-03"
      hash1 = "0375b4216334c85a4b29441a3d37e61d7797c2e1cb94b14cf6292449fb25c7b2"
      hash2 = "07f93e49c7015b68e2542fc591ad2b4a1bc01349f79d48db67c53938ad4b525d"
      hash3 = "ee362a8161bd442073775363bf5fa1305abac2ce39b903d63df0d7121ba60550"
   strings:
      $x1 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NETCLR 2.0.50727)" fullword ascii
      $x2 = "http://%s/imgres?q=A380&hl=en-US&sa=X&biw=1440&bih=809&tbm=isus&tbnid=aLW4-J8Q1lmYBM" ascii

      $s1 = "hWritePipe2 Error:%d" fullword ascii
      $s2 = "Not Support This Function!" fullword ascii
      $s3 = "Cookie: SESSIONID=%s" fullword ascii
      $s4 = "http://0.0.0.0/1" fullword ascii
      $s5 = "Content-Type: image/x-png" fullword ascii
      $s6 = "Accept-Language: en-US" fullword ascii
      $s7 = "IISCMD Error:%d" fullword ascii
      $s8 = "[IISEND=0x%08X][Recv:] 0x%08X %s" fullword ascii
   condition:
      ( uint16(0) == 0x5a4d and filesize < 200KB and (
            pe.imphash() == "414bbd566b700ea021cfae3ad8f4d9b9" or
            1 of ($x*) or
            6 of them
         )
      )
}

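This is a good rule overall, but the unsupported condition pe.imphash() == "414bbd566b700ea021cfae3ad8f4d9b9" sits in an "or" branch next to the string conditions. Since mquery can't tell which files satisfy it, the whole branch could match anything, and mquery is forced to run YARA on every file.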
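To make the reasoning above concrete, here is a minimal sketch of how an unsupported condition can swallow a whole query. It is not mquery's actual parser: the Everything, Term, And and Or classes and the simplify function are hypothetical names made up for this illustration, and "6 of them" is modelled as a single indexable term for simplicity.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Everything:
    """A condition the index cannot evaluate: any file might satisfy it."""


@dataclass(frozen=True)
class Term:
    """A condition the index can narrow down (e.g. a string match)."""
    name: str


@dataclass(frozen=True)
class And:
    args: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    args: Tuple["Node", ...]


Node = Union[Everything, Term, And, Or]


def simplify(node: Node) -> Node:
    """Fold Everything through the tree: it absorbs Or and vanishes from And."""
    if isinstance(node, (Everything, Term)):
        return node
    args = [simplify(arg) for arg in node.args]
    if isinstance(node, Or):
        # One unknown alternative makes the whole "or" unknown.
        if any(isinstance(arg, Everything) for arg in args):
            return Everything()
        return args[0] if len(args) == 1 else Or(tuple(args))
    # In an "and", unknown operands add no restriction and can be dropped.
    args = [arg for arg in args if not isinstance(arg, Everything)]
    if not args:
        return Everything()
    return args[0] if len(args) == 1 else And(tuple(args))


U = Everything()  # uint16(0), filesize and pe.imphash() are treated as unsupported
strings = (Or((Term("$x1"), Term("$x2"))), Term("6 of them"))

# The condition as written: the imphash check is one of the "or" alternatives.
apt17 = And((U, U, Or((U,) + strings)))
# The same condition with the imphash alternative removed.
apt17_without_imphash = And((U, U, Or(strings)))

print(simplify(apt17))
print(simplify(apt17_without_imphash))
```

Under these assumptions the first expression collapses to Everything, while the second keeps the string conditions, so the index could still be used to pick candidate files.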

So I think we should start by warning the user when the query they're about to execute is so bad that ursadb can't narrow it down at all.
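A rough sketch of what such a pre-flight check could look like, covering both options from the original request (warn and ask for confirmation, or block outright). Everything here is hypothetical: Policy, QueryRejected and preflight are illustrative names, not part of mquery, and the "degenerate" flag is assumed to come from whatever analysis decides that the index can't help.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    WARN = "warn"    # show a warning and ask the user to confirm
    BLOCK = "block"  # refuse to run the query at all


class QueryRejected(Exception):
    """Raised when the policy forbids running a degenerate query."""


WARNING = (
    "This rule cannot be narrowed down by the index, "
    "so YARA will run on every file in the dataset."
)


def preflight(degenerate: bool, policy: Policy, confirmed: bool = False) -> Optional[str]:
    """Return a message to show to the user, or None if the query may run."""
    if not degenerate:
        return None
    if policy is Policy.BLOCK:
        raise QueryRejected(WARNING)
    if confirmed:
        return None
    return WARNING + " Are you sure you want to proceed?"
```

With Policy.WARN the caller would show the returned message and call preflight again with confirmed=True once the user agrees; with Policy.BLOCK the query is never started.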

Or do you think we should block more classes of bad YARA rules? If so, please give an example if you have one.