Neo23x0 / yarGen

yarGen is a generator for YARA rules

Overall rules for Malware Families? #47

Open Babyhamsta opened 10 months ago

Babyhamsta commented 10 months ago

I noticed that when I scanned 14 EXEs, all from the same malware family, yarGen output a separate rule for each EXE, and the detections had little in common. Is there a way to create one overall rule from the opcodes/strings shared across a large set of EXEs, so that the whole malware family is detected instead of having a specific rule for each EXE?

This may just be my ignorance of how to use the script, but it seems inefficient not to produce a single rule for the malware family. Both of these EXEs are part of the Raccoon Stealer family.

Example from two of the generated rules:

rule sig_1f9bd27fd7591a98afd67499ae6730eb56c137335d283892bc06b7ab2241ed6c {
   meta:
      description = "malware - file 1f9bd27fd7591a98afd67499ae6730eb56c137335d283892bc06b7ab2241ed6c.exe"
      author = "Babyhamsta"
      reference = "https://github.com/Neo23x0/yarGen"
      date = "2023-12-11"
      hash1 = "1f9bd27fd7591a98afd67499ae6730eb56c137335d283892bc06b7ab2241ed6c"
   strings:
      $s1 = "HGDI32.dll" fullword ascii
      $s2 = "ACDSeeQVUltimate15.exe.dll" fullword wide
      $s3 = "         <requestedExecutionLevel level='asInvoker' uiAccess='false'/>" fullword ascii
      $s4 = "nhjnjK:\"V" fullword ascii
      $s5 = "\\QHdll_." fullword ascii
      $s6 = "* }5Q^" fullword ascii
      $s7 = "* _m3`Q" fullword ascii
      $s8 = "TaCe+ l" fullword ascii
      $s9 = "wqC.Qot" fullword ascii
      $s10 = "1Wd.BEy" fullword ascii
      $s11 = "F:\"0V1" fullword ascii
      $s12 = "7Xpq.ssc" fullword ascii
      $s13 = "6aF.OCB}cR" fullword ascii
      $s14 = "rct3s:\\" fullword ascii
      $s15 = "/L:\"l?'Y" fullword ascii
      $s16 = " R:\"*b" fullword ascii
      $s17 = "zo:\"jhhs`" fullword ascii
      $s18 = "]SE:\\e" fullword ascii
      $s19 = "AAT:\\," fullword ascii
      $s20 = "D:\\ X{" fullword ascii

      $op0 = { 8b 34 b5 20 4e f6 00 66 3b fb f8 66 85 ed c1 e8 }
      $op1 = { 66 c1 e2 f5 c1 ca 02 8d b6 ff ff ff ff f7 c7 d7 }
      $op2 = { 89 01 8d bf fc ff ff ff 66 8b d7 c0 da 5a 8b 17 }
   condition:
      uint16(0) == 0x5a4d and
      ( 8 of them and all of ($op*) )
}

rule sig_6b7bb7ed7e486cdc4e3a1d67a598aeee5a74e3c58f94e48e5fa626d6562f8688 {
   meta:
      description = "malware - file 6b7bb7ed7e486cdc4e3a1d67a598aeee5a74e3c58f94e48e5fa626d6562f8688.exe"
      author = "Babyhamsta"
      reference = "https://github.com/Neo23x0/yarGen"
      date = "2023-12-11"
      hash1 = "6b7bb7ed7e486cdc4e3a1d67a598aeee5a74e3c58f94e48e5fa626d6562f8688"
   strings:
      $s1 = "BladesGray.exe" fullword wide
      $s2 = "gunshot.exe" fullword wide
      $s3 = "Kozatipepikici laci. Canogoz pupaho. Jil xofiroj xokur xisidukuy. Sesecigo bipaxuh nuvu. Roladig. Gayoyir mil. Daxafoxa mik. Lez" ascii
      $s4 = "22222222222222222222222226" ascii /* hex encoded string '""""""""""""&' */
      $s5 = "222222222222222222222C" ascii /* hex encoded string '"""""""""",' */
      $s6 = "2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222" ascii /* hex encoded string '""""""""""""""""""""""""""""""""""""""""""""' */
      $s7 = "22222222222222222222222222222222222222" ascii /* hex encoded string '"""""""""""""""""""' */
      $s8 = "222222222C" ascii /* hex encoded string '"""",' */
      $s9 = "222222222222222222222222222222222222222222" ascii /* hex encoded string '"""""""""""""""""""""' */
      $s10 = "22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222" ascii /* hex encoded string '""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""&' */
      $s11 = "2222222222222222222222222222222222222222222222" ascii /* hex encoded string '"""""""""""""""""""""""' */
      $s12 = "62222222222222222222222222" ascii /* hex encoded string 'b""""""""""""' */
      $s13 = "222222222222222222222222222222222222" ascii /* hex encoded string '""""""""""""""""""' */
      $s14 = "22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222" ascii /* hex encoded string '""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""' */
      $s15 = "22222222222222222222222222222222222222222222222222222222222222222222222226" ascii /* hex encoded string '""""""""""""""""""""""""""""""""""""&' */
      $s16 = "6222222222222222222222" ascii /* hex encoded string 'b""""""""""' */
      $s17 = "4.1.61.50" fullword wide /* hex encoded string 'AaP' */
      $s18 = "ijeye. Minucivuxiyupor kafihizeyokocu piyit. Temezoto zebenuxokeyosop. Patikinehunalo hej fexu. Piv zegosob deti jovisodidicoyam" ascii
      $s19 = "zeposureyutazajimevusayunu xetoneya beyeziyumosalasagasilumepiz cevikotevokoyufajozebupoge" fullword wide
      $s20 = "FileDescriptions" fullword wide

      $op0 = { 83 ff ff ff db ff ff ff 35 }
      $op1 = { eb 04 83 65 e0 00 8b 45 e0 e8 96 34 00 00 c3 83 }
      $op2 = { 33 c0 8b 4d fc 5f 5e 33 cd 5b e8 34 3c 00 00 c9 }
   condition:
      uint16(0) == 0x5a4d and
      ( 8 of them and all of ($op*) )
}

ruppde commented 10 months ago

You might want to experiment with -w superrule-overlap

Babyhamsta commented 10 months ago

> You might want to experiment with -w superrule-overlap

I tried with -w 1; I just thought it was odd that they didn't group more. If I checked by hand with IDA and some plugins, I bet I could find matching opcodes/strings across all of them.

I did notice that it generated a super rule, and it worked okay, but it wasn't very consistent on new or varied samples. It may be that I didn't have enough data, but I used 35 samples.
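
The manual check described above (looking for strings that recur across all samples) can be approximated with a short script. This is a minimal sketch and not how yarGen itself selects strings: the function names, the 6-byte minimum length, and the 80% threshold are all assumptions chosen for illustration, and it only looks at printable ASCII runs, not opcodes or wide strings.

```python
import math
import re
from collections import Counter


def extract_strings(data: bytes, min_len: int = 6) -> set[bytes]:
    """Return the unique printable-ASCII runs of at least `min_len` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return set(pattern.findall(data))


def shared_strings(
    samples: dict[str, bytes],
    min_fraction: float = 0.8,  # assumed threshold, tune per sample set
    min_len: int = 6,
) -> dict[bytes, int]:
    """Map each string found in at least `min_fraction` of samples to its sample count."""
    counts: Counter = Counter()
    for data in samples.values():
        # Each sample contributes a string at most once, because a set is used.
        counts.update(extract_strings(data, min_len))
    needed = max(1, math.ceil(min_fraction * len(samples)))
    return {s: n for s, n in counts.items() if n >= needed}
```

To try it on real files, `samples` could be built from a directory of your choosing, for example `{p.name: p.read_bytes() for p in pathlib.Path("samples").glob("*.exe")}`, where `samples` is a hypothetical folder name. If the result is empty or contains only generic strings such as manifest fragments, that suggests the set is too heterogeneous (or packed) for a single string-based family rule.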

ruppde commented 10 months ago

Did you check whether some of the samples don't fit the group because they're a totally different version, architecture, ...?

Maybe try with 10 that are about the same size or share the same imphash?
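
One way to pick such a subset is to cluster the samples by file size before generating rules. The sketch below is only an illustration: the function name and the 10% tolerance are assumptions, and it groups by size alone. Grouping by imphash would follow the same pattern but needs a PE parser, for instance the third-party pefile library's `PE.get_imphash()`.

```python
def group_by_size(sizes: dict[str, int], tolerance: float = 0.10) -> list[list[str]]:
    """Group sample names whose size is within `tolerance` of the group's smallest member."""
    groups: list[tuple[int, list[str]]] = []
    # Walk the samples from smallest to largest so each group is contiguous in size.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1]):
        if groups and size <= groups[-1][0] * (1 + tolerance):
            groups[-1][1].append(name)
        else:
            # Start a new group anchored at this sample's size.
            groups.append((size, [name]))
    return [names for _, names in groups]
```

The sizes can be collected with `os.path.getsize` for each sample. Running the generator on one group at a time should show whether the weak super rule comes from mixing unrelated variants.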

Babyhamsta commented 10 months ago

> Did you check whether some of the samples don't fit the group because they're a totally different version, architecture, ...?

> Maybe try with 10 that are about the same size or share the same imphash?

I guess they could be packed differently. I know some were C# while others were single-run C++ programs.
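
A quick way to test the packing guess is to measure byte entropy per sample, since packed or encrypted files tend to score close to the maximum of 8 bits per byte. This is a rough heuristic sketch, not a packer detector: the function name is illustrative, and any cutoff you apply (values above roughly 7 are often treated as suspicious) is an assumption to validate against your own samples.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed byte values.
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(data).values()
    )
```

Samples with very high entropy are likely to expose only packer stubs and random-looking fragments, which would explain why the generated strings differ so much from file to file.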