VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

Improve Tlsh library in Yara #1962

Open dmknght opened 11 months ago

dmknght commented 11 months ago

Is your feature request related to a problem? Please describe.

The current TLSH code in Yara is TLSH-C, a port by an Avast developer. This version is missing two functions:

  1. Calculate the diff score of two TLSH hashes (https://github.com/avast/tlshc/issues/1)
  2. Load the hash value from a string (https://github.com/avast/tlshc/issues/2)

As a result, the TLSH version that Yara uses can only do exact hash matching. IMO TLSH is a fuzzy hash algorithm, so it should support diff scoring.

Describe the solution you'd like

In https://github.com/VirusTotal/yara/blob/master/libyara/tlshc/tlsh.c, Yara could add:


int tlsh_total_diff(Tlsh* tlsh, Tlsh *other, bool len_diff)
{
    return tlsh_impl_total_diff(tlsh->impl, other->impl, len_diff);
}

int tlsh_from_tlsh_str(Tlsh* tlsh, const char *str)
{
    return tlsh_impl_from_tlsh_str(tlsh->impl, str);
}
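
For context on what loading a hash from a string involves, here is a minimal self-contained sketch. It is not the tlshc implementation: `sketch_parse_tlsh_hex`, `sketch_hex_value` and `SKETCH_TLSH_BYTES` are hypothetical names, and the code only assumes the standard digest form (an optional `T1` prefix followed by 70 hex characters, i.e. 35 bytes). It decodes the hex as written and does not try to reproduce how the real library lays out its fields internally.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_TLSH_BYTES 35 /* assumed size of a standard TLSH digest */

/* Convert one hex character to its value, or -1 if it is not hex. */
static int sketch_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode a TLSH digest string into raw bytes.
 * Returns 0 on success, 1 on malformed input (this sketch's own convention). */
int sketch_parse_tlsh_hex(const char *str, unsigned char out[SKETCH_TLSH_BYTES])
{
    if (str == NULL) return 1;

    /* Skip the optional "T1" version prefix. */
    if (strncmp(str, "T1", 2) == 0) str += 2;

    /* The remainder must be exactly two hex characters per byte. */
    if (strlen(str) != 2 * SKETCH_TLSH_BYTES) return 1;

    for (size_t i = 0; i < SKETCH_TLSH_BYTES; i++) {
        int hi = sketch_hex_value(str[2 * i]);
        int lo = sketch_hex_value(str[2 * i + 1]);
        if (hi < 0 || lo < 0) return 1;
        out[i] = (unsigned char) ((hi << 4) | lo);
    }
    return 0;
}
```

Rejecting malformed input up front matters here because the string would come straight from a rule author.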

Then add a function to the ELF module to compare hashes (and maybe another function that compares with len_diff enabled):

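bool yr_elf_tlsh_cmp(const char *hash, int user_score)
{
    Tlsh *tlsh = tlsh_new();
    tlsh_from_tlsh_str(tlsh, hash);
    // Code block to get hash from metadata here
    // TLSH distance: 0 means identical, larger means less similar,
    // so the hashes match when the distance is within user_score.
    bool matched = tlsh_total_diff(tlsh, <elf_tlsh_struct>, false) <= user_score;
    tlsh_free(tlsh);
    return matched;
}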
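
The threshold logic can be tried without tlshc. In this sketch, `sketch_distance` is a deliberately simple stand-in (the number of differing characters between two equal-length strings) and is NOT TLSH's scoring; `sketch_is_match` is likewise a hypothetical name. The only assumption carried over from TLSH is that a distance of 0 means identical and larger values mean less similar, so a match is a distance at or below the caller's threshold.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in distance, NOT the TLSH score: number of positions at which
 * two equal-length strings differ, or -1 if the lengths differ. */
static int sketch_distance(const char *a, const char *b)
{
    size_t len = strlen(a);
    if (len != strlen(b)) return -1;

    int diff = 0;
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) diff++;
    }
    return diff;
}

/* Hypothetical matcher: true when the two strings are within max_distance. */
bool sketch_is_match(const char *a, const char *b, int max_distance)
{
    int d = sketch_distance(a, b);
    return d >= 0 && d <= max_distance;
}
```

In an actual module, the stand-in would be replaced by the proposed tlsh_total_diff, with the threshold supplied by the rule author.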