VirusTotal / vt-py

The official Python 3 client library for VirusTotal
https://virustotal.github.io/vt-py/
Apache License 2.0

Extracting Family Labels From Binaries #163

Open lkurlandski opened 11 months ago

lkurlandski commented 11 months ago

Hi,

I would like to use the VT API to label malware binaries by their family or threat category. When I upload a binary (eba1c664d265f2bbe4b4dfb466fad46c36727c63ad2ef25bc5942cd0235a0c63) to VT's web interface (you can access the same web view as me here), I get the following information:

Popular threat label: trojan.gafgyt/ddos

Threat categories: trojan

Family labels: gafgyt ddos lightaidra

I would really like to get this same information from the API, but I am having trouble with it. Using vt-py, I can write a script like this:

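from pprint import pprint

from vt import Client

KEY = "{MY_API_KEY}"
FILE = "VirusShare_00000bd55c61bc0e490d8410db131303.elf"

# Use the client as a context manager so its HTTP session is closed on exit.
with Client(KEY) as client:
    with open(FILE, "rb") as fp:
        # Upload the file and block until the analysis has finished.
        analysis = client.scan_file(fp, wait_for_completion=True)

# Map each vendor name to its detection label (None if nothing was detected).
results = {vendor: report["result"] for vendor, report in analysis.results.items()}
pprint(results)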

This gives me a fairly informative report for each vendor.

Output:

{'ALYac': 'Trojan.Generic.32499050',
 'APEX': None,
 'AVG': 'ELF:Mirai-ANY [Trj]',
 'Acronis': None,
 'AhnLab-V3': 'Linux/Mirai.Gen3',
 'Alibaba': None,
 'Antiy-AVL': 'Trojan[Backdoor]/Linux.Mirai.ba',
 'Arcabit': 'Trojan.Generic.D1EFE56A',
 'Avast': 'ELF:Mirai-ANY [Trj]',
 'Avast-Mobile': 'ELF:Mirai-ANY [Trj]',
 'Avira': 'LINUX/Mirai.udhim',
 'Baidu': None,
 'BitDefender': 'Trojan.Generic.32499050',
 'BitDefenderFalx': None,
 'BitDefenderTheta': 'Gen:NN.Mirai.36722',
 'Bkav': None,
 'CAT-QuickHeal': None,
 'CMC': None,
 'ClamAV': 'Unix.Malware.Agent-6831890-0',
 'CrowdStrike': None,
 'Cybereason': None,
 'Cylance': None,
 'Cynet': 'Malicious (score: 99)',
 'Cyren': 'E32/Mirai.U.gen!Camelot',
 'DeepInstinct': None,
 'DrWeb': 'Linux.Siggen.9999',
 'ESET-NOD32': 'a variant of Linux/Mirai.AT',
 'Elastic': None,
 'Emsisoft': 'Trojan.Generic.32499050 (B)',
 'F-Secure': 'Malware.LINUX/Mirai.udhim',
 'FireEye': 'Trojan.Generic.32499050',
 'Fortinet': 'ELF/Mirai.AT!tr',
 'GData': 'Linux.Trojan.Mirai.J',
 'Google': 'Detected',
 'Gridinsoft': None,
 'Ikarus': 'Trojan.Linux.Mirai',
 'Jiangmin': 'Backdoor.Linux.asxm',
 'K7AntiVirus': None,
 'K7GW': None,
 'Kaspersky': 'HEUR:Backdoor.Linux.Mirai.ba',
 'Lionic': 'Trojan.Linux.Mirai.K!c',
 'MAX': 'malware (ai score=99)',
 'Malwarebytes': None,
 'MaxSecure': None,
 'McAfee': 'Linux/mirai.d',
 'McAfee-GW-Edition': 'Linux/mirai.d',
 'MicroWorld-eScan': 'Trojan.Generic.32499050',
 'Microsoft': 'Trojan:Linux/Mirai',
 'NANO-Antivirus': 'Trojan.ElfArm32.Mirai.fmibkd',
 'Paloalto': None,
 'Panda': None,
 'Rising': 'Backdoor.Mirai/Linux!1.BC48 (CLASSIC)',
 'SUPERAntiSpyware': None,
 'Sangfor': 'Suspicious.Linux.Save.a',
 'SentinelOne': None,
 'Sophos': 'Mal/Generic-S',
 'Symantec': 'Linux.Mirai',
 'SymantecMobileInsight': None,
 'TACHYON': None,
 'Tencent': None,
 'Trapmine': None,
 'TrendMicro': 'Trojan.Linux.MIRAI.SMNM1',
 'TrendMicro-HouseCall': 'Trojan.Linux.MIRAI.SMNM1',
 'Trustlook': None,
 'VBA32': None,
 'VIPRE': 'Trojan.Generic.32499050',
 'ViRobot': None,
 'VirIT': 'Linux.Mirai.CG',
 'Webroot': None,
 'Xcitium': 'Malware@#1ojx4ch0uqrwj',
 'Yandex': None,
 'Zillya': 'Backdoor.Mirai.Linux.48138',
 'ZoneAlarm': 'HEUR:Backdoor.Linux.Mirai.ba',
 'Zoner': None,
 'tehtris': None}

The problem is that I don't know the rules VT uses to condense this detailed report into those three family labels. If I had a comprehensive list of all the families VT assigns malware to, I could just search the vendor labels for matching substrings, but I can't find such a list anywhere. Ideally, I would be able to retrieve the same information shown in the web interface directly from the API.
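To make the substring idea concrete, here is a rough sketch of what I could do without such a list: split each vendor label into tokens, drop generic words, and let the vendors vote. This is only a guess at a workable heuristic, not VT's actual rules. The `GENERIC_TOKENS` stop list, the four-letter minimum, and the names `tokenize` and `vote_family` are all my own assumptions.

```python
import re
from collections import Counter

# Hypothetical stop list: words that describe a category or platform rather
# than a family. It is hand-picked and incomplete, not an official VT list.
GENERIC_TOKENS = {
    "trojan", "generic", "linux", "unix", "backdoor", "malware", "agent",
    "heur", "variant", "malicious", "score", "suspicious", "detected",
}


def tokenize(label):
    """Split a vendor label into lowercase alphabetic tokens worth counting."""
    tokens = re.split(r"[^a-z]+", label.lower())
    # Assumption: tokens under 4 letters ("elf", "gen", "ba") are noise.
    return [t for t in tokens if len(t) >= 4 and t not in GENERIC_TOKENS]


def vote_family(results, top_n=3):
    """Return the top_n most common tokens, counting one vote per vendor."""
    counts = Counter()
    for label in results.values():
        if label is None:  # this vendor did not flag the file
            continue
        counts.update(set(tokenize(label)))  # set() avoids double votes
    return counts.most_common(top_n)


# A few entries copied from the report above.
sample = {
    "ALYac": "Trojan.Generic.32499050",
    "APEX": None,
    "AVG": "ELF:Mirai-ANY [Trj]",
    "ClamAV": "Unix.Malware.Agent-6831890-0",
    "Kaspersky": "HEUR:Backdoor.Linux.Mirai.ba",
    "Microsoft": "Trojan:Linux/Mirai",
}
print(vote_family(sample))  # prints [('mirai', 3)]
```

On the full report this still lets vendor-specific noise through (for example `elfarm` or `udhim`), so it would need a minimum vote threshold or a better stop list, which is why I would rather use whatever VT itself computes.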

Any help is greatly appreciated!