elastic / ember

Elastic Malware Benchmark for Empowering Researchers

How can I use the features.py script to extract features from a malware sample I already have, the same way ember does? #111

Closed (leenaut closed this 1 year ago)

leenaut commented 1 year ago

Hello

I'm trying to extract features from malware samples I already have using ember (features.py). Please let me know if my approach is correct. My main concern is that I'm not sure how I should pass the malware sample to the functions provided by ember.

import ember
extractor = ember.PEFeatureExtractor(2)
extractor.feature_vector("/Downloads/samples/0d9e5116c1da200fa3a55c84ca2195eb7bbbd1e1")

After doing this, I get the following error:

        543 except Exception:  # everything else (KeyboardInterrupt, SystemExit, ValueError):
        544     raise
    --> 546 features = {"sha256": hashlib.sha256(bytez).hexdigest()}
        547 features.update({fe.name: fe.raw_features(bytez, lief_binary) for fe in self.features})
        548 return features

    TypeError: Strings must be encoded before hashing

Could someone explain how to use the code to extract features from malware files I have, and what exactly to pass in and in what format?

Thank you
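
A note on the likely cause: the traceback shows `hashlib.sha256(bytez)` being called on whatever was passed to `feature_vector`, and the `TypeError` suggests that value was the path string rather than the file's contents. A minimal sketch of one way around this, assuming `feature_vector` expects the raw bytes of the PE file, is to read the sample in binary mode first. The helper names `read_sample_bytes`, `sha256_of_sample`, and `extract_feature_vector` are hypothetical and not part of ember.

```python
import hashlib
from pathlib import Path


def read_sample_bytes(path):
    """Return the raw contents of the file at `path` as a bytes object."""
    return Path(path).read_bytes()


def sha256_of_sample(path):
    """Hash the file's contents (not the path string), mirroring the traceback line."""
    return hashlib.sha256(read_sample_bytes(path)).hexdigest()


def extract_feature_vector(path, feature_version=2):
    """Hypothetical wrapper: hand ember the file bytes instead of the path.

    Assumes ember is installed and that PEFeatureExtractor.feature_vector
    takes the sample's raw bytes.
    """
    import ember  # imported lazily so the helpers above work without ember

    extractor = ember.PEFeatureExtractor(feature_version)
    return extractor.feature_vector(read_sample_bytes(path))
```

With ember installed, the call would then look like `extract_feature_vector("/Downloads/samples/0d9e5116c1da200fa3a55c84ca2195eb7bbbd1e1")`, using the same path as in the snippet above.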