Closed imidoriya closed 1 year ago
Hi there, We are planning to release a MISP taxonomy and/or galaxy soon. In the meantime these are what we believe are the most related existing taxonomies:
CLASS --> ms-caro-malware:malware-type, malware-category:malware-category
FILE --> ms-caro-malware:malware-platform
FAM --> ms-caro-malware:malware-family, mwdb:family
BEH --> maec-malware-behavior:maec-malware-behavior
If this addresses your question, please close the issue
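As a rough illustration of the mapping above, here is a minimal Python sketch that turns an AVClass2 category and value into candidate MISP machine tags. The function name and the `namespace:predicate="value"` formatting are assumptions for illustration; each of these taxonomies has its own fixed vocabulary, so a generated value may not actually exist in it and should be checked before tagging.

```python
# Mapping copied from the list above: AVClass2 category -> "namespace:predicate".
CATEGORY_TO_TAXONOMIES = {
    "CLASS": ["ms-caro-malware:malware-type", "malware-category:malware-category"],
    "FILE": ["ms-caro-malware:malware-platform"],
    "FAM": ["ms-caro-malware:malware-family", "mwdb:family"],
    "BEH": ["maec-malware-behavior:maec-malware-behavior"],
}


def candidate_misp_tags(category, value):
    """Return candidate machine tags for one AVClass2 (category, value) pair.

    Hypothetical helper: it only formats strings and does not verify that
    the value is part of the target taxonomy's vocabulary.
    """
    prefixes = CATEGORY_TO_TAXONOMIES.get(category, [])
    return [f'{prefix}="{value}"' for prefix in prefixes]


if __name__ == "__main__":
    for tag in candidate_misp_tags("FAM", "emotet"):
        print(tag)
```

Unknown categories (for example UNK tokens) simply yield an empty list, so they can be skipped or handled separately.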
Excellent, thanks. What do you expect the timeframe to be on the MISP taxonomy release? It might make sense for me to just wait, rather than trying to convert them later.
Wanted to follow up on my question above about the timeframe. Thanks
Hard to say as usual, but we hope to have a first version in three weeks
I am reopening this issue as we have just committed a first version of the AVClass2 MISP galaxy in the repo: avclass2/data/misp. Pretty close to the three-week timeframe :) It is a work in progress. Please take a look and provide feedback and suggestions on what would make it more useful. Once we have a stable version we could commit it to the misp-project repository.
Thanks, I'll take a look.
Didn't have any issues. I'm still learning AV tagging, so I don't have feedback yet except to say that it looks good.
I'm seeing some that I may need to add to the list or synonyms. Not sure how to do that.. but I see tags like
I also have tags, misp-galaxy:avclass="bladabi" for example, that are not in the galaxy. In that case, it's a synonym, but others may be new. I'll need to understand how I can improve your galaxy based on my data. Thx
Here is an example of an update I tried: https://github.com/imidoriya/avclass/commit/18115f9db8ef33fd739e160aed348476ab8527b7 Is this all I need to do, or is there more in avclass that I should be updating? One note: Crysis, for example, is ransomware, but I wasn't sure if I am supposed to specify that anywhere, such as by adding related ... type=variant-of. Is it OK for a Family to be a variant-of a Behavior or Class? I'm still parsing through some of my stuff, but I currently have 346 tags that are not listed in the galaxy cluster.
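One way to find tags that are not listed in the galaxy cluster is to compare them against the cluster's values and synonyms. The sketch below assumes the cluster is a MISP galaxy cluster JSON loaded into a dict, with a `values` list whose entries carry `value` and an optional `meta.synonyms`, and that tags look like `misp-galaxy:avclass="..."`. The function name and the toy data are hypothetical and do not reflect the real cluster contents.

```python
import re

# Matches tags of the form misp-galaxy:avclass="<value>" (assumed tag format).
TAG_RE = re.compile(r'^misp-galaxy:avclass="(?P<value>[^"]+)"$')


def unlisted_tags(tags, cluster):
    """Return the tags whose value is neither a cluster value nor a synonym."""
    known = set()
    for entry in cluster.get("values", []):
        known.add(entry["value"].lower())
        for synonym in entry.get("meta", {}).get("synonyms", []):
            known.add(synonym.lower())

    missing = []
    for tag in tags:
        match = TAG_RE.match(tag)
        if match and match.group("value").lower() not in known:
            missing.append(tag)
    return missing


if __name__ == "__main__":
    # Toy cluster for illustration only.
    toy_cluster = {"values": [{"value": "bladabindi", "meta": {"synonyms": ["njrat"]}}]}
    toy_tags = [
        'misp-galaxy:avclass="bladabindi"',
        'misp-galaxy:avclass="bladabi"',
        'misp-galaxy:avclass="njrat"',
    ]
    print(unlisted_tags(toy_tags, toy_cluster))
```

Tags that do not match the assumed format are ignored rather than reported, which keeps unrelated MISP tags out of the result.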
Update: As I understand the structure more, I'm updating the taxonomy, alias, and the misp, along with the addition of related entities. https://github.com/malicialab/avclass/compare/master...imidoriya:master
@jeffg2k that question likely belongs in a separate thread. Here is a quick response, and if you want to discuss it further, just open another issue. AvClass2 internally applies a threshold: it only shows tags that appear in the labels of at least two AV engines, taking into account that some groups of AV engines are known to collaborate on or copy each other's labels. In addition, each tag has an associated counter that captures the number of AV labels (after removing duplicate engines) that include the tag's concept. Thus, applying a threshold as you mention should be trivial, although one could also store both the tag and its counter, which would allow a threshold to be applied later on. I guess the question here is what kind of false indicators you may be seeing. As said, feel free to open another issue if you want to follow up.
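Applying such a threshold after the fact could look like the sketch below. It assumes the tags of one sample are available as a comma-separated string of `tag|count` pairs; that format, the function names, and the example string are assumptions for illustration, so adapt the parsing to whatever your AVClass2 output actually looks like.

```python
def parse_tags(field):
    """Parse 'tag|count,tag|count' into a list of (tag, count) tuples."""
    pairs = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last '|' so tags containing ':' or '|' prefixes stay intact.
        tag, _, count = item.rpartition("|")
        pairs.append((tag, int(count)))
    return pairs


def filter_by_count(pairs, min_count):
    """Keep only the tags supported by at least min_count AV labels."""
    return [(tag, count) for tag, count in pairs if count >= min_count]


if __name__ == "__main__":
    # Hypothetical tag string for a single sample.
    pairs = parse_tags("grayware|10,adware|9,downloader|3,zlob|2")
    print(filter_by_count(pairs, 3))
```

Storing the parsed `(tag, count)` pairs rather than only the filtered tags keeps the option of choosing a different threshold later.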
@imidoriya, we typically differentiate between tags in the taxonomy and tokens (i.e., tags not in the taxonomy, marked with the UNK category if you use the -p option). Both can appear in the AVClass2 output. Tags are included in the MISP taxonomy, but tokens are not because they have not been classified yet. If you want us to include a new alias you can open a new issue. Of those that you mention, I see "hpbladabi", which is likely an alias for "bladabindi" (0.83 ratio; we use 0.94 as the default threshold), but I do not see "bladabi" in our data. For "crysis" and "crusis" I see a lower ratio of 0.63, which is why we have not added a tagging rule yet. Both get tagged as ransomware, so they could indeed be aliases, although I would wait a bit longer before adding that one.
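To make the ratio idea concrete, here is a minimal sketch of a co-occurrence ratio between two tokens. It assumes the ratio is the number of samples where both tokens appear divided by the number of samples containing the rarer token; that definition, the function name, and the toy data are assumptions for illustration and not necessarily what the -aliasdetect option computes.

```python
from collections import Counter
from itertools import combinations


def alias_candidates(samples, min_ratio=0.94):
    """Return (alias, target, ratio) tuples for token pairs above min_ratio.

    samples is an iterable of token lists, one list per malware sample.
    The rarer token of each pair is treated as the candidate alias.
    """
    token_count = Counter()
    pair_count = Counter()
    for tokens in samples:
        unique = sorted(set(tokens))
        token_count.update(unique)
        pair_count.update(combinations(unique, 2))

    results = []
    for (first, second), both in pair_count.items():
        if token_count[first] <= token_count[second]:
            alias, target = first, second
        else:
            alias, target = second, first
        ratio = both / token_count[alias]
        if ratio >= min_ratio:
            results.append((alias, target, ratio))
    return sorted(results)


if __name__ == "__main__":
    # Toy data, not real measurements.
    toy_samples = [
        ["bladabi", "bladabindi"],
        ["bladabi", "bladabindi"],
        ["bladabindi"],
        ["crusis", "crysis"],
        ["crusis"],
        ["crysis"],
        ["crysis"],
    ]
    print(alias_candidates(toy_samples))
```

In the toy data the second pair co-occurs in only half of the samples containing the rarer token, so it stays below the threshold and is not reported.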
@imidoriya, I checked your updates and the only ones that convince me (based on our data) are these aliases:
crusis -> crysis
dharma -> crysis
I just added them to the taxonomy and tagging. BTW, if you can share your .alias file (generated with the -aliasdetect option, and which only contains cumulative counts of tokens), it makes it much easier for us to check for new aliases.
I am not familiar with MISP object relationships such as variant-of. We'll take a look and if someone else has feedback on how they use them, please speak up.
Gotcha, well I'm currently going through the UNK where I have hundreds of items tagged and trying to put them into the taxonomy. I'll create a PR and I guess you can let me know if they're ok or not, hopefully so as it takes a bit to research them and make the entries. lol I'm trying to also include ref links. I probably should be using your updater, but I'm just not sure how to use it against my data, which is already in MISP.
Gives you things like this... in this case, I actually need to fix it as it would probably be better stated as "subtechnique-of" for the remoteadmin.
Capturing parent-child relationships from the AVClass taxonomy in the MISP cluster file makes a lot of sense. Between "variant-of" and "subtechnique-of" I kind of like "variant-of" better though, at least for CLASS entries.
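For reference, a parent-child link in a MISP galaxy cluster entry can be expressed through the entry's `related` list, where each item has a `dest-uuid` and a `type`. The sketch below builds such an entry in Python. The helper name, the UUID namespace, and the deterministic UUID scheme are hypothetical choices for illustration and are not how the AVClass2 cluster file is actually generated.

```python
import json
import uuid

# Hypothetical namespace so that the same value always maps to the same UUID.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "avclass.example")


def make_entry(value, parent=None, rel_type="variant-of", synonyms=()):
    """Build one galaxy cluster entry, optionally linked to a parent entry."""
    entry = {"value": value, "uuid": str(uuid.uuid5(NAMESPACE, value))}
    if synonyms:
        entry["meta"] = {"synonyms": list(synonyms)}
    if parent is not None:
        entry["related"] = [
            {"dest-uuid": str(uuid.uuid5(NAMESPACE, parent)), "type": rel_type}
        ]
    return entry


if __name__ == "__main__":
    ransomware = make_entry("ransomware")
    # crysis is linked to its class, with the aliases discussed in this thread.
    crysis = make_entry("crysis", parent="ransomware", synonyms=["crusis", "dharma"])
    print(json.dumps({"values": [ransomware, crysis]}, indent=2))
```

Switching `rel_type` to "subtechnique-of" is a one-argument change, so the choice of relationship name can be revisited without restructuring the entries.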
For the alias, in the cases below, I was just using the alias defined via https://malpedia.caad.fkie.fraunhofer.de/. In the other cases, I did see the alias in my data. With regard to alias file, what would be the best way to create that as I have a lot of files and each one is scanned independently?
phobos dharma
arena dharma
wadhrama dharma
ncov dharma
geodo emotet
heodo emotet
For these, I identified the alias tags in my data and looked them up on Malpedia, which led to the main classification. Also, it was easy to see that when bladabi or njrat was tagged, most of the time bladabindi was also tagged. So it was easy to see they were related and alternative names for the same thing, confirmed by Malpedia.
bladabi bladabindi
njrat bladabindi
wacatac deathransom
fuerboos goodor
If you have many files for individual samples, you can put them in the same directory and use the -vtdir option.
Regarding Malpedia, we are currently analyzing the entries and aliases that they have and AVClass does not, but it may still take us some time to finish the analysis.
I have several hundred thousand files and they're compressed and password protected. What I'm doing now is pulling an event from MISP, reading the AV results for uploads, and then feeding that to AvClass.
But since I have them tagged now, I can run avclass against any particular set of tags. So if we wanted to run all the files that were classified as deathransom or goodor, I can do that.
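If the AV results are already stored in MISP, one option is to export them into a single file of JSON records, one per line, instead of handling each sample separately. The sketch below assumes a simplified record layout with `md5`, `sha1`, `sha256`, and an `av_labels` list of `[engine, label]` pairs, which, as far as I understand the AVClass README, is what its simplified JSON input looks like; treat the layout, the function names, and the engine names as assumptions and check them against the README before relying on them.

```python
import json


def to_simplified_record(md5, sha1, sha256, av_results):
    """Build one record from a dict mapping AV engine name to its label.

    Engines that returned no label (None or empty string) are skipped.
    """
    av_labels = [
        [engine, label] for engine, label in sorted(av_results.items()) if label
    ]
    return {"md5": md5, "sha1": sha1, "sha256": sha256, "av_labels": av_labels}


def write_records(records, path):
    """Write the records to path as JSON lines, one record per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    # Placeholder hashes and hypothetical engine names and labels.
    record = to_simplified_record(
        "0" * 32,
        "0" * 40,
        "0" * 64,
        {"EngineA": "Trojan.Generic", "EngineB": None, "EngineC": "Ransom.Crysis"},
    )
    print(json.dumps(record))
```

Collecting every sample for a given tag into one such file would also make it easier to generate a single cumulative .alias file for that set.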
Not sure how best to clean this up, but it seems a bit excessive for MISP tagging: miner, mining, bitcoinminer, bitcoinmining. Might make sense to just list each once in the cluster with an alias. I would list the miner maybe as the BEH and the bitcoinminer as the CLASS, and have the class reference the behavior.
These are some of the remaining UNK tags that are not in the cluster, which I wasn't sure about. For example, I think indiloadz is a type of adware, but I'm not sure if it's an alias for adware, a sub-technique of adware, or a family of adware. The number to the right is my count, in descending order.
Tag | Count |
---|---|
misp-galaxy:avclass="ursu" | 521 | |
misp-galaxy:avclass="disfa" | 427 | |
misp-galaxy:avclass="crysan" | 414 | |
misp-galaxy:avclass="indiloadz" | 412 | |
misp-galaxy:avclass="atraps" | 398 | |
misp-galaxy:avclass="dodiw" | 290 | |
misp-galaxy:avclass="brmon" | 273 | |
misp-galaxy:avclass="gorgon" | 238 | |
misp-galaxy:avclass="tasker" | 201 | |
misp-galaxy:avclass="bulz" | 200 | |
misp-galaxy:avclass="generickdz" | 188 | |
misp-galaxy:avclass="veil" | 175 | |
misp-galaxy:avclass="msilperseus" | 167 | |
misp-galaxy:avclass="subti" | 159 | |
misp-galaxy:avclass="bplug" | 141 | |
misp-galaxy:avclass="llac" | 139 | |
misp-galaxy:avclass="rrat" | 138 | |
misp-galaxy:avclass="rescoms" | 131 | |
misp-galaxy:avclass="agensla" | 123 | |
misp-galaxy:avclass="nanobot" | 122 | |
misp-galaxy:avclass="hotkeychick" | 121 | |
misp-galaxy:avclass="zenpak" | 107 | |
misp-galaxy:avclass="gencbl" | 105 | |
misp-galaxy:avclass="noancooe" | 104 | |
misp-galaxy:avclass="avemaria" | 88 | |
misp-galaxy:avclass="agentb" | 84 | |
misp-galaxy:avclass="chapak" | 78 | |
misp-galaxy:avclass="injuke" | 71 | |
misp-galaxy:avclass="emeka" | 70 | |
misp-galaxy:avclass="fsysna" | 70 | |
misp-galaxy:avclass="cobalt" | 69 | |
misp-galaxy:avclass="vebzenpak" | 68 | |
misp-galaxy:avclass="johnnie" | 62 | |
misp-galaxy:avclass="alien" | 59 | |
misp-galaxy:avclass="redcap" | 57 | |
misp-galaxy:avclass="solmyr" | 57 | |
misp-galaxy:avclass="xaparo" | 57 | |
misp-galaxy:avclass="bsymem" | 57 | |
misp-galaxy:avclass="genericgba" | 56 | |
misp-galaxy:avclass="obfdldr" | 54 | |
misp-galaxy:avclass="leivion" | 54 | |
misp-galaxy:avclass="downeks" | 53 | |
misp-galaxy:avclass="midie" | 50 | |
misp-galaxy:avclass="coins" | 48 | |
misp-galaxy:avclass="ruco" | 43 | |
misp-galaxy:avclass="pavica" | 43 | |
misp-galaxy:avclass="liev" | 42 | |
misp-galaxy:avclass="noobyprotect" | 42 | |
misp-galaxy:avclass="vatet" | 42 | |
misp-galaxy:avclass="noon" | 40 | |
misp-galaxy:avclass="mansabo" | 40 | |
misp-galaxy:avclass="starter" | 38 | |
misp-galaxy:avclass="bobik" | 37 | |
misp-galaxy:avclass="scrop" | 36 | |
misp-galaxy:avclass="packer" | 36 | |
misp-galaxy:avclass="boxedapp" | 36 | |
misp-galaxy:avclass="zapchast" | 36 | |
misp-galaxy:avclass="hesv" | 35 | |
misp-galaxy:avclass="dothetuk" | 35 | |
misp-galaxy:avclass="cometer" | 35 | |
misp-galaxy:avclass="pyxie" | 35 | |
misp-galaxy:avclass="lampa" | 34 | |
misp-galaxy:avclass="injects" | 32 | |
misp-galaxy:avclass="coroxy" | 31 | |
misp-galaxy:avclass="predator" | 31 | |
misp-galaxy:avclass="jacksbot" | 30 |
Just wanted to say that I've been running this galaxy PR https://github.com/malicialab/avclass/pull/34 for the past month and it's working out pretty well. I still have quite a few that could use categorization as seen above, even if I were to add them as UNK. It would be helpful for how they are displayed in MISP as anything that is not defined is presented as a normal tag, even if you have the galaxy prefix. Thoughts?
We just committed a script misp.py to keep the MISP taxonomy updated. Only difference should be that it does not add the Malpedia URLs for now. The version has been bumped to match that of the tool, so that it is easy to know which version of the avclass taxonomy it comes from.
Since this issue has been around for quite a while, I am going to close it. Feel free to open another issue if you spot anything.
I'm looking to use avclass2 to classify malware in my MISP instance. After getting a result from avclass, I'd like to give it an appropriate tag. Just wondering if you had a recommendation for tag taxonomies that would best align with the return of avclass.
Here are the MISP taxonomies
The ones that look like they might fit.
ms-caro-malware-full
Malware Type and Platform classification based on Microsoft's implementation of the Computer Antivirus Research Organization (CARO) Naming Scheme and Malware Terminology. Based on https://www.microsoft.com/en-us/security/portal/mmpc/shared/malwarenaming.aspx, https://www.microsoft.com/security/portal/mmpc/shared/glossary.aspx, https://www.microsoft.com/security/portal/mmpc/shared/objectivecriteria.aspx, and http://www.caro.org/definitions/index.html. Malware families are extracted from Microsoft SIRs since 2008 based on https://www.microsoft.com/security/sir/archive/default.aspx and https://www.microsoft.com/en-us/security/portal/threat/threats.aspx. Note that SIRs do NOT include all Microsoft malware families.
mwdb
Malware Database (mwdb) Taxonomy - Tags used across the platform
malware_classification
Classification based on different categories. Based on https://www.sans.org/reading-room/whitepapers/incident/malware-101-viruses-32848