Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files), Zeek logs and Suricata alerts.
Currently Malcolm is tested manually by me on a per-change basis. As the project matures, I need to look into implementing some kind of test framework that can be run overnight (or on a similar schedule) to ensure builds and functionality don't break without my knowing it.
@Dmanzella spent some of his internship laying the groundwork for an automated testing framework. We need to build on what he did (either directly, or using it as lessons-learned or as a blueprint) to deliver on this. To me, this is more of an "automated system testing" framework than an "automated unit testing" framework, as it's difficult to pull individual components of Malcolm out and test them atomically: it requires a whole Malcolm instance to be up and running.
Here are requirements as I see them at the moment:
- Needs to be able to, at a minimum, run locally on a Linux system
    - other systems would be nice to have (e.g., Kubernetes, Windows, or macOS)
- For reproducibility/simplicity I think having Malcolm run in a virtual machine is the best approach, but I could be convinced otherwise
    - Using other tools like Ansible, either alongside or underneath a VM deployment tool like Vagrant, is fine with me too, if that makes it easier
- A test should exist as a single directory containing (a rough sketch of a possible layout follows this list):
    - a reference to the data to populate Malcolm with for the test (e.g., a PCAP or Windows evtx file)
        - this may or may not be the actual PCAP: maybe it's a URL, or some other external reference?
    - some sort of definition of a query to run (I think using the Malcolm API wherever possible is the way to be most consistent, vs. accessing OpenSearch directly)
        - other APIs (OpenSearch, NetBox, and the Arkime viewer API) may also be used; see the links at the bottom here
    - a known-good file that is compared against the results (a diff means the test failed, no diff means the test passed)
- the automated testing framework should (a rough sketch of this flow also follows this list):
    - start up a Malcolm instance from scratch, using scripting only (non-interactive)
    - once Malcolm has started, process the directories for the specified tests and "upload" all of the necessary artifacts
        - each artifact must be tagged with some unique value tied to the test name itself, so that when running that test's queries ONLY the data with that tag are considered for the results
    - report on the results of the tests (success/failure, with access to the diff in case of failure)
    - completely destroy and remove the Malcolm instance when the specified tests are complete
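To make the test-directory idea concrete, here's a rough sketch of what discovering and loading a test might look like. The layout, file names (`test.yml`, `query.json`, `expected.json`), and fields are placeholders I'm making up for illustration, not an existing convention:

```python
# Minimal sketch of loading one test directory, assuming a hypothetical layout:
#
#   tests/dns-tunnel/
#     test.yml        -> artifact reference (local file path or URL) and metadata
#     query.json      -> the query to run once the data has been indexed
#     expected.json   -> known-good result to diff the actual results against
import json
from dataclasses import dataclass
from pathlib import Path

import yaml  # PyYAML


@dataclass
class TestCase:
    name: str       # the directory name, also used as the artifact's unique tag
    artifact: str   # path or URL of the PCAP/evtx to populate Malcolm with
    query: dict     # definition of the query to run (ideally via the Malcolm API)
    expected: dict  # known-good result; any diff means the test failed


def load_test(test_dir: Path) -> TestCase:
    meta = yaml.safe_load((test_dir / "test.yml").read_text())
    return TestCase(
        name=test_dir.name,
        artifact=meta["artifact"],
        query=json.loads((test_dir / "query.json").read_text()),
        expected=json.loads((test_dir / "expected.json").read_text()),
    )


def discover_tests(tests_root: Path) -> list[TestCase]:
    # every immediate subdirectory of the tests root is treated as one test
    return [load_test(d) for d in sorted(tests_root.iterdir()) if d.is_dir()]
```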
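And here's a rough sketch of what the per-test flow could look like end to end. This is a sketch under assumptions, not a working implementation: the control script paths (`scripts/start`, `scripts/wipe`), the `/mapi/` aggregation endpoint and its `filter` parameter, and the `upload_artifact()`/`wait_for_indexing()` helpers are all placeholders or my reading of the docs and would need to be verified against Malcolm's actual scripts and API documentation:

```python
# Minimal sketch of running a single TestCase end to end.
import difflib
import json
import subprocess
from pathlib import Path

import requests

MALCOLM_DIR = Path("/opt/malcolm")  # assumption: local Malcolm checkout
MALCOLM_URL = "https://localhost"   # assumption: default local instance
RESULTS_DIR = Path("results")


def upload_artifact(path_or_url: str, tag: str) -> None:
    # hypothetical helper: push the PCAP/evtx into Malcolm (web upload, shared
    # volume, etc.) with `tag` applied so queries can be scoped to this test
    raise NotImplementedError


def wait_for_indexing(tag: str, timeout_seconds: int = 1800) -> None:
    # hypothetical helper: poll until documents carrying `tag` are searchable
    raise NotImplementedError


def run_test(test) -> bool:
    """Run one TestCase end to end; return True if the diff is empty."""
    # 1. start a Malcolm instance from scratch, scripting only (non-interactive)
    subprocess.run([str(MALCOLM_DIR / "scripts" / "start")], check=True)
    try:
        # 2. upload the artifact, tagged with the test name
        upload_artifact(test.artifact, tag=test.name)
        wait_for_indexing(tag=test.name)

        # 3. run the query through the Malcolm API, restricted to this test's tag
        resp = requests.get(
            f"{MALCOLM_URL}/mapi/agg/{test.query['field']}",     # assumed endpoint
            params={"filter": json.dumps({"tags": test.name})},  # assumed filter syntax
            verify=False,
        )
        resp.raise_for_status()
        actual = json.dumps(resp.json(), indent=2, sort_keys=True).splitlines()
        expected = json.dumps(test.expected, indent=2, sort_keys=True).splitlines()

        # 4. diff against the known-good file; save the diff for the report
        diff = list(difflib.unified_diff(expected, actual, "expected", "actual", lineterm=""))
        if diff:
            RESULTS_DIR.mkdir(exist_ok=True)
            (RESULTS_DIR / f"{test.name}.diff").write_text("\n".join(diff))
        return not diff
    finally:
        # 5. completely destroy and remove the Malcolm instance
        subprocess.run([str(MALCOLM_DIR / "scripts" / "wipe")], check=True)
```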