cisagov / Malcolm

Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files), Zeek logs, and Suricata alerts.
https://cisagov.github.io/Malcolm/

automated testing #486

Open mmguero opened 2 weeks ago

mmguero commented 2 weeks ago

@mmguero cloned issue idaholab/Malcolm#11 on 2020-09-09:

Original issue contents:

Currently Malcolm is tested manually by me on a per-change basis. As the project matures, I need to look into implementing some kind of test framework that can be run overnight (or on some other schedule) to ensure builds and functionality don't break without my knowing it.

@Dmanzella spent some of his internship laying the groundwork for an automated testing framework. We need to build on what he did (either directly, or using it as lessons-learned or as a blueprint) to deliver on this. To me, this is more of an "automated system testing" framework than an "automated unit testing" framework, as it's difficult to pull individual components of Malcolm out and test them atomically: it requires a whole Malcolm instance to be up and running.

Here are requirements as I see them at the moment:

  • Needs to be able to run, at a minimum, locally on a Linux system
    • support for other platforms would be nice to have (e.g., Kubernetes, Windows, or macOS)
  • For reproducibility/simplicity I think having Malcolm run in a virtual machine is the best approach, but I could be convinced otherwise
  • Using other tools like Ansible, either on top of or underneath a VM deployment tool like Vagrant, is fine with me too, if that makes it easier
  • A test should exist as a single directory (see the layout sketch after this list) containing
    • a reference to the data to populate Malcolm with for the test (e.g., a PCAP or Windows EVTX file)
      • this may or may not be the actual PCAP: maybe it's a URL, or some other external reference?
    • some sort of definition of a query to run (I think using the Malcolm API wherever possible is the way to be most consistent, vs. accessing OpenSearch directly)
      • other APIs like OpenSearch, NetBox, and the Arkime viewer API may also be used; see the links at the bottom here
    • a known-good file that is compared against the results (diff means the test failed, no diff means it passed)
  • the automated testing framework should
    • start up a Malcolm instance from scratch, using scripting only (non-interactive)
    • be able to execute 1..n of the tests as specified by the user (a single test, some subset of tests, all tests, etc.)
    • wait for Malcolm to be started and fully ready to receive PCAP (see the readiness-polling sketch after this list)
    • wait for all artifacts to be fully processed before running queries
    • once Malcolm has started, process test directories for the tests specified and "upload" all of the necessary artifacts
      • each artifact must be tagged with some unique value that is tied to the test name itself, so that when running that test's queries ONLY the data with that tag are considered for the results
    • report on the results of tests (success/failure, providing access to the diff in case of failure; see the runner sketch after this list)
    • completely destroy and remove the Malcolm instance when specified tests are complete
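
To make the per-test directory idea concrete, here is a minimal sketch of what such a layout could look like (the directory and file names, including the `tests/dns-tunnel` example, are hypothetical, not a settled convention):

```
tests/dns-tunnel/
├── data.url        # reference to the artifact(s) to ingest (a URL, or a path to a PCAP/EVTX file)
├── query.json      # the Malcolm API query to run once ingestion has settled
└── expected.json   # known-good results; any diff against these fails the test
```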
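
For the "wait until fully ready" step, a minimal Python sketch could poll an HTTP endpoint until it answers. The base URL and endpoint path below are assumptions for illustration, not Malcolm's documented API; substitute whatever readiness check the deployment actually exposes:

```python
import time

import requests

MALCOLM_URL = "https://localhost"       # assumed local Malcolm instance
READY_URL = f"{MALCOLM_URL}/mapi/ping"  # hypothetical readiness endpoint

def wait_until_ready(timeout_s: int = 1800, poll_s: int = 30) -> bool:
    """Poll until Malcolm answers, or give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # a throwaway test instance will typically have a self-signed cert
            if requests.get(READY_URL, verify=False, timeout=10).ok:
                return True
        except requests.RequestException:
            pass  # not up yet; keep polling
        time.sleep(poll_s)
    return False
```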
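
And for the upload/query/compare/report cycle, a rough sketch of the per-test loop; every helper named here (`upload_artifacts`, `wait_for_ingest_idle`, `run_malcolm_api_query`) is a hypothetical placeholder for whatever the framework ends up providing:

```python
import difflib
import json
import pathlib

def run_test(test_dir: pathlib.Path) -> str:
    """Run one test directory; return '' on success, a unified diff on failure."""
    tag = test_dir.name  # unique tag tying uploaded artifacts to this test

    # 1. upload this test's artifacts, tagged with the test name
    upload_artifacts(test_dir, tag)        # hypothetical helper

    # 2. wait until the uploaded artifacts have been fully processed
    wait_for_ingest_idle()                 # hypothetical helper

    # 3. run the test's query via the Malcolm API, scoped to this test's tag
    query = json.loads((test_dir / "query.json").read_text())
    actual = run_malcolm_api_query(query, tag=tag)  # hypothetical helper

    # 4. diff against the known-good baseline; an empty diff means the test passed
    expected = (test_dir / "expected.json").read_text().splitlines(keepends=True)
    got = json.dumps(actual, indent=2, sort_keys=True).splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(expected, got, fromfile="expected.json", tofile="actual")
    )
```

A non-empty return value would be surfaced in the test report so the failing diff is immediately available, per the reporting requirement above.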

This repository will exist as idaholab/Malcolm-Test.