Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files), Zeek logs and Suricata alerts.
It's possible that in a very high-volume traffic environment, the number of files Zeek extracts could outpace the system's ability to scan them. This could cause the storage holding the extracted files to fill to 100%, which could cause all sorts of system problems. In short, filling the disk this way is technically an avenue to DoS the system.
There are a few issues we need to figure out:
Allow setting some kind of high-water mark for extracted files (either a percentage of disk usage or a minimum amount of free space remaining)
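To make the idea concrete, here's a minimal Python sketch of such a check, assuming a hypothetical extracted-files path and example thresholds (none of these names or values are existing Malcolm settings):

```python
# Minimal sketch: trip a high-water mark when either the disk usage percentage
# or the remaining free space crosses a configured limit. The path and limits
# below are illustrative assumptions, not actual Malcolm configuration.
import shutil

EXTRACTED_FILES_PATH = "/zeek/extract_files"  # hypothetical location
MAX_PERCENT_USED = 90.0                       # trip at 90% of the disk used...
MIN_FREE_BYTES = 10 * 1024**3                 # ...or when < 10 GiB remains free

def over_high_water_mark(path: str = EXTRACTED_FILES_PATH) -> bool:
    """Return True if the filesystem holding extracted files is past either limit."""
    usage = shutil.disk_usage(path)
    percent_used = (usage.used / usage.total) * 100.0
    return (percent_used >= MAX_PERCENT_USED) or (usage.free <= MIN_FREE_BYTES)
```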
If this limit is hit, then what do we do? If we disable file extraction, malicious files could creep in undetected from that point on; if we don't disable it, the disk could fill up. Perhaps we could switch to a "not quite emergency but almost" mode where we still scan files and log the results but don't save the files themselves, which is better than nothing but only kicks the can down the road. At the end of the day I think we'd have to disable file scanning while we're above this threshold, then log periodic notices about it until it gets back under control.
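As a rough sketch of that tiered behavior (the mode names, thresholds, and check interval here are all hypothetical, not existing Malcolm behavior):

```python
# Hypothetical sketch of the tiered response described above: normal operation,
# a "scan but don't preserve" middle ground, and fully disabling file scanning,
# with a periodic notice logged while we're over the threshold.
import logging
import shutil
import time
from enum import Enum

logger = logging.getLogger("extracted-file-watcher")

class ExtractionMode(Enum):
    NORMAL = "scan and preserve extracted files"
    SCAN_ONLY = "scan and log results, but don't save the files themselves"
    DISABLED = "file scanning disabled until usage drops below the threshold"

def select_mode(path: str, warn_percent: float = 85.0, limit_percent: float = 95.0) -> ExtractionMode:
    usage = shutil.disk_usage(path)
    percent_used = (usage.used / usage.total) * 100.0
    if percent_used >= limit_percent:
        return ExtractionMode.DISABLED
    if percent_used >= warn_percent:
        return ExtractionMode.SCAN_ONLY
    return ExtractionMode.NORMAL

def watch(path: str, interval_seconds: int = 300) -> None:
    """Re-evaluate the mode periodically and emit a notice whenever it isn't NORMAL."""
    while True:
        mode = select_mode(path)
        if mode is not ExtractionMode.NORMAL:
            logger.warning("extracted files storage over threshold: %s", mode.value)
        time.sleep(interval_seconds)
```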
Currently, Malcolm doesn't do anything to prune extracted files. We do have this sort of feature for pruning old PCAP files (handled by Arkime) and for OpenSearch indices (handled by `opensearch_index_size_prune.py`). We need to provide the ability to configure a limit and then delete the oldest extracted files once it's exceeded, I suppose starting with files in the `preserved` state and then those in the `quarantined` state afterwards. The `shared/bin/prune_files.sh` script used by Hedgehog could probably be reused here.
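For illustration, here's a rough Python sketch of oldest-first pruning in that order (the base path, directory names, and target percentage are assumptions made for the example, not Malcolm's actual layout):

```python
# Sketch: delete the oldest extracted files, preserved before quarantined,
# until disk usage falls back below a target. Paths and the target value are
# illustrative assumptions only.
import shutil
from pathlib import Path

EXTRACTED_FILES_PATH = Path("/zeek/extract_files")  # hypothetical location
PRUNE_ORDER = ["preserved", "quarantine"]           # prune preserved files first
TARGET_PERCENT_USED = 80.0                          # stop once usage drops below this

def prune_oldest(base: Path = EXTRACTED_FILES_PATH, target: float = TARGET_PERCENT_USED) -> None:
    for subdir in PRUNE_ORDER:
        # gather this state's files, oldest first by modification time
        candidates = sorted(
            (p for p in (base / subdir).rglob("*") if p.is_file()),
            key=lambda p: p.stat().st_mtime,
        )
        for victim in candidates:
            usage = shutil.disk_usage(base)
            if (usage.used / usage.total) * 100.0 < target:
                return
            victim.unlink(missing_ok=True)
```

This is the sort of oldest-first deletion that `shared/bin/prune_files.sh` could presumably be adapted to handle instead.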
Whatever we do here needs to be well documented, and we may also need to hook back into the Malcolm alerting API (which may work as-is, or may require some modifications itself) to create logs notifying users when this is happening.