Closed mmguero closed 5 months ago
tangentially related to #441
I think this is complete. Here's what I've done (you can ignore the one commit in the middle where I bumped beats/logstash):
- modified prune_files.sh to allow specifying a gigabytes threshold in addition to a max fill line percentage
- added prune_files.sh to file-monitor.Dockerfile and an entry for it to file-monitor's supervisord conf, and added default values for these environment variables in zeek.env:
  - EXTRACTED_FILE_PRUNE_THRESHOLD_MAX_SIZE - the maximum size, specified either in gigabytes or as a human-readable data size (e.g., 250G), that the ./zeek-logs/extract_files/ directory is allowed to contain before the prune condition triggers
  - EXTRACTED_FILE_PRUNE_THRESHOLD_TOTAL_DISK_USAGE_PERCENT - a maximum fill percentage for the file system containing the ./zeek-logs/extract_files/ directory; in other words, if the disk is more than this percentage utilized, the prune condition triggers
  - EXTRACTED_FILE_PRUNE_INTERVAL_SECONDS - the interval between checks of the prune conditions, in seconds (default 300)
- modified install.py to prompt for the new environment variable thresholds specified above

And I have tested file-monitor's new behavior:
$ dc exec -u $(id -u) file-monitor bash
monitor@file-monitor:/zeek/extract_files$ env|grep PRUNE
EXTRACTED_FILE_PRUNE_INTERVAL_SECONDS=60
EXTRACTED_FILE_PRUNE_THRESHOLD_MAX_SIZE=250G
EXTRACTED_FILE_PRUNE_THRESHOLD_TOTAL_DISK_USAGE_PERCENT=0
$ for SEQ in $(seq 1 300); do fallocate -l 1000000000 $SEQ.file; done
$ du -sh
280G .
file-monitor-1 | Pruned 50 files (47GiB) in "/zeek/extract_files"
$ du -sh
233G .
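As a sanity check on the numbers in the session above, the following sketch redoes the arithmetic. It assumes du -h reports sizes in units of 1024^3 bytes, rounded up (GNU du's behavior); bytes_to_gib_ceil is a hypothetical helper written for this check, not part of prune_files.sh.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a byte count to GiB (1024^3 bytes), rounding up.
bytes_to_gib_ceil() {
  local bytes=$1
  local gib=$((1024 * 1024 * 1024))
  echo $(( (bytes + gib - 1) / gib ))
}

FILE_SIZE=1000000000   # bytes per test file, from the fallocate call above

bytes_to_gib_ceil $((300 * FILE_SIZE))   # all 300 files, before pruning
bytes_to_gib_ceil $((250 * FILE_SIZE))   # the 250 files left after pruning
bytes_to_gib_ceil $((50 * FILE_SIZE))    # the 50 pruned files
```

This prints 280, 233, and 47, matching the two du -sh readings and the size in the prune log line. The 250 remaining files total 250 × 10^9 bytes, which would be consistent with the 250G threshold being read in decimal gigabytes, though that is an inference from these numbers and not something stated here.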
The files that are preserved from file carving (whether EXTRACTED_FILE_PRESERVATION is all or quarantined) are never deleted, at least not on Malcolm (on hedgehog they are using this script). Eventually this will cause the disk to fill without external intervention.
We should provide a way to specify whether or not to prune these files, and to set that limit. The limit could be either "don't let the extracted_files directory grow beyond this size" (prune_files.sh doesn't support that right now, but it would be a good addition as an option for that script) or "start pruning when the utilized disk space hits some high-water mark".
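The two kinds of limit described above could be checked along these lines. This is a minimal sketch, not the actual prune_files.sh: the function names are hypothetical, it assumes GNU du and df, and it assumes a threshold of 0 means that check is disabled.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the two prune conditions; not the real prune_files.sh.

# Total apparent size, in bytes, of everything under a directory (GNU du).
dir_size_bytes() {
  du -sb "$1" | cut -f1
}

# Fill percentage (digits only) of the file system containing a path (GNU df).
fs_used_percent() {
  df --output=pcent "$1" | tail -n 1 | tr -dc '0-9'
}

# Succeeds (returns 0) when either limit is exceeded.
# Arguments: directory, size limit in bytes, disk-usage limit in percent.
# Assumption: a limit of 0 turns that check off.
should_prune() {
  local dir=$1 max_bytes=$2 max_percent=$3
  if (( max_bytes > 0 )) && (( $(dir_size_bytes "$dir") > max_bytes )); then
    return 0   # directory has grown beyond the size limit
  fi
  if (( max_percent > 0 )) && (( $(fs_used_percent "$dir") > max_percent )); then
    return 0   # file system is above the high-water mark
  fi
  return 1
}
```

A caller would run should_prune periodically and delete files from the directory until it returns non-zero. Which files to delete first is a separate policy decision that this sketch does not cover.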