Open Innixma opened 10 months ago
The backup should be made in amlb/results.py#L112, called from the Benchmark, if I am not mistaken. Having an option to call save with append=True should be all it takes.
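To illustrate the append idea, here is a minimal sketch using only the Python standard library. It is not the amlb implementation: `append_results`, `count_lines`, and the column names are hypothetical and only show how appending new rows avoids rewriting the whole file on every save.

```python
import csv
import os
import tempfile


def append_results(path, rows, fieldnames):
    """Append rows to a CSV file (hypothetical helper, not part of amlb).

    The header is written only when the file does not exist yet or is empty,
    so repeated calls grow the file by the new rows only.
    """
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)


def count_lines(path):
    """Count the lines in a file, header included."""
    with open(path, newline="") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    # Illustrative column names only; the real results file has its own schema.
    fields = ["task", "score"]
    path = os.path.join(tempfile.mkdtemp(), "results.csv")
    append_results(path, [{"task": "a", "score": 0.9}], fields)
    append_results(path, [{"task": "b", "score": 0.8}], fields)
    print(count_lines(path))  # one header line plus two result rows
```

With this approach each save costs space proportional to the new rows, instead of producing another full copy of everything saved so far.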
In the meantime, you could disable results.global_save. Then no results/results.csv will be written at all, which should also mean no backup is made.
I am running large-scale benchmarks in AWS mode and finding that files are being saved in results/backup/ that take up significant space (leading to >1 TB of files, which causes the host machine to run out of disk during the benchmark run). Where in the code are these files being specified, and how can I disable them? Are they necessary for anything? I would assume not.
The problem is that each file in backup concatenates all the results of the benchmark so far into a single CSV file, so total disk usage grows as N^2, where N is the number of instances being spun up (in my case, N > 20,000). As an example:
There are around 10 of these files being written per minute, each one larger than the last (currently 108 MB per file), meaning roughly 1 GB of disk space is being taken up per minute.
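The quadratic growth can be checked with a quick back-of-the-envelope calculation. The sketch below assumes, for simplicity, that one backup is written per finished instance and that the k-th backup holds k result rows. Both function names are hypothetical and the model ignores header lines and varying row sizes.

```python
def rows_with_full_copies(n):
    """Total rows on disk if the k-th backup stores all k results so far.

    This is the sum 1 + 2 + ... + n = n * (n + 1) / 2, which grows as n^2.
    """
    return n * (n + 1) // 2


def rows_with_append(n):
    """Total rows on disk if each result is stored exactly once."""
    return n


if __name__ == "__main__":
    n = 20_000  # lower bound on the number of instances mentioned above
    print(rows_with_full_copies(n))  # 200010000
    print(rows_with_append(n))       # 20000
```

Under these assumptions, 20,000 instances produce about 200 million stored rows with full copies, versus 20,000 rows if each result is written once.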