falkamelung opened 1 month ago
I am thinking about how to do this. I think you mentioned that this functionality should go in the ingest scripts. How many of the old tables would we remove?
I think in the same way as we have `json_mbtiles2insarmaps.py --remove`, we could have `json_mbtiles2insarmaps.py --keep-space 10`.

Does the script know on which volume the `postgres_dir` and `mbtiles_dir` are located? If so, it could run a `df` command to find out the total and available size of the volume. If less than 10% of the space is available, it should list the datasets, sort them by age, and then remove the oldest datasets until 10% free space is reached again. Maybe it just returns a list of datasets which then get piped into `json_mbtiles2insarmaps.py --remove`?
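A minimal sketch of what such a `--keep-space` pass could look like, assuming datasets sit as subdirectories under a single data root (e.g. `/data`, or an `INSARMAPS_DATA` variable as mentioned below), that directory modification time is a usable proxy for ingestion age, and that `json_mbtiles2insarmaps.py --remove` accepts a dataset name; none of these assumptions are confirmed by the current scripts:

```python
#!/usr/bin/env python3
"""Sketch: delete the oldest datasets until enough of the data volume is free."""
import os
import shutil
import subprocess

# Hypothetical data root; the real layout may differ.
DATA_ROOT = os.environ.get("INSARMAPS_DATA", "/data")
MIN_FREE_FRACTION = 0.10  # keep at least 10% of the volume free


def free_fraction(path):
    """Fraction of the volume containing `path` that is free (like `df`)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def datasets_oldest_first(root):
    """Dataset directories under `root`, sorted by modification time, oldest first."""
    entries = [e for e in os.scandir(root) if e.is_dir()]
    return sorted(entries, key=lambda e: e.stat().st_mtime)


def enforce_keep_space(root=DATA_ROOT, min_free=MIN_FREE_FRACTION):
    for entry in datasets_oldest_first(root):
        if free_fraction(root) >= min_free:
            break
        # Hypothetical invocation; the actual --remove syntax may differ.
        subprocess.run(
            ["json_mbtiles2insarmaps.py", "--remove", entry.name],
            check=True,
        )


if __name__ == "__main__":
    enforce_keep_space()
```

In the workflow described below, `enforce_keep_space()` would simply be called (or the script invoked) right after each ingestion run.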
In our workflow we would always run this after the `json_mbtiles2insarmaps.py` ingestion.
Once you get to this I can create an instance with a small disk for testing and prepare a few datasets on jetstream for ingestion.
Hi @stackTom, and when you get a chance, please think about the `old-data-removal-when-reaching-90%-disk-space` issue.
I would just check the size of the volume that contains /data/postgres_dir and then remove the oldest files with a loop over them executing `json_mbtiles*py --remove`. We may want to implement an INSARMAPS_DATA environment variable in case there is no /data, and run `run_dock.sh` using
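A rough sketch of how the proposed option and environment variable could be exposed, assuming an argparse-based CLI in `json_mbtiles2insarmaps.py`; the option names and the existing parser structure are assumptions, not the current code:

```python
# Hypothetical additions to json_mbtiles2insarmaps.py's argument parsing.
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(description="Ingest/remove insarmaps datasets")
    parser.add_argument("--remove", metavar="DATASET",
                        help="remove the given dataset")
    parser.add_argument("--keep-space", type=float, metavar="PERCENT",
                        help="after ingestion, delete the oldest datasets until "
                             "this percentage of the data volume is free")
    return parser


# Data root: use INSARMAPS_DATA if set, otherwise fall back to /data.
DATA_ROOT = os.environ.get("INSARMAPS_DATA", "/data")
POSTGRES_DIR = os.path.join(DATA_ROOT, "postgres_dir")
```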