geodesymiami / insarmaps


auto-removal when disk reaches capacity #94

Open falkamelung opened 1 month ago

falkamelung commented 1 month ago

Hi @stackTom, when you get a chance, please think about the `old-data-removal-when-reaching-90%-disk-space` issue.

I would just check the size of the volume that contains /data/postgres_dir and then remove the oldest files, looping over them and executing json_mbtiles*py --remove. We may want to implement an INSARMAPS_DATA environment variable for the case where there is no /data, and run run_docker.sh using

run_docker.sh  $INSARMAPS_DATA/postgres_dir $INSARMAPS_DATA/mbtiles_dir  149.165.168.186
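
For concreteness, a minimal sketch of that volume check in Python, assuming the directory layout from the command above; shutil.disk_usage reports the same totals as df for the volume containing the path, and the 10% threshold is just the number from this thread:

```python
import os
import shutil

# Root of the insarmaps data volume; fall back to /data when the proposed
# INSARMAPS_DATA environment variable is not set.
data_root = os.environ.get("INSARMAPS_DATA", "/data")
postgres_dir = os.path.join(data_root, "postgres_dir")

# disk_usage gives total/used/free bytes for the volume holding postgres_dir,
# i.e. the same numbers a `df` call on that mount would print.
usage = shutil.disk_usage(postgres_dir)
free_fraction = usage.free / usage.total

if free_fraction < 0.10:
    print(f"only {free_fraction:.0%} free on the volume holding {postgres_dir}")
```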
stackTom commented 1 month ago

I am thinking about how to do this. I think you mentioned that this functionality should go in the ingest scripts. How many of the old tables would we remove?

falkamelung commented 1 month ago

In the same way that we have json_mbtiles2insarmaps.py --remove, we could have json_mbtiles2insarmaps.py --keep-space 10.
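
The flag could be wired into the script's argument parser along these lines; this is only a sketch, and the shape of the existing --remove option is an assumption:

```python
import argparse

parser = argparse.ArgumentParser(description="Ingest json/mbtiles into insarmaps")
# Existing behavior referenced in this thread (exact signature assumed).
parser.add_argument("--remove", metavar="DATASET",
                    help="remove the given dataset from the database")
# Proposed option: keep at least this percentage of the data volume free,
# deleting the oldest datasets when the threshold is crossed.
parser.add_argument("--keep-space", type=float, default=None, metavar="PERCENT",
                    help="ensure at least PERCENT%% of the data volume stays free")
args = parser.parse_args()
```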

Does the script know on which volume the postgres_dir and mbtiles_dir are located? If so, it could run a df command to find out the total and available size of the volume. If less than 10% of the space is available, it should list the datasets, sort them by age, and then remove the oldest datasets until 10% free space is reached again.
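
Roughly like this, as a sketch of the remove-oldest loop; it assumes each dataset is an entry under mbtiles_dir whose modification time reflects its ingestion age, and that --remove accepts a dataset name, none of which is confirmed here:

```python
import shutil
import subprocess
from pathlib import Path

def free_fraction(path: str) -> float:
    """Fraction of the containing volume that is free (what df reports)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

def prune_oldest(mbtiles_dir: str, keep_free: float = 0.10) -> None:
    """Remove the oldest datasets until keep_free of the volume is free again."""
    # Sort datasets oldest-first by modification time (age proxy, assumed).
    datasets = sorted(Path(mbtiles_dir).iterdir(), key=lambda p: p.stat().st_mtime)
    for dataset in datasets:
        if free_fraction(mbtiles_dir) >= keep_free:
            break
        # Delegate the actual removal to the existing ingest script.
        subprocess.run(
            ["json_mbtiles2insarmaps.py", "--remove", dataset.name],
            check=True,
        )
```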

Maybe it just returns a list of datasets and then pipes them into json_mbtiles2insarmaps.py --remove?
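
That two-step wiring could look like this; the --list-removable flag is invented purely for illustration and does not exist in the script:

```python
import subprocess

# Hypothetical helper call that prints one removable dataset per line.
listing = subprocess.run(
    ["json_mbtiles2insarmaps.py", "--list-removable"],
    capture_output=True, text=True, check=True,
)
# Reuse the existing --remove path for each stale dataset.
for dataset in listing.stdout.splitlines():
    subprocess.run(["json_mbtiles2insarmaps.py", "--remove", dataset], check=True)
```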

In our workflow, we would always run this after the json_mbtiles2insarmaps.py ingestion.

Once you get to this, I can create an instance with a small disk for testing and prepare a few datasets on Jetstream for ingestion.