genomehubs / goat-data

Add option to delete the old indexing data from the prod ES server #88

Closed: gq1 closed this issue 1 month ago

gq1 commented 1 month ago

I have updated the script on the VM so that it also deletes the old index data on the ES server.

The only problem is that the list of index data to delete is built from the data directories on disk, not from the indices on the ES server.

If you manually delete a data directory on disk, you also need to delete the matching data on the ES server yourself.

I could add another query, curl -s 'es1:9200/_cat/indices?v', to read the index list from the server itself, but for now it should be fine.
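A minimal sketch of what that extra check might look like, in case it is useful later. It is not the script on the VM: the function name stale_indices is hypothetical, and it assumes that each index name ends in the same YYYY.MM.DD stamp as its production-<date> directory, which is only inferred from the listings in this issue. It reads index names on stdin, takes the directories to keep as arguments, and prints the indices that have no directory on disk.

```shell
#!/bin/sh
# Hypothetical helper, not part of the clean_data script.
# stdin: one index name per line, e.g. taxon--ncbi--goat--2024.09.11
# args:  data directories still on disk, e.g. production-2024.09.11
# Assumption: the text after the last "--" of an index name is the same
# date stamp that follows "production-" in the directory name.
stale_indices() {
    keep=" "
    for dir in "$@"; do
        keep="$keep${dir#production-} "   # collect the date stamps to keep
    done
    while IFS= read -r index; do
        [ -n "$index" ] || continue       # skip blank lines
        stamp=${index##*--}               # date stamp of this index
        case "$keep" in
            *" $stamp "*) ;;              # directory exists on disk: keep
            *) printf '%s\n' "$index" ;;  # no directory on disk: stale
        esac
    done
}

# Possible usage, run from the directory that holds the production-* dirs:
#   curl -s 'es1:9200/_cat/indices/taxon--ncbi--goat--*?h=index' \
#       | stale_indices production-*
```

The h=index parameter asks the cat API for the index name column only, so the output can be read line by line without parsing the table.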

I have manually cleaned up the old data on the ES server, and currently only 8 days of data are left there:

curl -s  'es1:9200/_cat/indices?v' | grep taxon--ncbi--goat--
yellow open   taxon--ncbi--goat--2024.09.09       eIxzlkQnTd-XsPoz3pFgLA   1   1   73448790     42463389      8.8gb          8.8gb
yellow open   taxon--ncbi--goat--2024.09.11       hlc1f5wdTfCTIfvoMfifcw   1   1   73451670     18646576      6.9gb          6.9gb
yellow open   taxon--ncbi--goat--2024.09.10       1xCfpTmmRbe5kTn7gKvMQw   1   1   73451448     16923937      6.8gb          6.8gb
yellow open   taxon--ncbi--goat--2022.11.16       Q_t44Eh_SlOe0hw9SsgQTg   1   1   69752556     18051898      6.2gb          6.2gb
yellow open   taxon--ncbi--goat--2023.02.20       vZgaVtN0TVuu1BZHxxYEmw   1   1   70610198     19286743      6.4gb          6.4gb
yellow open   taxon--ncbi--goat--2023.05.18       dKraRHIfTCWw6z-7MiCT_Q   1   1   71489417     16561045      6.3gb          6.3gb
yellow open   taxon--ncbi--goat--2023.10.16       Nir8Mv_BS8iTAu0chfqaqw   1   1   69163659     42111975      8.5gb          8.5gb
yellow open   taxon--ncbi--goat--2024.03.01       f3-6eeEHStO7Kn8PmAXQtg   1   1   68210681     24805816        7gb            7gb

This is the output from the clean_data script:

================
Checking done, in total 9!
The number of data dir to keep: 8
production-2024.09.11
production-2024.09.10
production-2024.09.09
production-2024.03.01
production-2023.10.16
production-2023.05.18
production-2023.02.20
production-2022.11.16
================
================
Delete data directory production-2024.01.20
Delete ES index for production-2024.01.20
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    21  100    21   {"acknowledged":true} 0     0  15086      0 --:--:-- --:--:-- --:--:-- 21000
================
In total 1 data directories are deleted!
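In the log above, the {"acknowledged":true} response appears in the middle of curl's progress meter, which suggests the delete call runs without -s and that both output streams end up in the same log. A minimal sketch of a quieter delete step follows. The names delete_index, ES_HOST, and DRY_RUN are hypothetical, and the index name in the usage line is an assumption based on the naming seen in this issue, not taken from the script.

```shell
#!/bin/sh
# Hypothetical settings, not taken from the clean_data script.
ES_HOST=${ES_HOST:-es1:9200}

# Delete one index by name. With DRY_RUN=1 (the default here) the command
# is only printed, so the list can be reviewed before anything is removed.
delete_index() {
    index=$1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'curl -s -X DELETE %s/%s\n' "$ES_HOST" "$index"
    else
        # -s hides the progress meter, leaving only the JSON response
        curl -s -X DELETE "$ES_HOST/$index"
        echo    # the response has no trailing newline
    fi
}

# Possible usage (index name assumed):
#   DRY_RUN=0; delete_index taxon--ncbi--goat--2024.01.20
```

Defaulting to a dry run is a deliberate choice for a step that removes production data; set DRY_RUN=0 only once the printed commands look right.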