Describe the issue
MoH KE provisioned 1TB of storage for the postgres database, in a postgres instance managed by kubernetes.
The persistent volume for this cannot be easily resized.
1TB was not enough to store the entire postgres database, and the initial dbt run to populate the database was failing.
To get the dashboards working as quickly as possible, all documents with type 'task' were deleted.
Since tasks accounted for 2/3 of the storage usage (before materialized views), this freed up enough space to allow the initial dbt run to complete.
However, tasks are still useful for analytics and debugging, and deleting them is not a permanent solution.
Describe the improvement you'd like
1. When all the materialized views are finished, calculate the exact storage requirement by taking the final storage usage and adding a 50% buffer for temp tables and later additions.
2. Create a new postgres instance with at least that much storage; it could either be a new node in the cluster or, if the external postgres has enough free storage, we could switch to using that.
3. Create a new cht-sync instance using that postgres. It will take some time (about one week) to do the initial run, so leave the old one running.
4. When it is finished, switch the dashboards to the new postgres, then stop and delete the old cht-sync.
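As a rough sketch of the sizing arithmetic in step 1 (the usage figure below is a made-up example, not a measured value; in practice the current usage would come from `SELECT pg_database_size(current_database());` once the materialized views are built):

```shell
# Hypothetical sizing calculation: final usage plus a 50% buffer.
# 'used_bytes' is an assumed example figure, not the real deployment's usage.
used_bytes=800000000000
needed_bytes=$(( used_bytes + used_bytes / 2 ))
echo "provision at least ${needed_bytes} bytes"
```

For the example figure above this prints `provision at least 1200000000000 bytes`, i.e. 1.2TB for 800GB of final usage.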
Describe alternatives you've considered
There are other ways to get the old tasks back, but I think this will be the simplest overall.
We could try to save time with a pg_dump/restore of the current database, but it then gets complicated to merge the old tasks back in:
1. To save time on the initial dbt run, pg_dump and restore the current database to the new postgres, excluding the source table. This may take a long time, but cht-sync can keep updating while it is running.
2. In the new database, save a copy of couch2pg_progress, then truncate it.
3. Run a new instance of cht-sync, without dbt, pointing to the new database. This will sync everything from the beginning, including the deleted tasks.
4. When it is finished (1-3 days), stop it and restore couch2pg_progress from the saved table.
5. Switch the normal cht-sync to the new postgres. This will cause the dashboards to be out of date again; it is starting over from where it was at step 4. That should be only a few days (at most a week) of data, so it should catch up quickly. couch2pg will try to sync some tasks that are already in the db; it will update saved_timestamp and treat them like updates, but it will not create duplicates.
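The pg_dump steps above might look roughly like this. This is a sketch, not a tested runbook: the connection URLs, the `v1.medic` source-table name, and the `couch2pg_progress_saved` table name are placeholders I've assumed, not the actual deployment values.

```shell
# Copy everything except the raw source table to the new instance.
# 'v1.medic' is a placeholder for the actual cht-sync source table.
pg_dump "$OLD_DB_URL" --exclude-table-data='v1.medic' | psql "$NEW_DB_URL"

# In the new database, keep a copy of the sync progress, then reset it
# so the dbt-less cht-sync instance starts syncing from the beginning.
psql "$NEW_DB_URL" -c "CREATE TABLE couch2pg_progress_saved AS TABLE couch2pg_progress;"
psql "$NEW_DB_URL" -c "TRUNCATE couch2pg_progress;"

# ... run the dbt-less cht-sync instance until it finishes (1-3 days) ...

# Restore the saved progress so the normal cht-sync resumes from where
# the copy was taken rather than re-syncing everything.
psql "$NEW_DB_URL" -c "TRUNCATE couch2pg_progress;"
psql "$NEW_DB_URL" -c "INSERT INTO couch2pg_progress SELECT * FROM couch2pg_progress_saved;"
```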
I think going with the simple solution is the better way: it involves less human intervention, fewer steps, and fewer ways for this to go wrong.
Thanks for writing this up