AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Cycling credentials causes deployment issues #3458

Open davidsmejia opened 7 months ago

davidsmejia commented 7 months ago

Context

We recently did a production deploy that caused an issue with our pgbouncer instance. It seems that pgbouncer was unable to connect to the database because it was still trying to connect with the previous release's credentials. This can be resolved by manually forcing user-data to run on restart.

Problem or idea

While we could force the user-data script to re-run on every restart, we probably want to resolve this in a way that doesn't require that change.

My current thinking is that we want to explicitly handle init and restart independently:

Additional thoughts: I think we should start with pgbouncer, but really all of our persistent servers should be able to recover from a restart correctly.

Solution or next step

Short term solution:

Long term solution: I'd like to have a synchronous meeting about how exactly we want to manage this so that it works for both deploys and manual restarts. Presumably, during or after terraform, the restart script would have the latest credentials, and we could then just restart the service (if a re-init didn't occur) so we know it is running with the latest credentials.
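
To make the init vs. restart idea a bit more concrete ahead of that meeting, here is a rough sketch of how a boot-time script might decide what to do. Nothing here exists in the repo today: the state file, the `decide_action` and `record_applied` helpers, and the shape of the credentials dict are all hypothetical, and the "noop" case is an extra assumption on my part (the simpler version would always restart unless a re-init happened).

```python
import hashlib
import json
from pathlib import Path


def fingerprint(credentials: dict) -> str:
    """Hash the credentials so the state file never holds them in plain text."""
    payload = json.dumps(credentials, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def decide_action(state_file: Path, credentials: dict) -> str:
    """Return "init", "restart", or "noop" for this boot.

    - "init":    no state file yet, so the server was never set up.
    - "restart": the service was last configured with different credentials.
    - "noop":    the service is already running with these credentials.
    """
    if not state_file.exists():
        return "init"
    applied = state_file.read_text().strip()
    return "noop" if applied == fingerprint(credentials) else "restart"


def record_applied(state_file: Path, credentials: dict) -> None:
    """Remember which credentials the service was last configured with."""
    state_file.write_text(fingerprint(credentials))
```

The idea would be to call `decide_action` on every boot with whatever credentials the latest deploy produced, rewrite the service config and restart it when the answer is "restart", and call `record_applied` once the service is back up. How the credentials actually reach the instance is the part that still needs discussion.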