isomerpages / isomercms-backend

A static website builder and host for the Singapore Government
5 stars 1 forks source link

chore: add new env vars for db #712

Closed alexanderleegs closed 1 year ago

alexanderleegs commented 1 year ago

Problem

This PR adds new env vars which allow us to toggle our DB settings more easily, in response to the frequent SequelizeConnectionAcquireTimeoutError we're been encountering on prod.

New env vars added:

alexanderleegs commented 1 year ago

no issues with PR itself but more broadly, i do have a few comments.

  1. this issue stems from resource starvation - we have a limited number of connections and the time taken to do work w/ 1 of those connections is q long, which leads to time outs. this is a temporary fix so we should ideally file an issue on jira for longer term tracking

Agreed, filed an issue for it!

  1. given that now we pre-emptively terminate connections, will there be any errors thrown/returned back to the client? because this is a db related issue, the errors are unlikely to be useful/helpful and we should take care to transform them into something human readable for our end user as they are going to be the ones affected

I'm not sure if any of our existing connections get pre-emptively terminated - this kicks out idle sessions within open connections after 30 seconds, so I think this should only affect anything if we're not handling our transactions properly?

  1. (not directly PR related) Given the growing size of our env-vars, how can we better manage them? this is an issue when we want to set up/migrate environments (eg: VAPT) and a growing env var size means that it's harder to track

Agreed, happy to hear any suggestions on this!

harishv7 commented 1 year ago

no issues with PR itself but more broadly, i do have a few comments.

  1. this issue stems from resource starvation - we have a limited number of connections and the time taken to do work w/ 1 of those connections is q long, which leads to time outs. this is a temporary fix so we should ideally file an issue on jira for longer term tracking
  2. given that now we pre-emptively terminate connections, will there be any errors thrown/returned back to the client? because this is a db related issue, the errors are unlikely to be useful/helpful and we should take care to transform them into something human readable for our end user as they are going to be the ones affected
  3. (not directly PR related) Given the growing size of our env-vars, how can we better manage them? this is an issue when we want to set up/migrate environments (eg: VAPT) and a growing env var size means that it's harder to track

For points:

  1. Agree, it requires some deeper investigation on the root cause

  2. I feel convict has helped us manage the growing env vars much better already - we have moved towards a "configuration" management file more than environment vars file now. We have them manageable in a JSON structure, and we have grouped them in a logical hierarchy with parent keys. We are also able to identify which ones are the mandatory ones and optional ones. Is there anything else missing especially for the part on "set up/migrate environments (eg: VAPT)" and " harder to track"?

seaerchin commented 1 year ago

@alexanderleegs see

DB_ACQUIRE The maximum time, in milliseconds, that pool will try to get connection before throwing error currently set to 60000, will increase to 120000

we don't have anything to handle this, which is my concern