NOAA-GSL / MATS

MATS is a quick & interactive way to view verification statistics
https://gsl.noaa.gov/mats/

Separate secrets from the application config #1212

Open ian-noaa opened 1 month ago

ian-noaa commented 1 month ago

Motivation for the change

I'd like to separate secrets from the rest of the MATS application config. I'm hoping to do this so that we can:

  1. work more effectively with secrets in AWS via AWS Secrets Manager.
  2. remove the need for the mats-settings repo and fold the application config into our IaC repository.
  3. better reflect a 12-factor app approach to configuration.

Describe the change

I'd like to pull the database host, user, port, and password as well as the Mapbox API key out of our config files and inject them into the environment with environment variables. Technically, the DB port isn't a secret. However, treating it as one lets us control the DB connection from one location. Ideally, we'd be able to reuse the values in our various applications:

  • couchbase user, host, password, and port
  • mysql user, host, password, and port
  • mapbox key

The upside of doing this is that if we need to update a database user or the mapbox key for MATS, we can do it once, in one place, and the production applications will all pick up the new value.

From the application's standpoint, it will only need to look in the environment for these variables. We may want to use a dotenv package to load variables from a .env file, which would make development easier. I'm imagining the following approaches for distributing secrets depending on the environment.

In all cases, the application solely looks for the environment variables and it's up to kubernetes/docker/the .env file to deliver the correct env variables.

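After a quick survey of the existing secrets, I'd propose we have the application expect the below env variables. If we need more/bespoke env variables, I think we'd add them as needed:

  • MATS_MYSQL_USER
  • MATS_MYSQL_HOST
  • MATS_MYSQL_PASS
  • MATS_MYSQL_PORT
  • MATS_COUCHBASE_USER
  • MATS_COUCHBASE_HOST
  • MATS_COUCHBASE_PASS
  • MATS_COUCHBASE_PORT
  • MATS_MAPBOX_KEY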
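As a rough illustration of what "expecting" these variables could look like, here is a minimal sketch of a fail-fast check. The helper names (`findMissingEnvVars`, `assertSecretsPresent`) are hypothetical and not part of MATS; the variable names are the ones proposed in this issue.

```javascript
// Env variable names proposed in this issue; extend the list as needed.
const REQUIRED_ENV_VARS = [
  "MATS_MYSQL_USER",
  "MATS_MYSQL_HOST",
  "MATS_MYSQL_PASS",
  "MATS_MYSQL_PORT",
  "MATS_COUCHBASE_USER",
  "MATS_COUCHBASE_HOST",
  "MATS_COUCHBASE_PASS",
  "MATS_COUCHBASE_PORT",
  "MATS_MAPBOX_KEY",
];

// Hypothetical helper: returns the names that are unset or empty.
function findMissingEnvVars(env = process.env, required = REQUIRED_ENV_VARS) {
  return required.filter(
    (name) => env[name] === undefined || env[name] === ""
  );
}

// Hypothetical helper: throws so the caller can log the problem and exit.
function assertSecretsPresent(env = process.env) {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing required env variables: ${missing.join(", ")}`);
  }
}
```

Passing `env` as a parameter keeps the check easy to exercise in tests without touching the real `process.env`.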

Using ceil-vis as an example, we would then update the settings.json to:

Example settings.json file

```json
{
  "private": {
    "databases": [
      {
        "role": "sums_data",
        "type": "mysql",
        "status": "active",
        "database": "ceiling_sums2",
        "connectionLimit": 4
      },
      {
        "role": "meta_data",
        "type": "mysql",
        "status": "active",
        "database": "mats_common",
        "connectionLimit": 1
      },
      {
        "role": "couchbase",
        "type": "couchbase",
        "status": "active",
        "bucket": "vxdata",
        "scope": "_default",
        "collection": "SCORECARD_SETTINGS"
      }
    ],
    "PYTHON_PATH": "/usr/bin/python3"
  },
  "public": ...
}
```

The main difference is that I've removed the secrets, and added a type field so we know which credentials to use. (Its value would match the <TYPE> segment in the middle of the MATS_<TYPE>_* DB env variable names.) Arguably, the type field is unneeded depending on how we use role.
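To make the link between the type field and the env variable names concrete, here is a minimal sketch, assuming the MATS_<TYPE>_* naming above. The helper `credentialsForDatabase` and the shape of the object it returns are hypothetical.

```javascript
// Hypothetical helper: looks up connection credentials for one entry of
// settings.private.databases by reading MATS_<TYPE>_* from the environment.
function credentialsForDatabase(dbEntry, env = process.env) {
  // "mysql" -> "MATS_MYSQL", "couchbase" -> "MATS_COUCHBASE"
  const prefix = `MATS_${dbEntry.type.toUpperCase()}`;
  return {
    host: env[`${prefix}_HOST`],
    // Env values are always strings; the port is converted to a number here.
    port: Number(env[`${prefix}_PORT`]),
    user: env[`${prefix}_USER`],
    password: env[`${prefix}_PASS`],
  };
}
```

With this approach, an entry such as `{ "role": "sums_data", "type": "mysql" }` would pick up the MATS_MYSQL_* values without the settings file holding any secrets.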

At application startup, we could grab the env vars, error and quit if they don't exist, and either add them to the settings object or just pull them whenever we need to create a database connection string. Credentials in AWS can be automatically rotated, so it'd be good to handle authorization errors and recheck the env vars a few times if authenticating fails.
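The retry idea could be sketched roughly as follows. Everything here is an assumption for illustration: `connectWithRetry`, the `readCredentials` and `connect` callbacks, and the `"AUTH_FAILED"` error code are hypothetical, and the `isAuthError` check would need to be adapted to whatever the actual database driver reports.

```javascript
// Hypothetical helper: retries `connect` a few times on authorization errors,
// re-reading the credentials before each attempt so new values are picked up.
async function connectWithRetry(
  readCredentials,
  connect,
  {
    attempts = 3,
    delayMs = 1000,
    // Placeholder check; real drivers use their own error codes.
    isAuthError = (err) => Boolean(err) && err.code === "AUTH_FAILED",
  } = {}
) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await connect(readCredentials());
    } catch (err) {
      // Only authorization failures are retried; anything else is rethrown.
      if (!isAuthError(err)) throw err;
      lastError = err;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Taking `readCredentials` as a callback keeps the retry logic independent of where the values come from.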

mollybsmith-noaa commented 1 month ago

Just as a note, you will need the METexpress credentials as well to run our full suite of apps. Otherwise this looks good.

On Mon, Sep 30, 2024 at 10:49 AM Ian McGinnis wrote:

Motivation for the change

I'd like to separate secrets from the rest of the MATS application config. I'm hoping to do this so that we can:

  1. work more effectively with secrets in AWS via AWS Secrets Manager.
  2. remove the need for the mats-settings repo and fold the application config into our IaC repository.
  3. better reflect a 12-factor app approach to configuration https://12factor.net/config.

Describe the change

I'd like to pull the database host, user, port, and password as well as the Mapbox API key out of our config files and inject them into the environment with environment variables. Technically, the DB port isn't a secret. However, treating it as one lets us control the DB connection from one location. Ideally, we'd be able to reuse the values in our various applications:

  • couchbase user, host, password, and port
  • mysql user, host, password, and port
  • mapbox key

The upside of doing this is that if we need to update a database user or the mapbox key for MATS, we can do it once, in one place, and the production applications will all pick up the new value.

From the application's standpoint, it will only need to look in the environment for these variables. We may want to use a dotenv package to load variables from a .env file, which would make development easier. I'm imagining the following approaches for distributing secrets depending on the environment.

In all cases, the application solely looks for the environment variables and it's up to kubernetes/docker/the .env file to deliver the correct env variables.

After a quick survey of the existing secrets, I'd propose we have the application expect the below env variables. If we need more/bespoke env variables, I think we'd add them as needed:

  • MATS_MYSQL_USER
  • MATS_MYSQL_HOST
  • MATS_MYSQL_PASS
  • MATS_MYSQL_PORT
  • MATS_COUCHBASE_USER
  • MATS_COUCHBASE_HOST
  • MATS_COUCHBASE_PASS
  • MATS_COUCHBASE_PORT
  • MATS_MAPBOX_KEY

Using ceil-vis as an example, we would then update the settings.json to:

Example settings.json file

{
  "private": {
    "databases": [
      {
        "role": "sums_data",
        "type": "mysql",
        "status": "active",
        "database": "ceiling_sums2",
        "connectionLimit": 4
      },
      {
        "role": "meta_data",
        "type": "mysql",
        "status": "active",
        "database": "mats_common",
        "connectionLimit": 1
      },
      {
        "role": "couchbase",
        "type": "couchbase",
        "status": "active",
        "bucket": "vxdata",
        "scope": "_default",
        "collection": "SCORECARD_SETTINGS"
      }
    ],
    "PYTHON_PATH": "/usr/bin/python3"
  },
  "public": ...
}

The main difference is that I've removed the secrets, and added a type field so we know which credentials to use. (Its value would match the <TYPE> segment in the middle of the MATS_<TYPE>_* DB env variable names.) Arguably, the type field is unneeded depending on how we use role.

At application startup, we could grab the env vars, error and quit if they don't exist, and either add them to the settings object or just pull them whenever we need to create a database connection string. Credentials in AWS can be automatically rotated, so it'd be good to handle authorization errors and recheck the env vars a few times if authenticating fails.

View it on GitHub: https://github.com/NOAA-GSL/MATS/issues/1212

ian-noaa commented 1 month ago

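Ah, great point. I suspect it'd make most sense to handle METexpress credentials separately, like below. This way both applications can evolve separately, and have bespoke application users if needed. What do you think?

  • METEXPRESS_MYSQL_USER
  • METEXPRESS_MYSQL_HOST
  • METEXPRESS_MYSQL_PASS
  • METEXPRESS_MYSQL_PORT
  • METEXPRESS_MAPBOX_KEY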

If we need a couchbase user, we can then add METEXPRESS_COUCHBASE_* like the MATS_COUCHBASE_*.

The only situation where this wouldn't work is if an application needs access to both the MATS & METEXPRESS MySQL databases. But then we'd just make up some type field for the database.
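For that edge case, one possible shape is an explicit lookup from the type field to an env variable prefix. This is only a sketch: the `metexpress_mysql` type value, the `ENV_PREFIX_BY_TYPE` table, and both helper functions are made up for illustration.

```javascript
// Hypothetical lookup from a settings.json "type" value to the env variable
// prefix holding its credentials. "metexpress_mysql" is an invented type.
const ENV_PREFIX_BY_TYPE = new Map([
  ["mysql", "MATS_MYSQL"],
  ["couchbase", "MATS_COUCHBASE"],
  ["metexpress_mysql", "METEXPRESS_MYSQL"],
]);

// Hypothetical helper: fails loudly on a type with no configured prefix.
function envPrefixForType(type) {
  const prefix = ENV_PREFIX_BY_TYPE.get(type);
  if (prefix === undefined) {
    throw new Error(`No env variable prefix configured for type "${type}"`);
  }
  return prefix;
}

// Hypothetical helper: reads the four connection values for a given type.
function credentialsForType(type, env = process.env) {
  const prefix = envPrefixForType(type);
  return {
    user: env[`${prefix}_USER`],
    host: env[`${prefix}_HOST`],
    password: env[`${prefix}_PASS`],
    port: env[`${prefix}_PORT`],
  };
}
```

An application that talks to both MySQL instances could then list one database entry with `"type": "mysql"` and another with `"type": "metexpress_mysql"`.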

mollybsmith-noaa commented 1 month ago

Yeah, I think this looks good for now. I don’t know why METexpress would ever need to access the MATS database instance.

On Mon, Sep 30, 2024 at 11:51 AM Ian McGinnis wrote:

Ah, great point. I suspect it'd make most sense to handle METexpress credentials separately, like below. This way both applications can evolve separately, and have bespoke application users if needed. What do you think?

  • METEXPRESS_MYSQL_USER
  • METEXPRESS_MYSQL_HOST
  • METEXPRESS_MYSQL_PASS
  • METEXPRESS_MYSQL_PORT
  • METEXPRESS_MAPBOX_KEY

If we need a couchbase user, we can then add METEXPRESS_COUCHBASE_* like the MATS_COUCHBASE_*.

The only situation where this wouldn't work is if an application needs access to both the MATS & METEXPRESS MySQL databases. But then we'd just make up some type field for the database.


ian-noaa commented 1 month ago

One other note - env variables can be accessed in Meteor like:

if (Meteor.isServer) {
  console.log(process.env.MY_ENV);
}