Separate secrets from the application config

ian-noaa commented 1 month ago

Motivation for the change

I'd like to separate secrets from the rest of the MATS application config. I'm hoping to do this so that we can:

- work more effectively with secrets in AWS via AWS Secrets Manager.
- remove the need for the mats-settings repo and fold the application config into our IaC repository.
- better reflect a 12-factor app approach to configuration (https://12factor.net/config).

Describe the change

I'd like to pull the database host, user, port, and password as well as the Mapbox API key out of our config files and inject them into the environment with environment variables. Technically, the DB port isn't a secret. However, treating it as one lets us control the DB connection from one location. Ideally, we'd be able to reuse the values in our various applications:

couchbase user, host, password, and port
mysql user, host, password, and port
mapbox key

The upside of doing this is that if we need to update a database user or the mapbox key for MATS, we can do it once, in one place, and the production applications will all pick up the new value.

From the application's standpoint, it will only need to look in the environment for these variables. We may want to use a dotenv package to load the variables, which would make local development easier. I'm imagining the following approaches for distributing secrets, depending on the environment.

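- local development - we use the industry-standard dotenv approach (https://www.npmjs.com/package/dotenv).
- local container - we use Docker's support for dotenv files: `docker run --env-file=.env` (https://docs.docker.com/reference/cli/docker/container/run/#env).
- Docker Compose - we use Docker Compose's dotenv support (https://docs.docker.com/compose/how-tos/environment-variables/set-environment-variables/#use-the-env_file-attribute).
- Kubernetes - in AWS we use Kubernetes Secrets (https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data) to populate the container environment. The Kubernetes Secrets are populated via the External Secrets Operator (https://external-secrets.io/latest/introduction/overview/), which gets values from AWS Secrets Manager (https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html).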

In all cases, the application solely looks for the environment variables and it's up to kubernetes/docker/the .env file to deliver the correct env variables.

After a quick survey of the existing secrets, I'd propose we have the application expect the env variables below. If we need more or bespoke env variables, we can add them as needed:

MATS_MYSQL_USER
MATS_MYSQL_HOST
MATS_MYSQL_PASS
MATS_MYSQL_PORT
MATS_COUCHBASE_USER
MATS_COUCHBASE_HOST
MATS_COUCHBASE_PASS
MATS_COUCHBASE_PORT
MATS_MAPBOX_KEY

Using ceil-vis as an example, we would then update the settings.json to:

Example settings.json file

```json
{
  "private": {
    "databases": [
      {
        "role": "sums_data",
        "type": "mysql",
        "status": "active",
        "database": "ceiling_sums2",
        "connectionLimit": 4
      },
      {
        "role": "meta_data",
        "type": "mysql",
        "status": "active",
        "database": "mats_common",
        "connectionLimit": 1
      },
      {
        "role": "couchbase",
        "type": "couchbase",
        "status": "active",
        "bucket": "vxdata",
        "scope": "_default",
        "collection": "SCORECARD_SETTINGS"
      }
    ],
    "PYTHON_PATH": "/usr/bin/python3"
  },
  "public": ...
}
```

The main difference is that I've removed the secrets and added a `type` field so we know which credentials to use. (Its value would match the `<TYPE>` segment in the middle of the `MATS_<TYPE>_*` DB env variables.) Arguably, the `type` field is unneeded, depending on how we use `role`.
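
As a rough illustration of how the `type` field could select credentials, here is a minimal sketch. The helper name `credentialsFor`, the shape of the returned object, and the sample values are all hypothetical; only the `MATS_<TYPE>_*` variable names come from the proposal above.

```javascript
// Hypothetical helper: look up the credentials for a database entry by its
// "type" field, using the MATS_<TYPE>_* naming scheme proposed above.
// `env` defaults to process.env but can be swapped for a plain object.
function credentialsFor(type, env = process.env) {
  const base = `MATS_${type.toUpperCase()}`;
  return {
    user: env[`${base}_USER`],
    host: env[`${base}_HOST`],
    password: env[`${base}_PASS`],
    port: env[`${base}_PORT`],
  };
}

// Stand-in environment with made-up values, used instead of process.env.
const exampleEnv = {
  MATS_MYSQL_USER: "example_user",
  MATS_MYSQL_HOST: "db.example.internal",
  MATS_MYSQL_PASS: "example_password",
  MATS_MYSQL_PORT: "3306",
};

// A database entry as it appears in the example settings.json.
const dbEntry = { role: "sums_data", type: "mysql", database: "ceiling_sums2" };
const mysqlCreds = credentialsFor(dbEntry.type, exampleEnv);
console.log(mysqlCreds.host);
```

Passing the environment in as an argument keeps the lookup easy to exercise without touching the real process environment.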

At application startup, we could grab the env vars, error and quit if they don't exist, and either add them to the settings object or just pull them whenever we need to create a database connection string. Credentials in AWS can be automatically rotated, so it'd be good to handle authorization errors and recheck the env vars a few times if authenticating fails.
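
The startup check and the retry idea could look something like the sketch below. This is not existing MATS code: `findMissingEnvVars`, `assertEnvVars`, `connectWithRetry`, and the `isAuthError` option are hypothetical names, and the caller is assumed to supply its own `connect` function and a way to recognize an authorization error. Re-reading the environment also only helps if the values can actually change while the process is running, which is an assumption here.

```javascript
// The env variables proposed above.
const REQUIRED_ENV_VARS = [
  "MATS_MYSQL_USER",
  "MATS_MYSQL_HOST",
  "MATS_MYSQL_PASS",
  "MATS_MYSQL_PORT",
  "MATS_COUCHBASE_USER",
  "MATS_COUCHBASE_HOST",
  "MATS_COUCHBASE_PASS",
  "MATS_COUCHBASE_PORT",
  "MATS_MAPBOX_KEY",
];

// Return the names of required variables that are unset or empty.
function findMissingEnvVars(env = process.env, required = REQUIRED_ENV_VARS) {
  return required.filter((name) => env[name] === undefined || env[name] === "");
}

// Throw at startup if anything is missing, so the app fails fast.
function assertEnvVars(env = process.env) {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}

// Hypothetical retry wrapper: re-read the credentials before every attempt
// and retry only when the caller-supplied predicate says the failure was an
// authorization error. Any other error is rethrown immediately.
async function connectWithRetry(connect, readCredentials, options = {}) {
  const { attempts = 3, delayMs = 1000, isAuthError = () => true } = options;
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await connect(readCredentials());
    } catch (err) {
      if (!isAuthError(err)) throw err;
      lastError = err;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Calling `assertEnvVars()` once at startup covers the "error and quit" case, while `connectWithRetry` would wrap whichever call opens the MySQL or Couchbase connection.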

mollybsmith-noaa commented 1 month ago

Just as a note, you will need the METexpress credentials as well to run our full slate of apps. Otherwise this looks good.

ian-noaa commented 1 month ago

Ah, great point. I suspect it'd make most sense to handle METexpress credentials separately, like below. This way both applications can evolve separately, and have bespoke application users if needed. What do you think?

METEXPRESS_MYSQL_USER
METEXPRESS_MYSQL_HOST
METEXPRESS_MYSQL_PASS
METEXPRESS_MYSQL_PORT
METEXPRESS_MAPBOX_KEY

If we need a couchbase user, we can then add METEXPRESS_COUCHBASE_* like the MATS_COUCHBASE_*.

The only situation where this wouldn't work is if an application needs access to both the MATS & METEXPRESS MySQL databases. But then we'd just make up some type field for the database.
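
To make the naming scheme concrete, here is a small sketch that builds the database variable names from an application prefix and a database type. The function name `dbEnvVarNames` is hypothetical; the prefixes and the `USER`/`HOST`/`PASS`/`PORT` suffixes are the ones listed in this thread.

```javascript
// Hypothetical helper: list the DB env variable names for one application
// prefix (e.g. "MATS" or "METEXPRESS") and one database type.
function dbEnvVarNames(appPrefix, type) {
  const base = `${appPrefix}_${type.toUpperCase()}`;
  return ["USER", "HOST", "PASS", "PORT"].map((field) => `${base}_${field}`);
}

const metexpressMysql = dbEnvVarNames("METEXPRESS", "mysql");
console.log(metexpressMysql.join("\n"));
// prints:
// METEXPRESS_MYSQL_USER
// METEXPRESS_MYSQL_HOST
// METEXPRESS_MYSQL_PASS
// METEXPRESS_MYSQL_PORT
```

An application that needed both database instances could call the helper once per prefix, which is one way the extra type field mentioned above might be avoided.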

mollybsmith-noaa commented 1 month ago

Yeah, I think this looks good for now. I don’t know why METexpress would ever need to access the MATS database instance.

ian-noaa commented 1 month ago

One other note - env variables can be accessed in Meteor like:

```javascript
// Read environment variables on the server side only.
if (Meteor.isServer) {
  console.log(process.env.MY_ENV);
}
```

NOAA-GSL / MATS

Separate secrets from the application config #1212
