datamade / how-to

📚 Doing all sorts of things, the DataMade way
MIT License

Document evolving thoughts on staging and production instances #217

Closed: hancush closed 11 months ago

hancush commented 3 years ago

Documentation request

In the past few weeks, we've had to reckon with the tension between our theory that staging sites should be volatile and our practice of using staging sites to create content and data that is then promoted to production.

We would like to align our practices more with the theory of volatility. This has important implications for when we create a production database and for which resources we configure (or don't) for staging.

Loose notes on changes:

Create or expand the appropriate documentation, with links to helpful configuration (like the one that checks for AWS creds and falls back to local storage if they aren't provided).

hancush commented 3 years ago

  1. Research drafting workflow (Wagtail)
  2. Wagtail/Django setting to fall back to local storage if S3 not configured (related to #143)
  3. Create S3 user/policy for use by staging sites, if needed

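
For the storage fallback in item 2, here is a minimal sketch of what such a setting could look like. It is an illustration, not the configuration referenced in #143. It assumes django-storages provides the S3 backend, that the three environment variables below are enough to decide whether S3 is configured, and that the project still uses the older `DEFAULT_FILE_STORAGE` setting (Django 4.2+ moves this into `STORAGES`). The `storage_settings` helper and the `media` directory name are hypothetical.

```python
import os

# Variable names follow django-storages' S3 settings; treating these three
# as the "is S3 configured?" check is an assumption for this sketch.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_STORAGE_BUCKET_NAME",
)


def storage_settings(environ=os.environ, base_dir="."):
    """Return Django storage settings for S3, or local disk if creds are missing."""
    if all(environ.get(name) for name in REQUIRED_AWS_VARS):
        # All credentials present: store uploads in S3 via django-storages.
        settings = {name: environ[name] for name in REQUIRED_AWS_VARS}
        settings["DEFAULT_FILE_STORAGE"] = "storages.backends.s3boto3.S3Boto3Storage"
        return settings

    # Any credential missing or empty: fall back to the local filesystem.
    return {
        "DEFAULT_FILE_STORAGE": "django.core.files.storage.FileSystemStorage",
        "MEDIA_ROOT": os.path.join(base_dir, "media"),
        "MEDIA_URL": "/media/",
    }


# Hypothetical usage in settings.py:
# globals().update(storage_settings(base_dir=BASE_DIR))
```

With this shape, a staging site with no AWS variables set writes uploads to local disk, which fits the idea that staging content is disposable.
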
hancush commented 3 years ago

  1. Strategy for auto-flushing staging databases at a regular interval to enforce volatility

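
One possible shape for the auto-flush, sketched under assumptions rather than as a decided strategy: a scheduled job (cron, a platform scheduler, etc.) calls Django's built-in `flush` management command, but only where an opt-in flag is set. The `ALLOW_DATABASE_FLUSH` variable and both function names are hypothetical. The flag is set only on staging, so production can never be flushed by accident.

```python
import subprocess
import sys


def build_flush_command(allow_flush, manage_py="manage.py"):
    """Return the argv for Django's `flush`, refusing unless explicitly allowed."""
    if allow_flush != "true":
        raise RuntimeError("Refusing to flush: ALLOW_DATABASE_FLUSH is not 'true'")
    # `flush --no-input` removes all data without prompting; the record of
    # applied migrations is kept, so the schema stays in place.
    return [sys.executable, manage_py, "flush", "--no-input"]


def flush_staging(environ):
    """Run the flush; intended to be called from a scheduled job."""
    command = build_flush_command(environ.get("ALLOW_DATABASE_FLUSH", ""))
    subprocess.run(command, check=True)
```

Using an explicit opt-in flag instead of checking an environment name keeps the code from needing to know whether it is "staging" or "production".
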
hancush commented 3 years ago

Helpful docs on how to create security policy, IAM user, and S3 bucket c/o @smcalilly: https://gist.github.com/smcalilly/2a14731e3075a27de6dc36cf0e75d1e1

smcalilly commented 11 months ago

We've separated staging and production on the same resources for the Agenda Watch project. We're using the same Postgres instance with separate staging and production databases within that instance. We're also doing this with Elasticsearch indexes.

This works fine, but it requires the code base to know which environment it is running in. That makes things harder for me to reason about, both when configuring builds and when writing code that has to pick the desired "environment" for a given build. We made this tradeoff to reduce hosting costs, since spinning up completely separate instances per environment costs more.

Sometimes this tradeoff is necessary, but in most cases a code base should have no knowledge of an environment.
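
To illustrate what "no knowledge of an environment" can look like, here is a minimal sketch that builds Django's database settings from a single connection URL. It is not the Agenda Watch configuration. The `database_config` helper, the `DATABASE_URL` variable, and the example host and database names are all hypothetical. Each deployment supplies its own URL, so staging and production can share one Postgres instance while the code never branches on an environment name.

```python
from urllib.parse import unquote, urlparse


def database_config(database_url):
    """Translate a postgres:// URL into a Django DATABASES entry."""
    parts = urlparse(database_url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


# Hypothetical usage in settings.py:
# DATABASES = {"default": database_config(os.environ["DATABASE_URL"])}
```

For example, staging could be given `postgres://app:secret@db.example.com:5432/agendawatch_staging` and production the same host with a different database name. The difference lives entirely in deployment configuration.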