opentrials / opentrials

OpenTrials is an app to explore, discover, and submit information on clinical trials.
http://explorer.opentrials.net/

Unified deployment #31

Closed roll closed 7 years ago

roll commented 8 years ago

We use different environment variables across the whole project. They are used by apps (via .env files), set on Travis CI/CD, and sometimes exported (set -a; source .env) into the dev environment for docker-compose.
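As a rough illustration of the `set -a; source .env` export mentioned above, here is a minimal POSIX shell sketch. The helper name `load_env` is hypothetical and not part of the project; it assumes the file contains plain `KEY=value` lines.

```shell
# Hypothetical helper: export every KEY=value assignment from an env file
# into the current shell, then switch auto-export back off.
load_env() {
  env_file="${1:-.env}"
  # `.` searches PATH for names without a slash, so force a relative path.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi
  set -a          # mark every variable assigned from now on for export
  . "$env_file"   # POSIX spelling of `source`
  set +a          # stop auto-exporting
}
```

A developer could then run something like `load_env .env && docker-compose up`. Because the variables land in the developer's own shell, names from different projects can collide there, which is the concern the prefix discussion is about.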

I suppose the good practice will be to have some naming conventions like:

The main goal is to have env variables that can live in any namespace without collisions. For example, OPENTRIALS_DATABASE_URL_STAGING will be unique.

For example, scraper and warehouse currently use:

OPENTRIALS_WAREHOUSE_URL
OPENTRIALS_DATABASE_URL
OPENTRIALS_ICTRP_USER
OPENTRIALS_ICTRP_PASS

roll commented 8 years ago

cc @vitorbaptista @pwalsh

vitorbaptista commented 8 years ago

Considering that the deployments are isolated inside Docker or Heroku, do we need to add an OPENTRIALS_ prefix? I mean, there won't be another app running in that environment, so we should never have name collisions (if we do, there's a bug somewhere).

Also, we shouldn't add the current environment name to the env name as in OPENTRIALS_DATABASE_URL_STAGING, but have one canonical like OPENTRIALS_DATABASE_URL.

pwalsh commented 8 years ago

@vitorbaptista several containers can be running on one host/cluster. The env vars apply across the whole host/cluster on Docker Cloud (I hope I'm not mistaken there). So we may need that. @roll can you confirm?

roll commented 8 years ago

The main reason we need this is that docker-compose doesn't properly support .env files. So to run docker-compose locally, a developer has to export the environment variables into the local bash process (where they can collide with other projects).

For production it's fine not to have prefixes, but for dev envs they are highly recommended. Also, for me it's just good practice to start with prefixes, so we don't have to change everything later when we run into a new use case (changing the deployment system, etc.) where prefixes are required.

roll commented 8 years ago

Also, we shouldn't add the current environment name to the env name as in OPENTRIALS_DATABASE_URL_STAGING, but have one canonical like OPENTRIALS_DATABASE_URL.

With the docker-cloud workflow I suppose we should, because all our environment variables will be stored here - https://travis-ci.org/opentrials/api/settings - and nothing else will go to the app containers. So it looks like, to support a staging env, we need to pass variables with an env modifier and manage them at the app level, like:

# settings
if production
     use production env var

I haven't really looked into it yet.

vitorbaptista commented 8 years ago

I have no experience with docker-compose, just starting to see how things fit together with Tutum, but they do seem to support .env files (as per https://docs.docker.com/compose/compose-file/). Also, the environment variables shouldn't be "global", but only applied to the server process. There's no harm in using the prefixes, though, so I'm happy to add them.

With the docker-cloud workflow I suppose we should, because all our environment variables will be stored here - https://travis-ci.org/opentrials/api/settings - and nothing else will go to the app containers. So it looks like, to support a staging env, we need to pass variables with an env modifier and manage them at the app level, like:

# settings
if production
     use production env var

Is the plan to automatically deploy to both staging and production with Travis? I'd be a bit worried in that case, as I don't think our test suite is up to that level. It would need, for example, some end-to-end tests. What I have usually seen in my experience with Continuous Deployment is deploying automatically to a staging site only. If staging seems good, it can then be promoted to production manually (as in clicking a button).

Also, from the 12 Factor App page about configs (http://12factor.net/config):

Another aspect of config management is grouping. Sometimes apps batch config into named groups (often called “environments”) named after specific deploys, such as the development, test, and production environments in Rails. This method does not scale cleanly: as more deploys of the app are created, new environment names are necessary, such as staging or qa. As the project grows further, developers may add their own special environments like joes-staging, resulting in a combinatorial explosion of config which makes managing deploys of the app very brittle.

In a twelve-factor app, env vars are granular controls, each fully orthogonal to other env vars. They are never grouped together as “environments”, but instead are independently managed for each deploy. This is a model that scales up smoothly as the app naturally expands into more deploys over its lifetime.

We sometimes need to check the current environment (for example, to disable debugging when in production), but these checks are minimal and can use the ENV variable. For things like database, etc., we'd better have a single variable name and change it as needed.

Couldn't these ENV variables live in Tutum cloud instead of Travis? This way Travis will only be responsible for deploying to Tutum, and after that all configuration takes place over there.

roll commented 8 years ago

@vitorbaptista Yes, of course it will be an ENV var:

stacks/api.yml

server:
  image: opentrialsrobot/api
  restart: always
  command: ENV=PRODUCTION node --use_strict server.js
  environment:
    OPENTRIALS_DATABASE_URL:
    OPENTRIALS_SEARCHENGINE_URL:

stacks/api--staging.yml

server:
  image: opentrialsrobot/api
  restart: always
  command: ENV=STAGING node --use_strict server.js
  environment:
    OPENTRIALS_DATABASE_URL:
    OPENTRIALS_SEARCHENGINE_URL:

The problem here is that docker-cloud doesn't have an env vars settings section like Heroku (because of the nature of containers), so all information has to be passed during the Travis deploy. We have only one Travis project per repository, which means we have to pass sensitive data in environment variables with markers (prod, stage) just to put all of our credentials into the Tutum stacks.

About deployment: for now it deploys only on commits with a [deploy] mark in the commit message. With different envs we could have [staging] and [production] marks. And even this is not a real deployment (we could change that) - we still need to push the redeploy button on Tutum (after the make-database-migration stack has run).

There are many options for how to do it (prod/stage). Let's return to it when we deploy something, so you can see how it works on an example.
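To make the idea of commit-message marks concrete, here is a small shell sketch. The function name `deploy_target` is hypothetical, and letting [production] win when both marks are present is an assumption the thread does not specify.

```shell
# Hypothetical helper: map a commit message to a deploy target.
# Prints "production", "staging", or nothing when no marker is present.
# Assumption: if both markers appear, production takes precedence.
deploy_target() {
  case "$1" in
    *"[production]"*) echo production ;;
    *"[staging]"*)    echo staging ;;
  esac
}
```

On Travis the message could be read with `git show -s --format=%B "$TRAVIS_COMMIT"` and passed to the function as its only argument.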

roll commented 8 years ago

Sorry, the second listing was for stacks/api--staging.yml (fixed).

pwalsh commented 8 years ago

@roll you can set env vars in the Docker Cloud UI, just like with Heroku, so there is no need to hard code values in codebase, as far as I see.

roll commented 8 years ago

@pwalsh how? Only in stack definitions, AFAIK - I've set up Travis to do it automatically.

It's not about hard-coding in the codebase, of course (these are our credentials). Everything lives here - https://travis-ci.org/opentrials/processors/settings - to create stacks with these values.

pwalsh commented 8 years ago

yes, you are right, only as part of a stack definition.

roll commented 8 years ago

High-level:

With this approach we have only one place to store all our env vars: the Travis settings. So we don't have a separate place for the staging env, which means it has to store both stage and prod credentials, suffixed, as far as I can see for now.

But that's only a detail of the deployment technique. After all, we still have a 12 Factor App compatible deployment (we will have api and api--staging apps with the ENV var correctly set, like 2 distinct apps on Heroku). It's just some adjustments to work the containerized way.

roll commented 8 years ago

Morning :) As I said, we were discussing just details. I've figured out how to adjust a few of them to make it fully 12 Factor App-like:

  1. In the stacks directory we have stacks without environment modifiers, like:

stacks/api.yml

server:
  image: opentrialsrobot/api
  restart: always
  command: node --use_strict server.js
  # no values means read from environment on docker-compose or docker-cloud call
  environment:
    ENV:
    OPENTRIALS_DATABASE_URL:
    OPENTRIALS_SEARCHENGINE_URL:

  2. On Travis we have all credentials set as secure env vars in settings:

https://travis-ci.org/opentrials/processors/settings

OPENTRIALS_DATABASE_URL=***
OPENTRIALS_DATABASE_URL__STAGING=***
OPENTRIALS_SEARCHENGINE_URL=***
OPENTRIALS_SEARCHENGINE_URL__STAGING=***

  3. In the Travis file we have a deployment router based on the commit message (it could be rebased onto tags or branches, whichever method we prefer):

.travis.yml

deploy:
  # run on commit like `commit [production]`
  - provider: script
    script: bash scripts/deploy.sh
    on:
      condition: '`git show -s --format=%B ${TRAVIS_COMMIT} | grep "\[production\]"`'
  # run on commit like `commit [staging]`
  # or we could have here deploy always
  - provider: script
    script: bash scripts/deploy--staging.sh
    on:
      condition: '`git show -s --format=%B ${TRAVIS_COMMIT} | grep "\[staging\]"`'

  4. In the scripts folder we have these deployment scripts with environment modifiers:

scripts/deploy.sh

export ENV=PRODUCTION
# put api to `api` stack on tutum (could be automated for set of stacks)
# it could be not fully automated deploy with
# human interaction in Tutum dashboard - TBD

scripts/deploy--staging.sh

export ENV=STAGING
# copy the staging values into the canonical names (note the `$`)
export OPENTRIALS_DATABASE_URL="${OPENTRIALS_DATABASE_URL__STAGING}"
export OPENTRIALS_SEARCHENGINE_URL="${OPENTRIALS_SEARCHENGINE_URL__STAGING}"
# put api to `api--staging` stack on tutum (could be automated for set of stacks)
# it could be fully automated deploy
# with `tutum` CLI we could run migration
# and redeploy running stacks - TBD

So voila, it's a full analogy with the Heroku approach.

About env var prefixing - I really don't know. With this approach it's just a problem of potential collisions on the developer's machine. If we (developers) can live with it, we don't need prefixes.
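The staging script above remaps each variable by hand. As a sketch of how that remapping could be automated for any number of variables, the hypothetical helper below (`use_staging_vars` is not a name from the project) copies every exported `NAME__STAGING` value into `NAME`. It assumes the values contain no newlines and that `printenv` is available on the build machine.

```shell
# Hypothetical helper: for every exported NAME__STAGING variable,
# export NAME with the staging value, so stack files only see NAME.
# Assumes values contain no newlines (the `env` output is parsed per line).
use_staging_vars() {
  for staging_name in $(env | sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*__STAGING\)=.*/\1/p'); do
    name="${staging_name%__STAGING}"              # strip the suffix
    export "$name=$(printenv "$staging_name")"    # overwrite the canonical name
  done
  export ENV=STAGING
}
```

Calling `use_staging_vars` at the top of a staging deploy script would replace the individual `export` lines, at the cost of being less explicit about which variables are overridden.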

vitorbaptista commented 8 years ago

@roll What if we kept the environment variables in Docker Cloud instead of Travis? I think we can avoid the problems that your solutions number 2 and 4 are fixing.

Say we keep the file stacks/api.yml as you describe, but when deploying on Travis, instead of fixing the env variables and running tutum CLI, we simply build the docker image and push to Docker Hub. "Deployment", in this context, is building and pushing the docker image.

We can configure Docker Hub to hit a Tutum webhook when a new image is published, triggering a rebuild of the containers on their side (Tutum's). This is the deployment in the usual "sending new code to a server" sense. Here, the deployment is finished.

I drew a flowchart to make it easier to explain:

[Image: opentrials_deployment_process]

Read it from top-left to bottom-right. Notice that the only mention of the environment (staging or production) is when tagging an image as "latest" or "staging". There are no secrets like database URLs here. These values are in the Docker Cloud configuration.
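A minimal shell sketch of the tagging step described above. The helper names are hypothetical, and the mapping of production to "latest" is an assumption, since the flowchart itself is not reproduced here. The second function only prints the `docker build` and `docker push` commands (a dry run) rather than executing them.

```shell
# Hypothetical helper: map a deploy environment to the image tag
# mentioned in the thread. Assumption: production -> "latest".
image_tag() {
  case "$1" in
    production) echo latest ;;
    staging)    echo staging ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Print the build-and-push commands instead of running them (dry run).
publish_commands() {
  image="$1"
  tag="$(image_tag "$2")" || return 1
  echo "docker build -t $image:$tag ."
  echo "docker push $image:$tag"
}
```

For example, `publish_commands opentrialsrobot/api staging` prints the two commands for the staging image. No database URL or other secret appears anywhere in this step, which is the point being made.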

For it to work, we would need to do some preparation work on Docker Cloud that (AFAIK) can't be automated. We need to create the stack using the stacks/api.yml as base, but adding the environment variables ourselves. We then need to configure it to receive the webhook from Docker Hub whenever a new image is pushed.

After this is configured, we just need to make sure any changes to stacks/api.yml are also changed in Docker Cloud, as they aren't updated automatically. This is the drawback of this process.

On the other hand, we keep everything related to the configuration of the containers (i.e. their environment) in Docker Cloud.

I think this is simpler and can be more easily expanded to more than 2 environments (say staging, master, production). This may not be that important for OpenTrials, but as we're trying to replicate this process in other projects, it might come in handy.

What do you think?

pwalsh commented 8 years ago

@roll @vitorbaptista we are moving to Heroku for main app deployment, and we'll keep using Docker Cloud for the collectors. I'll keep this issue open on the backlog.

vitorbaptista commented 7 years ago

As we're using Heroku now, probably moving to Deis in the future, I don't think this issue is useful anymore. I'm closing it, but feel free to reopen if you think otherwise.