bhgrant8 opened this issue 6 years ago
Helluva good thought Brian, I’ve been meaning to do something similar, so I’ll contribute here instead.
First Observation: The .travis.yml Files
Looking over the .travis.yml files from last year, all projects seemed to follow a basic pattern. I'll copy the Team Budget example here:
sudo: required
services:
- docker
install:
- pip install --upgrade --user awscli
before_script:
- ./budget_proj/bin/getconfig.sh
script:
- './budget_proj/bin/test-proj.sh -t'
after_success:
- ./budget_proj/bin/docker-push.sh
Breaking this down:
sudo: required
- allows Travis commands to be run with sudo in the build environment
services:
- docker
install:
- pip install --upgrade --user awscli
So two things here. First, install
is a step in Travis's build lifecycle - this step installs any dependencies we need at the OS level; basically this is more related to AWS/deploy and not anything needed within the Docker container or Django project.
Second, the pip install command installs the AWS CLI, which will be used to deploy the built container to AWS ECR (Elastic Container Registry).
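One detail worth noting here (it helps explain the PATH exports that show up in the shell scripts later): pip install --user drops the aws binary into ~/.local/bin, which may not be on the default PATH in the Travis environment, so scripts that call the AWS CLI first do something like:
# Make the user-installed awscli visible to subsequent commands
export PATH=$PATH:$HOME/.local/bin
aws --version   # should now resolve to the awscli installed in the install step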
before_script:
- ./budget_proj/bin/getconfig.sh
before_script
is the next step in the build lifecycle. If this script exits with a non-zero code, it breaks the build immediately, so any config that needs to happen (and needs to succeed) goes into this step. The getconfig.sh
script was a pattern used last year to pull database and other Python variables into the Docker container; we are most likely moving away from this pattern.
script:
- './budget_proj/bin/test-proj.sh -t'
The script command is where the bulk of the work in Travis happens. It should include any project build and test tasks. If it exits non-zero, the build will be marked as failed, but it will still continue through to the after_failure step.
The test-proj.sh script builds the containers, then runs the test-entrypoint.sh script, which runs the tests.
after_success:
- ./budget_proj/bin/docker-push.sh
Provided the script command exits with a success code, the after_success
command will be run.
The docker-push.sh file essentially verifies whether the current build is a (non-pull-request) build of the master branch - i.e. a merge to master. Only in that case does it run the ecs-deploy script, which deploys the successfully built and tested containers to AWS using the awscli client that was installed.
So we seem to see 3 main tasks in this:
- pulling config/secrets into the build (getconfig.sh)
- building and testing the containers (test-proj.sh)
- pushing and deploying the built image to AWS (docker-push.sh)
Observations:
Overall this seems like a fairly basic pattern to continue to use, unless we see a specific reason to change it, such as removing the getconfig complexities. There may be some opportunities to use more of the build lifecycle steps to our advantage - for example, possible alerting on after_failure? The "deploy" step is intriguing; however, I believe it only works if you are using a supported deploy provider, which I am not sure we fit into.
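If we did want to experiment with after_failure alerting, it could be as simple as one more script next to the existing bin/ scripts. A rough sketch (the script name and the SLACK_WEBHOOK_URL env var are hypothetical - nothing like this existed in the 2017 repos):
#!/bin/bash
# bin/notify-failure.sh (hypothetical) - wired in via an after_failure: step in .travis.yml.
# Posts a short message to a chat webhook using Travis's built-in env vars.
curl -s -X POST -H 'Content-Type: application/json' \
  --data "{\"text\": \"Build #${TRAVIS_BUILD_NUMBER} failed: ${TRAVIS_REPO_SLUG} (${TRAVIS_BRANCH})\"}" \
  "$SLACK_WEBHOOK_URL"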
Last year's examples:
where / when does the "docker-compose" happen, if it does?
Build and Testing Scripts
docker-compose
operations were run as part of the "build-test" script run in the "script" step of the Travis build lifecycle, and thus feed into either "after_success" or "after_failure".
One example again is the "team-budget" script. (As budget situated the API within a sub-directory of the repo, many of the example scripts include $PROJ_SETTINGS_DIR in directory paths; this would not be used in the exemplar, where the API is at the root of the repo):
https://github.com/hackoregon/team-budget/blob/master/budget_proj/bin/test-proj.sh
# Run all configured unit tests inside the Docker container
while getopts ":lt" opt; do
case "$opt" in
l)
docker-compose -f $PROJ_SETTINGS_DIR/local-docker-compose.yml build
docker-compose -f $PROJ_SETTINGS_DIR/local-docker-compose.yml run \
--entrypoint /code/bin/test-entrypoint.sh $DOCKER_IMAGE
;;
t)
docker-compose -f $PROJ_SETTINGS_DIR/travis-docker-compose.yml build
docker-compose -f $PROJ_SETTINGS_DIR/travis-docker-compose.yml run \
--entrypoint /code/bin/test-entrypoint.sh $DOCKER_IMAGE
;;
*)
usage
;;
esac
done
So we see two flags:
-l - this is the local build and test; it connects to the local image
-t - this is the travis build and test; it connects to the travis image
While building the images pulls in different compose files, we do use the same test-entrypoint.sh
in each environment (commented-out lines removed for clarity):
#!/bin/bash
export PATH=$PATH:~/.local/bin
python manage.py test --no-input --keepdb
I am not completely sure why we needed to update the PATH here?
In terms of the script:
python manage.py test --no-input --keepdb
we see the basic manage.py test
being run. The --no-input
flag prevents the command from prompting the user for input, allowing it to run unattended.
Most important is the --keepdb
flag, meaning the database that the tests run against persists from one test run to the next. Emergency Response followed this pattern as well; it ran all tests read-only against the production database. I still have to look at budget to see if this is the same (future post).
Observations
We are using the same script to accomplish two tasks: building a container, then testing it. There is an entrypoint script to override the default entrypoint that is run in the containers for docker-compose up. We may need to use a --noinput
flag to make sure the script does not stop and wait for user input. Connecting to a persistent database is a path some projects used; when doing so, no migrations were run, to prevent any changes to the db.
Other Examples
So this is one area where there is some differentiation worth looking into:
Transportation - This project does not use the "keepdb" option, so it looks like it was spinning up a test database on each run
Homeless - Mostly the same as Transportation
Housing - Housing used py.test instead of the built-in testing suite; not too familiar with py.test, but it might have some specific value?
Ah - so the actual work is done in shell scripts, not in .travis.yml.
Yeah, in our setup, I think once you get past very simple commands, doing so makes things a bit easier.
Testing Database Connections
So continuing to work through the testing setup: before we get to the tests themselves being run, let's look at the datastores that teams are connecting to for testing, and how.
Emergency Response
Starting here as I know the most.
When I came into the program to start building the API, we had a fairly developed database already live on AWS. I was given read-only creds to the prod AWS database. After hacking around some options, I ended up configuring my tests to run against the production database, since they were not creating or deleting any data.
This strategy involved:
- the --keepdb flag used in the test command mentioned above
- not letting Django prepend test_ to the database name; instead I pointed the test database back to the default db whenever test was in sys.argv:
https://github.com/hackoregon/emergency-response-backend/blob/master/emerresponseAPI/settings.py#L129
if 'test' in sys.argv or 'test_coverage' in sys.argv:
    DATABASES = {
        'default': {
            'ENGINE': project_config.AWS['ENGINE'],
            'NAME': project_config.AWS['NAME'],
            'HOST': project_config.AWS['HOST'],
            'PORT': 5432,
            'USER': project_config.AWS['USER'],
            'PASSWORD': project_config.AWS['PASSWORD'],
            'TEST': {
                'NAME': 'fire',
            },
        }
    }
Team Budget
I tried to step through the repo and could not find any specific test database config. As such, my assumption is that the deployed database included a test version as well, which was then persisted. Whether or not this is correct, it seems like a good pattern to not test directly against prod dbs while still using the same read-only creds. Questions: is this correct, or am I missing something? How would we create and then deploy the test version of the db - prior to the s3 upload, or could this replication be part of the devops process?
Team Housing
With housing using py.test, they supplied a pytest.ini which pointed to the test settings:
from .settings import *
DATABASES = {
"default": {
"ENGINE": "django.db.backends.sqlite3",
"NAME": ":memory:",
}
}
EMAIL_BACKEND = 'django.core.mail.backends.locmem.EmailBackend'
So we see a SQLite db being used in the test environment. Not an uncommon practice, but it doesn't really test the actual production database connection and services.
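For reference, a rough sketch of how py.test typically gets pointed at a settings module like this (assuming pytest-django is used; the module path below is made up, not necessarily what housing-backend had):
pip install pytest pytest-django
# pytest-django reads the settings module from the environment (or from pytest.ini)
DJANGO_SETTINGS_MODULE=housingAPI.tests.test_settings pytest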
Team Homeless
It appears that they are using Django's fixtures to provide some test data, but are not making a connection to an actual backend datastore. Similar to housing, this is a common pattern, but if we want to verify a functioning connection with the database, it does not accomplish that. If we are looking to test only the Python code, it is an acceptable option. https://github.com/hackoregon/teamHomelessness/blob/master/homelessAPI/homelessApp/tests.py#L6
Team Transportation
Guess you don't need a testing backend if you don't actually write any tests?
We didn't have any tests for Transportation ... the best guess as to what the final app looked like was the local development environment running on an Ubuntu 16.04.x LTS laptop. ;-)
https://github.com/hackoregon/transportation-backend/tree/master/ubuntu-local-deploy
In Budget's case, we knew from the start that we would never write to the database, so it never occurred to me that testing against the production database would be a risk. (It's only a risk if someone commits, and someone else merges, Django code that writes to the DB, but certainly a risk that grows the greater the distance from those tribal assumptions.)
Not sure what the best strategy is here - duplicating the databases in production is a huge waste of memory for 99% of the time, but I agree that testing against a local sqlite3 doesn’t catch one of our biggest dependencies.
In theory we could use separate creds (test creds = read-only), but if anyone plans to write to their DB then we’re hosed.
In a monied organisation we’d just have a separate test/QA infrastructure, but I am loathe to spend that kind of money on behalf of an org that just recently asked for tax-deductible individual donations.
I agree that in a tradeoff between budget and a "pristine" QA environment, our budget is the priority. I mostly wanted to make this decision explicit and documented.
Environment Variable usage
In 2017 API projects, the following env vars were configured in each Travis repo:
Examination of configured Travis env vars
In the analysis below, nearly all findings were based on the team-budget
repo. Variations between projects should be accounted for as well, but rather than wait until I had the extra hours to review those too, I'm posting this for others to build upon.
- travis-docker-compose.yml and local-docker-compose.yml
- ecs-deploy.sh as inherited environment variables - this is a script we cloned from (an AWS exemplar?) that is called by docker-push.sh to pull a Docker image from AWS ECR (Elastic Container Registry) and deploy it to the appropriate AWS ECS (EC2 Container Service) service
- aws ecr get-login --region $AWS_DEFAULT_REGION in the docker-push.sh script - which is used to push a copy of the built Docker image from Travis to AWS ECR - but I'm only 80% sure that aws ecr get-login uses the AWS creds to get a one-time login token that's used by docker push to authenticate to AWS ECR to push a new Docker image
- ecs-deploy.sh as an inherited environment variable - this is a script used in docker-push.sh to deploy a Docker image from our AWS ECR registry to the appropriate AWS ECS service
- project_config.py file, then re-building and re-deploying that project's container
- travis-docker-compose.yml and local-docker-compose.yml
- getconfig.sh to download a copy of project_config.py (which contains secrets configured as environment variables) into the Travis build environment (and which - the project_config.py - becomes embedded in the built Docker image that gets deployed to ECS via ECR)
- project_config.py secrets files for an "integration" (or "staging") infrastructure and a "production" infrastructure
- travis-docker-compose.yml and local-docker-compose.yml
- getconfig.sh to download a copy of project_config.py (which contains secrets configured as environment variables) into the Travis build environment (and which - the project_config.py - becomes embedded in the built Docker image that gets deployed to ECS via ECR)
- docker-push.sh as the "repo" name (in the DOCKER_REPO domain) to accomplish two things:
- manage.py and wsgi.py to distinguish between runtime settings needed for non-Docker usage vs those needed for Docker usage; defaults to the dev.py settings
- dev.py settings include SECRET_KEY, DEBUG, DATABASES {ENGINE, NAME, HOST, PORT, USER, PASSWORD} and settings for the debug_toolbar
- Dockerfile to enforce use of the production.py settings
- production.py settings include SECRET_KEY, ALLOWED_HOSTS (and its companion EC2_PRIVATE_IP), DATABASES {ENGINE, NAME, HOST, PORT, USER, PASSWORD}
- from .. import project_config
- travis-docker-compose.yml and local-docker-compose.yml
- docker_push.sh
- build-proj.sh, test-proj.sh, start-proj.sh
- project_config.py: getconfig.sh
- env.sh to … … … (?)
- docker-push.sh
- test-proj.sh
- travis-docker-compose.yml to tag the image, so that later the docker push command can find an image with the expected tag - otherwise docker push will return "An image does not exist locally with the tag: 845828040396.dkr.ecr.us-west-2.amazonaws.com/production/transportation-systems-service"
- env.sh to … … … (?)
- docker-push.sh
- travis-docker-compose.yml to tag the image, so that later the docker push command can find an image with the expected tag - otherwise docker push will return "An image does not exist locally with the tag: 845828040396.dkr.ecr.us-west-2.amazonaws.com/production/transportation-systems-service"
- docker-push.sh
- docker-push.sh
Implicit Travis env vars
Implicit Docker env vars
Env vars unique to projects
- emergency-response-backend, declared in settings.py - used to extend ALLOWED_HOSTS
- emergency-response-backend, declared in travis-docker-compose.yml as "False"
- teamhomelessness, declared in docker-compose.yml - used for ???
- emergency-response-backend, declared in travis-docker-compose.yml
- transportation-backend, declared in settings.py
- housing-backend, declared in settings.py - seems to be used to intentionally minimize the CORS permissions?
- housing-backend, declared in settings.py
- transportation-backend, declared in Travis Settings
Hard-coded environment variables
- project_config.py file (which contains secrets configured as environment variables) into the Travis build environment (and which become embedded in the built Docker image that gets deployed to ECS via ECR)
- team-budget repo, we added an env.sh script to make it easier on developers to set these to appropriate values
- team-budget repo, we hard-coded this env var in the -l case
QUESTION (maybe just for myself): when passed through docker-compose.yml as env vars, are the passed-in env vars implicitly used by anything else other than the /bin/ scripts?
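One way to answer that question empirically (a sketch; the compose file name is the team-budget one used above): render the fully-interpolated compose config and see exactly which services end up with which values:
# Substitutes ${VARS} from the current shell environment and prints the resolved
# config, so you can see where each passed-in env var actually lands.
docker-compose -f travis-docker-compose.yml config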
Proposal for Travis env vars
All other common env vars used in last year's Travis settings (excepting the DOCKER_USERNAME and DOCKER_PASSWORD used in transportation-backend) are still valid and useful.
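For anyone re-creating these by hand, the Travis CLI can set per-repo env vars. A sketch (the values are placeholders, and the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY names are the standard AWS ones, assumed here rather than taken from the 2017 repos):
# Non-secret values can be public (visible in the build log); secrets should not be.
travis env set AWS_DEFAULT_REGION us-west-2 --public
travis env set AWS_ACCESS_KEY_ID <key-id>
travis env set AWS_SECRET_ACCESS_KEY <secret>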
Travis configuration for builds
There are a number of basic settings in Travis that we use, in conjunction with communicated (tribal?) expectations, to enable Hack Oregon to get consistent builds and deploys:
emergency-response-backend and housing-backend
These settings only work because we have configured the docker-push.sh script to do the following:
# Tag, Push and Deploy only if it's not a pull request
if [ -z "$TRAVIS_PULL_REQUEST" ] || [ "$TRAVIS_PULL_REQUEST" == "false" ]; then
# Push only if we're testing the master branch
if [ "$TRAVIS_BRANCH" == "master" ]; then
This sets up a pattern of the following:
- pull-request builds are built and tested, but nothing is tagged, pushed, or deployed
- images are pushed and deployed only from builds of the master branch of the repo
How Travis hands off to AWS
This is due to the "magic" of the docker-push.sh script, e.g. in team-budget:
export PATH=$PATH:$HOME/.local/bin
echo Getting the ECR login...
eval $(aws ecr get-login --region $AWS_DEFAULT_REGION)
echo Running docker push command... # Troubleshooting
docker push "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest
echo Running ecs-deploy.sh script...
./$PROJ_SETTINGS_DIR/bin/ecs-deploy.sh \
-n "$ECS_SERVICE_NAME" \
-c "$ECS_CLUSTER" \
-i "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest \
--timeout 300
There are four key actions here:
Breaking this down...
export PATH=$PATH:$HOME/.local/bin
IIRC, this is here to ensure that the aws CLI (installed via .travis.yml) is on the $PATH
eval $(aws ecr get-login --region $AWS_DEFAULT_REGION)
- aws ecr get-login uses the AWS creds to generate a (time-limited) docker login command.
- eval immediately runs the docker login command, without printing the time-limited password to the Travis log.
- Once docker login runs, subsequent commands such as docker push and ecs-deploy.sh will be authenticated to the default ECR registry (i.e. the registry corresponding to the AWS user & AWS_DEFAULT_REGION).
docker push "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest
This pushes the image that was just built in the Travis environment (by build-proj.sh) up to the AWS ECR registry. IIUC, this pushes $DOCKER_IMAGE to the $DOCKER_REPO server, into the $DEPLOY_TARGET repository, applying the "latest" tag.
What mystifies me (despite great articles like this) is whether there's an implicit docker tag command being run elsewhere in our stack to have pre-tagged the image before we push it.
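For what it's worth, the tagging most likely happens at build time via the image name in travis-docker-compose.yml (consistent with the env var notes above). The explicit equivalent would be a docker tag before the push; a sketch (the local image name budget_api is assumed, not confirmed from the repo):
# Re-tag the locally built image with the fully-qualified ECR name so that
# `docker push` can find an image with the expected tag.
docker tag budget_api:latest "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest
docker push "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest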
./$PROJ_SETTINGS_DIR/bin/ecs-deploy.sh \
-n "$ECS_SERVICE_NAME" \
-c "$ECS_CLUSTER" \
-i "$DOCKER_REPO"/"$DEPLOY_TARGET"/"$DOCKER_IMAGE":latest \
--timeout 300
This final script is a third-party script that enables Travis to tell AWS to pull a copy of the $DEPLOY_TARGET/$DOCKER_IMAGE:latest from $DOCKER_REPO and deploy it to the $ECS_SERVICE_NAME in $ECS_CLUSTER.
For example, for the team-budget project from 2017, this will tell AWS to pull integration/budget-service:latest from 845828040396.dkr.ecr.us-west-2.amazonaws.com and deploy it to hacko-integration-BudgetService-16MVULLFXXIDZ-Service-1BKKDDHBU8RU4 on the hacko-integration cluster.
Travis output
When everything is successful, the Travis build log will display something like the following at the end of the log:
$ ./budget_proj/bin/docker-push.sh
Getting the ECR login...
Flag --email has been deprecated, will be removed in 1.13.
Login Succeeded
Running docker push command...
The push refers to a repository [845828040396.dkr.ecr.us-west-2.amazonaws.com/integration/budget-service]
Running ecs-deploy.sh script...
Using image name: 845828040396.dkr.ecr.us-west-2.amazonaws.com/integration/budget-service:latest
Current task definition: arn:aws:ecs:us-west-2:845828040396:task-definition/budget-service:121
New task definition: arn:aws:ecs:us-west-2:845828040396:task-definition/budget-service:122
Service updated successfully, new task definition running.
.travis.yml configuration
The configuration-in-common for all of last year's API projects' .travis.yml is this:
sudo: required
services:
- docker
install:
- pip install --upgrade --user awscli
before_script:
- ./bin/getconfig.sh
script:
- './bin/test-proj.sh -t'
after_success:
- ./bin/docker-push.sh
(That is, except the emergency-response-backend, which somehow skipped the before_script step that runs getconfig.sh.)
Two of the projects went much further and embedded a bunch of extra, undocumented setup work (that hopefully we can avoid in this year's projects) in the Travis setup:
- housing-backend (https://github.com/hackoregon/housing-backend/blob/master/.travis.yml) added exclusions for a couple of long-lived branches, and implemented an installation of a specific version of docker-compose (which I suspect is no longer necessary, if we rely on Travis' current containerized build environment)
- transportation-backend (https://github.com/hackoregon/transportation-backend/blob/master/.travis.yml) includes a now-commented-out section doing a similar installation of docker-compose, and a commented-out call to build-test-proj.sh (probably because there were no tests to run in the Transportation-backend project)
I had the getconfig embedded into the other shell scripts on emergency response.
Comparing what you gave for team budget's .travis.yml to what we have currently in the exemplar, I can see three differences:
The script command line arguments are slightly different but seem to be semantically similar. They use -t and -l while our scripts use -p and -d. I believe our -p corresponds to their -t.
Our repo separates what they have in test-proj.sh into two script files, build.sh and test.sh.
Our repo does not contain two of the scripts: docker-push.sh (yet) or getconfig.sh (probably never will).
To achieve the same level of Travis behavior as e.g. team budget's backend, we could implement the following changes in the exemplar repo:
- change the script: section to call bin/build.sh -p and bin/test.sh -p
- drop getconfig.sh from the before_script: stanza
- add docker-push.sh to an after_success: step
Leaving us with a .travis.yml that looks like:
sudo: required
services:
- docker
install:
- pip install --upgrade --user awscli
script:
- ./bin/build.sh -p
- ./bin/test.sh -p
after_success:
- ./bin/docker-push.sh
Testing on the disaster-resilience-backend repo, Travis builds running with the config outlined in the post above seem to create the api_production docker image successfully, but one key problem is that the .env file is not there, so the `PRODUCTION` environment variables are not set, resulting in the following messages:
...
$ ./bin/build.sh -p
WARNING: The PRODUCTION_POSTGRES_USER variable is not set. Defaulting to a blank string.
WARNING: The PRODUCTION_POSTGRES_NAME variable is not set. Defaulting to a blank string.
WARNING: The PRODUCTION_POSTGRES_HOST variable is not set. Defaulting to a blank string.
WARNING: The PRODUCTION_POSTGRES_PORT variable is not set. Defaulting to a blank string.
WARNING: The PRODUCTION_POSTGRES_PASSWORD variable is not set. Defaulting to a blank string.
WARNING: The PRODUCTION_DJANGO_SECRET_KEY variable is not set. Defaulting to a blank string.
Building api_production
...
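One possible fix (just a sketch, assuming the compose file reads these values from a .env file at the repo root and the values are configured as Travis repository env vars) would be to generate the .env in a before_script step:
# Write the .env file docker-compose expects, from Travis-configured env vars,
# so the PRODUCTION_* values are available at build time.
cat > .env <<EOF
PRODUCTION_POSTGRES_USER=${PRODUCTION_POSTGRES_USER}
PRODUCTION_POSTGRES_NAME=${PRODUCTION_POSTGRES_NAME}
PRODUCTION_POSTGRES_HOST=${PRODUCTION_POSTGRES_HOST}
PRODUCTION_POSTGRES_PORT=${PRODUCTION_POSTGRES_PORT}
PRODUCTION_POSTGRES_PASSWORD=${PRODUCTION_POSTGRES_PASSWORD}
PRODUCTION_DJANGO_SECRET_KEY=${PRODUCTION_DJANGO_SECRET_KEY}
EOF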
Still to be sorted out:
- production-docker-entrypoint.sh to define the WSGI
- test.sh and .travis.yml to define the Docker Service named in production-docker-compose.yml that hosts the API
I think it may be valuable, moving forward, to take some time and document what we know about how each project was integrated with Travis.
Things to look at:
Somehow we got every project moved through the chain, so we should be able to point to some learnings.
I plan to take some time over the weekend to look into this.