tschaffter closed this issue 4 years ago.
Hi @kimyen, it would be great if we could work together to enable CI for this project with automatic deployment of the current build. We could start working on this together as soon as we have finished identifying the tasks for the pilot phase (from now to mid-June) and received the green light from Bruce that you could continue supporting this project. I'll create a JIRA ticket at that time for this feature.
We're not quite at the point of 'optimizing' our CI build yet, but I came across this article and thought it might come in handy later: https://testdriven.io/blog/faster-ci-builds-with-docker-cache/
@lukasz-rakoczy Hi Lukasz, could you provide some guidelines regarding how you would enable CI for this project? We can start with a high-level description and later decide how to implement it. Here is some information that may be useful:
The project contains a `.travis.yml` that was created when the project was initially generated using generator-angular-fullstack. The `.travis.yml` has not really been updated since the generation of the project, whose code has been deeply modified since then. Travis currently reports that tests fail on `npm run test`. Running `npm run test` completes successfully with the current version of the collaboration portal (I always run this command manually before making a git commit). Thanks!
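For reference, the Travis setup of a generator-angular-fullstack project is typically a minimal Node config of roughly this shape (a sketch, not this repo's actual file; the Node version is an assumption):

```yaml
language: node_js
node_js:
  - "10"
install:
  - npm install
script:
  # Same command that is run manually before each commit
  - npm run test
```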
@tschaffter Hi Thomas,
I think we have a couple of options here, and the approach should be driven by the requirements we want to satisfy. First of all, we should choose a tool that we want to use. In my opinion, we can consider continuing to use Travis CI or switching to Jenkins, which is already used in the other streams of the PHC initiative. The second thing to consider is the deployment model, which can be important when we want to integrate the tool into the deployment process. Deployment, however, can be treated as a further step, and we can focus on getting CI started for the portal.
Here are my thoughts on different aspects of the tools and processes:
Travis CI:
Jenkins:
General:
In my opinion we should start by bringing the Travis build back to life and then improve from there. If you could grant me edit rights to the repo, I could try to create a pull request fixing the Travis build.
Thanks @lukasz-rakoczy for your comprehensive feedback!
It seems that continuing to use Travis is the most suitable option.
Do we have a paid version of the service?
Yes
it seems that we should start from using a Chrome addon instead of installing custom apt packages
Correct. CI tests started failing when I decided to perform e2e tests using ChromeHeadless instead of PhantomJS. I soon found that the best way to use Chrome/ChromeHeadless in tests was to use puppeteer.
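For context, the Chrome-addon route in Travis usually looks like the sketch below (this is the generic pattern, not necessarily the exact change made to this repo's `.travis.yml`):

```yaml
addons:
  chrome: stable
script:
  # Run the test suites against headless Chrome instead of PhantomJS
  - npm run test -- --browsers ChromeHeadless
```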
I have fixed `.travis.yml` and CI tests run successfully now. I have also added the Travis status image to the `README.md`.
I wanted to experiment with this on a new branch in the repo but I don't have edit access to it. I saw that you forked the repository. That's the way to go; you can then open pull requests that I will review before merging.
What would be the steps that we should take if we want to have the collaboration portal deployed somewhere automatically after the tests successfully complete?
Hi Thomas,
If we want to keep things simple (by still using Docker Compose to run the app) it would be:
From here it gets a bit more complicated because there are different options:
Of course, things get more complicated if we want to take into account app versioning, multiple environments... Maybe it is better to start with smaller steps and try to create a pipeline that would deploy code from the develop branch onto a develop EC2 instance.
Please let me know if you want me to help you with that.
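To make the "keep things simple with Docker Compose" option concrete, the stack could be described roughly like this (service names, images, and ports are assumptions for illustration, not the project's actual compose file):

```yaml
version: "3"
services:
  portal:
    # Hypothetical image name for the collaboration portal build
    image: phc-collaboration-portal:latest
    ports:
      - "80:8080"
    depends_on:
      - mongo
  mongo:
    image: mongo:4
    volumes:
      - mongo-data:/data/db
volumes:
  mongo-data:
```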
@lukasz-rakoczy Can you take ownership of this task?
We now have a working Travis script. What we need now is a way to automatically deploy the collaboration portal environment when the different services of the collaboration portal are updated. Maybe on the same occasion you could propose a protocol that we would follow to update the portal environment when there is a production release (e.g. deployment of weekly updates while users are using the portal, backing up data, etc.).
@tschaffter Yes - you can assign this task to me.
Once #126 is fixed we can extend the Travis config to automatically build and push images after changes are pushed to the GitHub repo (it would be good to agree on some flow so we know which branches are used for the different deployment stages; I think that gitflow works fine - https://datasift.github.io/gitflow/IntroducingGitFlow.html).
Then I would try to set up AWS CodeDeploy so the new images can be pulled on EC2 instances and the app can run in the updated version.
Regarding the production release: I think that AWS CodeDeploy can also be used to automate steps like backups. However, this depends on what the production setup will look like (our own mongo instance or a managed AWS service? simple EC2s or Kubernetes?).
@lukasz-rakoczy @ychae Sage is still trying to re-enable Travis tests when commits are submitted to this repo. Sage is a non-profit, and Travis has confirmed that we don't need to pay anything to use the paid features.
@lukasz-rakoczy Meanwhile, we can still work on this task. Can you put together a plan of actions to enable auto-deployment of the collaboration portal and its dependencies (e.g. prov-service, etc.)?
@tschaffter @ychae I can confirm that Travis has started to build the project, so the licensing issue is fixed.
@tschaffter First of all it would be good to fix all failing tests so Travis can successfully finish the build.
According to what Kumar said at the meeting yesterday, we should focus on publishing Docker images for the components so they can be used when deploying the system. Kumar suggested that the final infrastructure for the system is not fixed yet and that going beyond publishing the images could be a waste of time.
I can't create branches in the original collaboration portal repo, so I forked it here: https://github.com/lukasz-rakoczy/PHCCollaborationPortal (you should have access). I modified the Travis file inside so it builds a Docker image with the development version of the component and publishes it to a Docker registry. This configuration requires 3 environment variables to be set in the Travis plan settings:
- IMAGE_NAME - name of the image to be used for the component (can be prefixed with the Docker registry address)
- REGISTRY_PASS - Docker registry password
- REGISTRY_USER - Docker registry username
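A build-and-push step driven by these variables typically has the following shape (a sketch of the generic pattern; the actual file lives in the fork linked above and may differ):

```yaml
services:
  - docker
script:
  - npm run test
deploy:
  provider: script
  # Build the image, authenticate, and push using the three env variables
  script: >-
    docker build -t "$IMAGE_NAME" . &&
    echo "$REGISTRY_PASS" | docker login -u "$REGISTRY_USER" --password-stdin &&
    docker push "$IMAGE_NAME"
  on:
    branch: develop
```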
For testing purposes I use my own Docker Hub credentials and repository. We need to find a proper place to store the component images. The registry should be private (so images can only be accessed by authorized people) and it should be accessible from Travis. Do you know if Sage has its own Docker Hub account (on a paid plan) or some other Docker registry accessible from the internet? If yes, then we could use it for storing the images. If not, we could also use AWS ECR or Google Container Registry, but in this case we need to set up these services (on either Sage or Roche accounts).
After we have the registry configured, we need to decide on how the images should be versioned. With the development branch it should not be a problem, but for "production" releases this can be more complicated (depending on the requirements).
The same approach can be used for the other system components (prov service...). Once all the images are in the registry they can be used for system deployments, but I think we need to discuss further steps with Kumar.
We also need to make sure that the current Dockerfiles are correct. By this I mean that they contain everything required to run the component but do not contain unnecessary artifacts that make the images large.
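One common way to keep such images small is a multi-stage build, sketched below for a Node app (stage names, base images, and paths are assumptions for illustration, not the project's actual Dockerfile):

```dockerfile
# Build stage: install dev dependencies and compile the app.
FROM node:10 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Runtime stage: copy only built artifacts and production dependencies,
# so build tools and caches never end up in the shipped image.
FROM node:10-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY package*.json ./
RUN npm install --production
CMD ["node", "dist/server/index.js"]
```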
@lukasz-rakoczy Sage does have a private Docker Hub account. @jaeddy can give you access to the registry that currently has the image so that you can continue to work on this.
@lukasz-rakoczy
First of all it would be good to fix all failing tests so Travis can successfully finish the build.
Please use the last commit that successfully passed the Travis tests. In the near future, I'll start pushing updates to the master branch, which we will configure to trigger the auto-deployment.
I'm currently swamped with other tasks for this project. Please let me know if you need me to perform a specific action.
@tschaffter
I could set up the CI/CD pipeline, but I need access to the AWS account from which the dev instance is running (to configure the required roles and users for AWS CodeDeploy and Travis), GitHub (to create a pull request with changes to travis.yml), and Docker Hub (to configure a repo for storing the images).
Currently, what is in the private branch I cloned pushes a build image to my private repo. This can easily be adjusted to the Sage Docker account, but to go further I need the permissions above, or you need to follow the convention and configure it on your own.
Fixing the test errors is not crucial for creating the pipeline (for testing purposes, test execution can be excluded from the pipeline).
@lukasz-rakoczy I did some clean up and fixed the unit tests. 68b2c991fa891c9df46e2ed994a09638771b4a44 passed the tests on Travis.
I could set up the CI/CD pipeline but I need to have access to AWS Account from which the dev instance is running (to configure required roles and users for AWS Codedeploy and Travis), Github (to create pull request with changes to travis.yml) and Docker hub (to configure repo for storing the images).
@lukasz-rakoczy Can you provide a description / diagram of the CI/CD pipeline first? I will then instantiate the resources required (EC2, ?).
Currently what is in the private branch I cloned is pushing a build image to my private repo but this can be easily adjusted to the Sage Docker account but to go further I need the permissions above or you need to follow the convention and configure it by your own.
What are the conventions?
Fixing test errors is not crucial to create the pipeline (for testing purposes test execution can be excluded from the pipeline).
Commit 68b2c991fa891c9df46e2ed994a09638771b4a44 is a relatively stable version. This version is currently deployed on http://test.phc.sagesandbox.org.
@tschaffter
Can you provide a description / diagram of the CI/CD pipeline first? I will then instantiate the resources required (EC2, ?).
I think that we can try one of the following two approaches to set this up:
Number 2 is a bit simpler (it does not require a Docker registry account), but number 1 is more flexible - when you have images in a registry you can reuse them for different deployments (automated but also manual).
To set this up we need:
What are the conventions?
Please look at this travis file: https://github.com/lukasz-rakoczy/PHCCollaborationPortal/blob/develop/.travis.yml
If you provide the following environment variables in your Travis plan settings:
- $REGISTRY_USER - Docker registry username
- $REGISTRY_PASS - Docker registry password
- $IMAGE_NAME - Docker image name (in my case it is "code4life/phc-cp" because my Docker Hub account name is code4life and I created a repo named phc-cp)

then every commit to the develop branch will push a new version of the Docker image to this repo, tagged with two tags: `:latest` and the Git commit hash.
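The tagging convention boils down to the following (a sketch: `IMAGE_NAME` matches the example in the comment, and the short hash is a placeholder for what Travis would supply via `${TRAVIS_COMMIT}`):

```shell
# Derive the two tags the pipeline pushes for each develop commit.
IMAGE_NAME="code4life/phc-cp"
GIT_COMMIT="abc1234"              # placeholder; Travis supplies ${TRAVIS_COMMIT}
LATEST_TAG="${IMAGE_NAME}:latest"
COMMIT_TAG="${IMAGE_NAME}:${GIT_COMMIT}"
echo "Pushing ${LATEST_TAG} and ${COMMIT_TAG}"
```

The commit-hash tag makes every build individually addressable, while `:latest` always points at the newest develop build.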
@lukasz-rakoczy
Let's go with number 1 because it's more flexible. We will be using Synapse as the Docker registry.
Synapse project: synapse.org/phc_collaboration_portal
Docker images pushed to this project must have the prefix `docker.synapse.org/syn18489221/`, for example `docker.synapse.org/syn18489221/phc-collaboration-portal` or `docker.synapse.org/syn18489221/prov-service`:

```
docker login -u <synapse username> -p <synapse password> docker.synapse.org
docker push docker.synapse.org/syn18489221/phc-collaboration-portal
```
I have created the Synapse user `phccp-autodeploy` and have given it write access to this Synapse project.
I have instantiated an EC2 instance to host the deployment agent (ec2-35-164-244-178.us-west-2.compute.amazonaws.com). There I have created two accounts: `phccp`, which should be used to set up the agent, and `lukasz`, to connect to the EC2. I'll send you an SSH private key shortly. I have also installed Docker. I have added `lukasz` to the groups `sudo` and `docker`, and the user `phccp` to the group `docker`.
When logged in as `phccp`, you can do `docker login docker.synapse.org` and push/pull images from there.
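A hypothetical cloud-init user-data file reproducing this manual setup could look like the sketch below (the user names come from the comment above; the Docker package name and key placeholder are assumptions):

```yaml
#cloud-config
users:
  - name: phccp
    groups: [docker]
  - name: lukasz
    groups: [sudo, docker]
    ssh_authorized_keys:
      - <lukasz's public key>
packages:
  - docker.io
```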
Also, I guess that we would like a CloudFormation script that automatically creates the AWS roles, instantiates the EC2, and sets it up. Is this something you would be able to develop?
Thanks!
@lukasz-rakoczy Will the autodeploy agent also detect when other services such as `prov-service` are updated on GitHub and trigger the restart of the stack?
Also, I propose to deploy based on the `develop` branch and keep `master` for thoroughly tested builds.
@lukasz-rakoczy I have created the GitHub user `phccp-autodeploy` and have given it READ access to https://github.com/Sage-Bionetworks/PHCCollaborationPortal. The idea is to use it on the EC2 to pull the script from GitHub to deploy the full stack. Let me know when you need its password/API key.
Hi @tschaffter
Unfortunately, I was not able to ssh into the EC2 machine with the key you provided me, but in any case I think there is more to be done to get the pipeline working.
With my own accounts (AWS, GitHub, Travis) I created a pipeline that works, and we can reuse its elements to automatically deploy the PHC-CP. Everything I have created is here: https://github.com/lukasz-rakoczy/codedeploy
The idea is to:
There are a couple of things we need to set up:
None of these steps are complicated, but it would be much easier if we could set up a call to go through them together, because this ticket-based communication takes too long. I think that in 1h we would have everything running. If you have some time today or tomorrow, please let me know (by email, or set up a meeting in my calendar). I can call in later (10-11 AM your time) to get this done.
Hi @lukasz-rakoczy I'll set up a time for a call for all of us to get this sorted. Thanks so much for all the details!
@lukasz-rakoczy @ychae I met with Sage IT team and we made good progress:
`AWSCodeDeployFullAccess`.

What remains to be done:
- Configure the deployment based on `develop` and `master` (we will ultimately have two machines, one serving the `master` build and another one serving the `develop` branch).
- Add the `ACCESS_KEY_ID` and `SECRET_ACCESS_KEY` values to the Travis settings.

Khai requested changes to the creation of the AWS user, which I accepted. However, this is preventing me from creating the CodeDeploy application in AWS. Meeting with Khai in ~30 min.
@lukasz-rakoczy We have created the following user and role to run the CodeDeploy application.
User:
```yaml
PhccpServiceUser:
  Type: 'AWS::IAM::User'
PhccpServiceUserAccessKey:
  Type: 'AWS::IAM::AccessKey'
  Properties:
    UserName: !Ref PhccpServiceUser
```
Role assumed by PhccpServiceUser:
```yaml
CodeDeployServiceRole:
  Type: "AWS::IAM::Role"
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: "Allow"
          Principal:
            AWS:
              - !GetAtt PhccpServiceUser.Arn
              - !GetAtt AWSIAMThomasSchaffterUser.Arn
          Action:
            - "sts:AssumeRole"
    Path: "/"
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/AWSCodeDeployFullAccess
```
The AWS CodeDeploy application has not yet been created because we haven't yet identified all the permissions required to do so. I may ask the Sage engineers to create the CodeDeploy app and deployment group using their elevated privileges, as we did for the instantiation of resources using the CloudFormation script https://github.com/Sage-Bionetworks/phccp-autodeploy/blob/master/cf/cf-stack.yml.
@lukasz-rakoczy Would you be able to write a CF script that does what you showed me manually on Thursday, that is, the creation of the CodeDeploy app and deployment group configured to work with the other resources that we have created?
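Such a script would contain resources of roughly this shape (a hedged sketch: the application name, group name, and tag values are assumptions, and `CodeDeployServiceRole` refers to the role shown earlier in this thread):

```yaml
PhccpCodeDeployApplication:
  Type: "AWS::CodeDeploy::Application"
  Properties:
    ApplicationName: phccp
PhccpDeploymentGroup:
  Type: "AWS::CodeDeploy::DeploymentGroup"
  Properties:
    ApplicationName: !Ref PhccpCodeDeployApplication
    DeploymentGroupName: phccp-develop
    ServiceRoleArn: !GetAtt CodeDeployServiceRole.Arn
    # Target the EC2 instance(s) by tag; the tag value is hypothetical.
    Ec2TagFilters:
      - Key: Name
        Value: phccp-develop
        Type: KEY_AND_VALUE
```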
Closed by mistake
Update: @lukasz-rakoczy and I just met.
Hi @tschaffter,
I've been able to make some progress with configuring the deployment stack.
Here https://github.com/lukasz-rakoczy/codedeploy/blob/master/cf/cf-deploy.yml you can find the complete AWS stack (IAM resources, EC2, CodeDeploy resources, S3), which allows us to automatically deploy the collaboration portal (including all its components). I'm not sure if you will be able to create the stack with your Sage AWS privileges, but on my account it is working fine.
I also updated: https://github.com/lukasz-rakoczy/codedeploy/blob/master/.travis.yml so deployment is divided into 2 steps:
I also implemented a little workaround that will allow us to deliver the Docker registry password to the CodeDeploy agents so they can log into the private Docker registry. It is not perfect, because the registry credentials are stored on S3 (private, and accessible only by authorized AWS users). A better solution would be to store the secret in AWS Secrets Manager, but I'm afraid we would have problems setting this up with your AWS privileges.
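On the agent side, the workaround boils down to commands of this shape (the bucket name, object key, and user are hypothetical placeholders, not the actual paths used):

```shell
# Fetch the registry password from the private S3 bucket, then log in.
# Bucket, key, and user name below are illustrative placeholders.
aws s3 cp s3://phccp-secrets/registry-password /tmp/registry-password
docker login -u phccp-autodeploy --password-stdin \
  docker.synapse.org < /tmp/registry-password
```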
I'm not sure if you were able to make any progress with the Sage engineers regarding your account privileges, but we can set up a meeting to move this issue forward.
We have recently achieved this. Further improvements will be tracked in separate tickets.
Notes from initial discussion with Kim
prod server: