wfau / gaia-dmp

Gaia data analysis platform
GNU General Public License v3.0

Proposals for CI platform #33

Open Zarquan opened 4 years ago

Zarquan commented 4 years ago

We need a Continuous Integration system that runs automated tests each time we create a new deployment. This is the first step along that route, to propose solutions and describe how they would work.

Given we are putting our code into GitHub, the obvious one is the GitHub Actions CI system, but there is a whole range of others to choose from as well.

The question is: how do we integrate the CI system with a deployment on an external OpenStack system that isn't part of the GitHub universe?

stvoutsin commented 4 years ago

I've had a go at creating a CI system for the project here: https://github.com/stvoutsin/zeppelinTo

Notes on how to set up a workflow can be seen here: https://github.com/stvoutsin/aglais/commit/11e42b384017006ac29270ce54a2ddac9e5df0f6

And results of the unit tests which are run on each commit can be found here: https://github.com/stvoutsin/zeppelinTo/actions

Setting up a CI system for deploying a Spark/K8s cluster will obviously be a bit more complicated. We will have to think about what our unit tests are, how we deploy the cluster in an automated way (Ansible, Dockerfiles etc.) and how we trigger the unit tests to run something like Spark jobs on the cluster we create.
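As a rough illustration of the deploy, test and clean-up cycle described above, here is a minimal Python sketch. The function name `run_ci_cycle` and the three callables (`deploy`, `tests`, `teardown`) are hypothetical placeholders, not part of any existing tool in this project; in practice they might wrap Ansible playbooks or calls to the cluster, but that is an assumption.

```python
from typing import Callable, List


def run_ci_cycle(
    deploy: Callable[[], str],
    tests: List[Callable[[str], bool]],
    teardown: Callable[[str], None],
) -> bool:
    """Deploy a cluster, run every test against it, then always tear it down.

    All three arguments are hypothetical hooks supplied by the caller:
    deploy() returns an identifier for the new cluster, each test takes
    that identifier and returns True on success, and teardown() releases
    the resources.
    """
    cluster_id = deploy()
    try:
        # Run every test (no short-circuit) so one run reports all failures.
        results = [test(cluster_id) for test in tests]
        return all(results)
    finally:
        # Release the cloud resources even if a test raised an exception.
        teardown(cluster_id)


if __name__ == "__main__":
    # Stub hooks standing in for a real deployment.
    removed = []
    passed = run_ci_cycle(
        deploy=lambda: "test-cluster",
        tests=[lambda cid: True],
        teardown=removed.append,
    )
    print(passed, removed)
```

The `try`/`finally` is the main design point: a failed or crashing test should not leave a cluster running on the OpenStack system.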

Zarquan commented 4 years ago

> Setting up a CI system for deploying a Spark/K8s cluster will obviously be a bit more complicated. We will have to think about what our unit tests are, how we deploy the cluster in an automated way (Ansible, Dockerfiles etc.) and how we trigger the unit tests to run something like Spark jobs on the cluster we create.

Tools like ZeppelinTo will be a useful component in the test infrastructure.

If all of the components (OpenStack, Magnum, Kubernetes, HDFS, YARN, MapReduce, Spark and Zeppelin) have either command line clients or REST endpoints, then we can gradually work towards an automated test system by adding tests to each layer as we build it.
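One way to picture the layer-by-layer approach is the Python sketch below. It checks each layer from the bottom up and reports the first one that fails. The function `first_failing_layer` and the stub checks are hypothetical; a real check might call a layer's command line client or REST endpoint, which is not shown here.

```python
from typing import Callable, List, Optional, Tuple

# A layer is a name plus a zero-argument health check (hypothetical hook).
Layer = Tuple[str, Callable[[], bool]]


def first_failing_layer(layers: List[Layer]) -> Optional[str]:
    """Check layers in order; return the first failing name, or None."""
    for name, check in layers:
        try:
            healthy = check()
        except Exception:
            # A check that raises is treated the same as one that fails.
            healthy = False
        if not healthy:
            # Stop here: layers above a broken one cannot be trusted.
            return name
    return None


if __name__ == "__main__":
    # Stub checks standing in for real client or REST calls.
    layers = [
        ("OpenStack", lambda: True),
        ("Kubernetes", lambda: True),
        ("Spark", lambda: False),
        ("Zeppelin", lambda: True),
    ]
    print(first_failing_layer(layers))  # prints "Spark"
```

Ordering the list from the infrastructure upwards means a failure report points at the lowest broken layer, and new layers can be appended as they are built.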

It might be that the best way of developing the system is as a set of smaller components developed as separate GitHub ~projects~ repositories, each of which has its own test framework and unit tests. We then put them together to deploy a full system, which then has another set of integration tests. See #34

Zarquan commented 4 years ago

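We can link a self-hosted test runner to the GitHub triggers, but there are security issues with doing that. In particular, GitHub's documentation warns that on a public repository a pull request from a fork could run untrusted code on the runner machine. https://help.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners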