ualbertalib / jupiter

Jupiter is a University of Alberta Libraries-based initiative to create a sustainable and extensible digital asset management system. This is phase 2 (Digitization).
https://era.library.ualberta.ca/
MIT License

[Spike] Investigate Jupiter on Kubernetes #2568

Open murny opened 2 years ago

murny commented 2 years ago

Kubernetes - https://kubernetes.io

Spike Goal

Replace previously manual VM management completely with declarative service definitions and automated orchestration.

Spike questions

More info: https://gist.github.com/mbarnett/711bb6e19907350661db573ff2227b01

murny commented 2 years ago

Spike questions and answers

Documentation and investigation into what Kubernetes is and its concepts. How can we train the rest of the developers to know enough about Kubernetes to be able to deploy and troubleshoot problems?

There are tons of tutorials online. A great place to start is the official docs and tutorials from https://kubernetes.io (e.g. https://kubernetes.io/docs/home/).

Since we are also using Azure, Microsoft has a ton of tutorials and short courses you can take for free, like this one: https://docs.microsoft.com/en-us/learn/modules/aks-deploy-container-app/

What does Jupiter need to do in order to run in Kubernetes?

Jupiter itself doesn't need to change much. It would be a good idea to provide health checks for Rails and Sidekiq, as this greatly improves how Kubernetes operates (for example, you don't want your containers to start receiving requests before they have finished initializing). Otherwise the application can stay largely as it is.
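To make the health check idea concrete, here is a minimal sketch that adds an HTTP readiness probe to the app deployment with kubectl patch. The deployment and container name (jupiter-app) come from the commands later in this thread; the /healthcheck path, port 3000, the timing values, and the function name are assumptions for illustration only. In practice the probe would more likely live in the Deployment manifest itself, and Sidekiq would need its own (non-HTTP) check, which is not covered here.

```shell
#!/usr/bin/env bash
# Sketch only: add a readiness probe so Kubernetes holds traffic back
# until Rails answers on the health endpoint.
# ASSUMPTIONS: the /healthcheck path, port 3000, and the timings are hypothetical.

add_readiness_probe() {
  local namespace="${1:-jupiter}"
  local probe_path="${2:-/healthcheck}"  # hypothetical Rails health endpoint
  local probe_port="${3:-3000}"          # assumed Rails port
  local patch

  # Strategic merge patch: the containers list is merged by container name.
  patch=$(printf '{"spec":{"template":{"spec":{"containers":[{"name":"jupiter-app","readinessProbe":{"httpGet":{"path":"%s","port":%s},"initialDelaySeconds":10,"periodSeconds":5}}]}}}}' \
    "${probe_path}" "${probe_port}")

  kubectl patch deployment jupiter-app -n "${namespace}" --type strategic -p "${patch}"
}
```

Calling add_readiness_probe with no arguments targets the jupiter namespace with the assumed defaults.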

Does it need a new Dockerfile? For production, yes. We will need to extend our Dockerfile with some additions (FITS, FFmpeg, ClamAV, and any other features we currently have in production).

What would a working Kubernetes Manifest look like for our stack?

We will essentially have 4 or so manifests:

What does Jupiter on Kubernetes look like on Azure? Basically there are three areas our team will be keeping tabs on when we are living in the cloud:

What do deployments look like from the cloud? Any developer can run a kubectl command to do a zero-downtime rolling deployment.

Most likely we will have a GitHub Action that builds a new image of our containers any time new code hits our master branch (I believe we currently do this for UAT already, but you can also look at my demo for an example).

Then when a developer wants to deploy, they can simply run:

kubectl rollout restart deployment/jupiter-app -n jupiter
kubectl rollout restart deployment/jupiter-worker -n jupiter

This creates new pods for both our app and our workers, each starting a new container from the latest image we have built (this assumes the deployment references a mutable tag such as latest, so the image is pulled again when the pod starts). Traffic then moves over to the new pods and the old pods are removed, all without any downtime for our users.
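The two restart commands can be wrapped so that a deploy also waits for each rollout to finish, which is what a CI job would want before reporting success. This is a rough sketch, not a tested deploy script: the deployment and namespace names are the ones used above, while the function name and the 300s timeout are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: restart both deployments and wait for each rollout to complete.
# ASSUMPTIONS: the function name and the default timeout are hypothetical.

deploy_latest() {
  local namespace="${1:-jupiter}"
  local timeout="${2:-300s}"
  local deployment

  for deployment in jupiter-app jupiter-worker; do
    kubectl rollout restart "deployment/${deployment}" -n "${namespace}" || return 1
    # Blocks until the new pods are ready; fails if the timeout is exceeded.
    kubectl rollout status "deployment/${deployment}" -n "${namespace}" \
      --timeout="${timeout}" || return 1
  done
}
```

Running deploy_latest with no arguments targets the jupiter namespace. A non-zero exit status means a rollout did not complete in time, at which point kubectl rollout undo is the natural next step.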

This could easily be automated as well via a GitHub Action.

If you wanted to deploy a specific version of an image (say we still cut releases) you can set the new image version like so:

# Rolling update of the "jupiter-app" and "jupiter-worker" deployments, setting a specific image version
kubectl set image deployment/jupiter-app jupiter-app=jupiter:v2.0.2
kubectl set image deployment/jupiter-worker jupiter-worker=jupiter:v2.0.2

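# I think these can be combined: "*" matches every container in the named deployments.
# Quote the pattern so the shell does not try to expand it. Needs verification.
kubectl set image deployment/jupiter-app deployment/jupiter-worker '*=jupiter:v2.0.2'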

To roll back a deployment:

# Check the history of deployments including the revision 
kubectl rollout history deployment/jupiter-app           

# Rollback to the previous deployment
kubectl rollout undo deployment/jupiter-app                     

# OR
# Rollback to a specific revision
# kubectl rollout undo deployment/jupiter-app --to-revision=2        

# Watch rolling update status of "jupiter-app" deployment until completion
kubectl rollout status -w deployment/jupiter-app

pgwillia commented 2 years ago

You mentioned Helm here and in the Terraform spike answers. What is it?

How would you spin up a UAT environment vs a production one? What kind of safeguards should we think about building into production deployments? -- Maybe that's a Terraform spike question?

Is there an easy way to have production-like or equivalent data when creating a UAT environment?

murny commented 2 years ago

Thanks for the followup questions!

You mentioned Helm here and in Terraform Spike answers. What is it?

Helm is essentially a package manager (think Bundler/Yarn, but for Kubernetes). It's a way to use community-made Kubernetes packages (called charts) without us having to reinvent the wheel. For NGINX Ingress, we are using the Helm chart (https://artifacthub.io/packages/helm/ingress-nginx/ingress-nginx). For quickly getting a Solr pod up and running for Jupiter, we may want to entertain a Helm chart as well (then we can look into ZooKeeper or doing it ourselves down the road).
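For a feel of what using a chart looks like, here is a small sketch that installs the ingress-nginx chart mentioned above. The helm upgrade --install form follows the chart's published quick start. The function name and the default release and namespace name (ingress-nginx) are assumptions for illustration, not necessarily what our cluster uses.

```shell
#!/usr/bin/env bash
# Sketch only: install (or upgrade) the community ingress-nginx chart.
# ASSUMPTIONS: the function name and the default release/namespace are hypothetical.

install_ingress() {
  local release="${1:-ingress-nginx}"
  local namespace="${2:-ingress-nginx}"

  # "upgrade --install" is idempotent: it installs on the first run
  # and upgrades the existing release on later runs.
  helm upgrade --install "${release}" ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace "${namespace}" --create-namespace
}
```

Because the command is idempotent, the same call could run on every provisioning pass without first checking whether the release already exists.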

How would you spin up a UAT environment vs a production one? What kind of safeguards should we think about building into production deployments? -- Maybe that's a Terraform spike question?

Yeah, this is probably more Terraform-related, but I'll answer it here. I think we essentially treat "UAT" as a staging environment. It should mirror production as closely as it can. Terraform can take in variables, and the provisioning steps can change depending on their values. So instead of the highest-tier Postgres DB that Azure offers, for UAT/staging you may want to opt for a smaller-tier Postgres DB, which is easily configurable. For the most part, though, the environments should be the same or as close to the same as possible. This will give us confidence when we go to production. Anything that changes in production should be done via Terraform, so that staging/UAT or any new environment gets those changes for free. Everything should be automated, with no more manual tasks for provisioning the different servers.
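One common way to feed per-environment variables into Terraform is a variable file per environment, passed with -var-file. The sketch below assumes that layout; the environments/ directory, the file names, and the function name are hypothetical, and the actual variables (such as the Postgres tier) would be defined in our Terraform configuration.

```shell
#!/usr/bin/env bash
# Sketch only: apply the same Terraform configuration with per-environment variables.
# ASSUMPTIONS: environments/<name>.tfvars files and the function name are hypothetical.

provision_environment() {
  local environment="$1"

  # Refuse anything that is not a known environment, as a basic safeguard.
  case "${environment}" in
    uat|production) ;;
    *)
      echo "unknown environment: ${environment}" >&2
      return 1
      ;;
  esac

  terraform apply -var-file="environments/${environment}.tfvars"
}
```

With this layout, provision_environment uat and provision_environment production differ only in the values inside the two variable files, which keeps the environments close to identical by construction.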

Is there an easy way to have production-like or equivalent data when creating a UAT environment?

Could be where our seed data comes in? You always have access to the pods to run commands like so:

kubectl exec jupiter-app-df869766d-nf4xj -n jupiter --kubeconfig kubeconfig -- bin/rails db:seed
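Pod names like the one above change with every rollout, so a script would more likely address the deployment and let kubectl pick one of its pods. A minimal sketch, where the function name is an assumption and db:seed is the standard Rails seeding task:

```shell
#!/usr/bin/env bash
# Sketch only: run the Rails seeds in whichever pod kubectl selects
# from the jupiter-app deployment.
# ASSUMPTION: the function name is hypothetical.

seed_environment() {
  local namespace="${1:-jupiter}"

  # "deployment/NAME" avoids hard-coding a generated pod name.
  kubectl exec deployment/jupiter-app -n "${namespace}" -- bin/rails db:seed
}
```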

Or we could look into a way to take a nightly backup dump from Azure (as production should be backing up regularly) and import that into the staging/UAT database?
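As a rough illustration of the dump-and-import idea, the sketch below uses the standard pg_dump and pg_restore tools rather than any Azure-specific backup feature, which is a substitution on my part. The function name and the temp file prefix are hypothetical, and both connection strings are supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: copy one Postgres database into another via a temporary dump file.
# ASSUMPTIONS: the function name and temp file prefix are hypothetical;
# uses plain pg_dump/pg_restore, not an Azure backup feature.

copy_database() {
  if [ "$#" -ne 2 ]; then
    echo "usage: copy_database SOURCE_URL TARGET_URL" >&2
    return 1
  fi

  local source_url="$1"
  local target_url="$2"
  local dump_file
  local status

  dump_file="$(mktemp "${TMPDIR:-/tmp}/jupiter-dump.XXXXXX")" || return 1

  # Custom format is compressed and is the format pg_restore expects.
  if ! pg_dump --format=custom --no-owner --file="${dump_file}" "${source_url}"; then
    rm -f "${dump_file}"
    return 1
  fi

  # --clean --if-exists drops existing objects in the target before recreating them.
  pg_restore --clean --if-exists --no-owner --dbname="${target_url}" "${dump_file}"
  status=$?

  rm -f "${dump_file}"
  return "${status}"
}
```

A nightly job could call copy_database with the production and UAT connection strings. Note that the restore replaces existing objects in the target, so the target must never point at production.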

jefferya commented 2 years ago

Helm

Would a tour of a Helm chart, and how to use it as a templating tool for k8s deployment manifests, be useful? CWRC is using Helm as a means to generalize a k8s deployment manifest for production, staging, and UAT/review environments within a CI/CD pipeline.

Is there an easy way to have production-like or equivalent data when creating a UAT environment?

Aside from seed data, another idea coming from a similar use-case in my CWRC position: