felvin-search / docs

Docs at Felvin
https://docs.felvin.com

Create a document on how to deploy the felvin.com website #35

Closed hargup closed 2 years ago

hargup commented 2 years ago

Create a step-by-step how-to guide on how our deployment process works. Ideally it should be written as a checklist that someone can follow to take the felvin.com code and deploy it to a new subdomain (say beta.felvin.com) on a new K8s cluster.

Test that the document is working:

hargup commented 2 years ago

@OrkoHunter how's work going on for this?

OrkoHunter commented 2 years ago

Ongoing - should have a doc in a couple of hours

OrkoHunter commented 2 years ago

Tutorial

NOTE: If you are fairly new to Kubernetes, expect to spend at least 1-2 days learning the new concepts, experimenting with a cluster, and gaining confidence.

Goal: Deploy a new felvin.com website using Kubernetes on AWS

0. Prerequisite

Kubernetes.io has a nice tutorial which uses a minikube cluster (kubernetes cluster on your laptop) and teaches the basic concepts needed for our purposes. See https://kubernetes.io/docs/tutorials/kubernetes-basics

After the tutorial, you should be able to understand the following Kubernetes resources

1. Create a Kubernetes cluster

NOTE: To avoid polluting VPC networks (as they have a max quota per region), we are going to use the ap-south-1 region.

Follow AWS’s tutorial to create a new Kubernetes cluster on EKS. EKS is an AWS service that provides what they call a “managed Kubernetes cluster”. Google Cloud and other providers offer the same. See what is EKS.

Steps: https://docs.aws.amazon.com/eks/latest/userguide/getting-started-eksctl.html

NOTE: Use managed nodes instead of Fargate. Managed nodes run on actual EC2 instances. Fargate is serverless and does not give us all the control we would ideally have, e.g. SSHing into a node and looking around.

NOTE: Make sure your laptop has the right aws credentials. Install the aws cli https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html and run aws configure https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html .

Creating the cluster will take a while (~20 minutes). Once it is done, you can see your new cluster in the EKS dashboard. https://console.aws.amazon.com/eks/home

At this point you should be able to see all the system pods running in the cluster. Pods like coredns provide DNS for internal service discovery, aws-node runs the Amazon VPC CNI plugin that gives pods their network addresses, etc. (The IAM users and roles authorized to access the cluster are stored separately, in the aws-auth ConfigMap.)
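The cluster creation described in this step can be sketched as a single eksctl call. This is a minimal sketch, not the exact command used for felvin.com: the cluster name, node group name, instance type and node count below are placeholder assumptions. The functions are only defined here, so nothing is created until you call them.

```shell
# Sketch only: every value below is a placeholder assumption.
CLUSTER_NAME="felvin-beta"      # hypothetical cluster name
REGION="ap-south-1"             # region chosen in the note above
NODEGROUP_NAME="felvin-nodes"   # hypothetical node group name
NODE_TYPE="t3.medium"           # assumed instance size
NODE_COUNT=2                    # assumed number of nodes

create_cluster() {
  # --managed asks eksctl for an EKS managed node group (EC2), not Fargate.
  eksctl create cluster \
    --name "$CLUSTER_NAME" \
    --region "$REGION" \
    --nodegroup-name "$NODEGROUP_NAME" \
    --node-type "$NODE_TYPE" \
    --nodes "$NODE_COUNT" \
    --managed
}

show_system_pods() {
  # Lists the system pods (coredns, aws-node, kube-proxy, ...) once the cluster is up.
  kubectl get pods --namespace kube-system
}
```

Run create_cluster once, wait for it to finish, then run show_system_pods to confirm that kubectl is talking to the new cluster.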

2. Create an ECR repository (to store docker images)

We need an internal repository where we can store our Docker images. We will then use this image to create a kubernetes Deployment.

Go to https://ap-south-1.console.aws.amazon.com/ecr/repositories?region=ap-south-1 and create a new private repository.

Note the URI of the repository, e.g. <account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repository_name>. We’ll need it later.

3. Create and push the image

Note: Make sure you have Docker running.

Clone the https://github.com/felvin-search/felvin.com repository and run yarn install. Now create a docker image by running

docker build -t <account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repository_name>:latest -f docker/Dockerfile .

Now authenticate Docker against the ECR registry using the aws command described at https://docs.aws.amazon.com/AmazonECR/latest/userguide/getting-started-cli.html#cli-authenticate-registry

Let’s push the image! Run the command below, then check that you can see the image in the repository using the AWS console.

docker push <account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repository_name>:latest
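Put together, the authenticate, build and push steps might look like the sketch below. The account id and repository name are made-up placeholders; the login pipeline is the one from the AWS page linked above. The functions are defined but not called, so the block only prints the image name it would use.

```shell
ACCOUNT_ID="123456789012"   # placeholder AWS account id
REGION="ap-south-1"
REPOSITORY="felvin-web"     # hypothetical ECR repository name

REGISTRY="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
IMAGE="${REGISTRY}/${REPOSITORY}:latest"

ecr_login() {
  # Fetch a temporary password from ECR and hand it to docker login.
  aws ecr get-login-password --region "$REGION" |
    docker login --username AWS --password-stdin "$REGISTRY"
}

build_and_push() {
  # Run from the root of the felvin.com checkout, after yarn install.
  docker build -t "$IMAGE" -f docker/Dockerfile . &&
    docker push "$IMAGE"
}

echo "$IMAGE"
```

With real values filled in, run ecr_login followed by build_and_push. The ECR login expires after a while, so rerun ecr_login if a later push is rejected.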

4. Create required Secrets object for environment variables

Before we create the deployment, we need to ensure that the environment variables the pods need are ready.

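Follow the steps at https://developers.google.com/custom-search/v1/introduction to get a Google search API key and a context key. The context key is the search engine ID, i.e. the cx parameter; it also appears in the URL of your search engine in the Google search console. Once you have them, run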

kubectl create secret generic google-keys --from-literal=GOOGLE_SEARCH_API_KEY=XXXX --from-literal=GOOGLE_SEARCH_CONTEXT_KEY=XXXX

Note: Make sure to toggle "search the whole web" from the Google search console app settings to enable searching the whole web instead of custom websites.
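One thing worth knowing about the Secret created in this step: Kubernetes stores the values base64-encoded, not encrypted, so anyone who can read the Secret can recover the keys. The sketch below shows the encoding on the same XXXX placeholder; the helper name show_secret_value is made up for illustration and is only defined, not called, because it needs access to the cluster.

```shell
# base64 is an encoding, not encryption: this is how the placeholder value
# XXXX appears inside the Secret's .data field.
ENCODED="$(printf '%s' 'XXXX' | base64)"
echo "$ENCODED"    # WFhYWA==

# Hypothetical helper: read one key back from the google-keys Secret.
# $1 is the key name, e.g. GOOGLE_SEARCH_API_KEY.
show_secret_value() {
  kubectl get secret google-keys -o "jsonpath={.data.$1}" | base64 --decode
}
```

For example, show_secret_value GOOGLE_SEARCH_CONTEXT_KEY should print the cx value you passed to kubectl create secret, which is a quick way to catch a typo before the pods start.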

5. Create a deployment

Now let’s create a kubernetes deployment using the docker image we pushed.

Note: This is the first time we are interacting with the cluster manually. The two main ways of doing anything are kubectl <command chain> and kubectl apply -f <a yaml file>. Using a YAML file is good for creating things declaratively, so that they can be shared later.

Update the image url in the deploy/deployment-production.yaml file with <account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repository_name>:latest.

Now run

kubectl apply -f deploy/deployment-production.yaml

Protip - Run k9s in a separate terminal and see the new pods getting created live.

In a few minutes, the new pods corresponding to the deployment should be up and running.

Protip: Use kubectl get events --sort-by='.metadata.creationTimestamp' -A to see events from the whole cluster, sorted by creation time.

Note: If there are any errors, you can also delete the deployment using kubectl delete -f deploy/deployment-production.yaml
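If the pods do not come up (for example, they sit in CrashLoopBackOff), a few standard kubectl commands usually narrow it down. This is a generic sketch rather than a Felvin-specific procedure; the function names are made up, and the deployment name is the one used elsewhere in this guide.

```shell
DEPLOYMENT="felvin-search-prod"   # deployment name used in this guide

wait_for_rollout() {
  # Blocks until all pods of the deployment are ready, or fails after 5 minutes.
  kubectl rollout status "deployment/${DEPLOYMENT}" --timeout=300s
}

debug_pod() {
  # $1 is a pod name taken from: kubectl get pods
  kubectl describe pod "$1"       # events: image pull errors, missing secrets, ...
  kubectl logs "$1" --previous    # output of the container that crashed last
}
```

A typical sequence is wait_for_rollout, then kubectl get pods to pick a failing pod, then debug_pod with that pod's name.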

A few notes on the YAML file

Protip - We can get any existing Kubernetes resource (e.g. a deployment) as a YAML file as well (or as JSON; Kubernetes has a REST API for interacting with the cluster). Try running kubectl get deployment/felvin-search-prod -o yaml
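To give a feel for what such a Deployment manifest contains, here is an illustrative sketch. It is not the contents of deploy/deployment-production.yaml, which may differ: the label, the container name, and the way the Secret is wired in are assumptions. Only the deployment name, the two replicas, container port 80, and the google-keys Secret come from this guide. The block writes to a scratch file so it cannot overwrite anything in the repository.

```shell
# Writes an ILLUSTRATIVE manifest to a scratch file; the real
# deploy/deployment-production.yaml in the repo may look different.
SKETCH_FILE="$(mktemp)"
cat > "$SKETCH_FILE" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: felvin-search-prod
spec:
  replicas: 2                       # two pods, as used in the port-forward step
  selector:
    matchLabels:
      app: felvin-search-prod       # label is an assumption
  template:
    metadata:
      labels:
        app: felvin-search-prod
    spec:
      containers:
        - name: web                 # container name is an assumption
          image: <account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repository_name>:latest
          ports:
            - containerPort: 80
          envFrom:
            - secretRef:
                name: google-keys   # the Secret created in step 4
EOF
echo "sketch written to $SKETCH_FILE"
```

The envFrom entry is one common way to expose every key of a Secret as an environment variable. Comparing this sketch with the output of kubectl get deployment/felvin-search-prod -o yaml shows which fields Kubernetes fills in by itself.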

6. Port forward to a pod

If you see the new pods for the deployment and their status as “Running” in the k9s window, let’s see if the pods are really ready!

On k9s, use shift+f to port forward to one of the two pods we created. The container port is 80 and you should use 3000 on your localhost (to avoid CORS errors).

Now if you open http://localhost:3000 you should be able to see the site up and running!

Protip: You can look at the logs from both the pods by using kubectl logs -f deployment/felvin-search-prod
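If you prefer plain kubectl over k9s, the same port forward can be sketched as below. The function name and the 3 second wait are arbitrary choices for illustration; the ports are the ones from this step.

```shell
LOCAL_PORT=3000
CONTAINER_PORT=80
FORWARD_SPEC="${LOCAL_PORT}:${CONTAINER_PORT}"   # kubectl expects local:remote

forward_and_check() {
  # Forward localhost:3000 to port 80 of a pod in the deployment (runs in background).
  kubectl port-forward deployment/felvin-search-prod "$FORWARD_SPEC" &
  PF_PID=$!
  sleep 3   # give the tunnel a moment to come up
  # Print only the HTTP status code returned by the site.
  curl -sS -o /dev/null -w '%{http_code}\n' "http://localhost:${LOCAL_PORT}"
  kill "$PF_PID"
}
```

A 200 from forward_and_check suggests the container is serving traffic; to browse the site yourself, run the kubectl port-forward line on its own and leave it running.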

7. Create a service

We will now create a Kubernetes service to start accepting traffic into our deployment pods. Kubernetes has a nice expose command to create a service which exposes a deployment. Run

kubectl expose deployment felvin-search-prod --type=LoadBalancer --name=felvin-search

This is going to create a service of type LoadBalancer called felvin-search which will be able to accept traffic and distribute it to the pods of our felvin-search-prod deployment.

Now run kubectl get service/felvin-search -o wide and you should be able to see a public hostname in the EXTERNAL-IP column.

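The AWS load balancer backing the service can also be seen in the AWS console (EC2 > Load Balancers). Note that a Service of type LoadBalancer on EKS typically gets a Classic or Network Load Balancer; an ALB is what you get from an Ingress managed by the AWS Load Balancer Controller.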

Protip: As always, you can also get the full YAML file for this service for future reference kubectl get service/felvin-search -o yaml
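For scripting, the load balancer hostname can be pulled out of the service directly instead of being read off the table. A small sketch, with a made-up function name:

```shell
SERVICE="felvin-search"   # service name created by kubectl expose above

lb_hostname() {
  # On AWS the load balancer is reported as a hostname, not an IP address.
  kubectl get service "$SERVICE" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}
```

lb_hostname prints nothing until AWS has finished provisioning the load balancer, so an empty result right after kubectl expose usually just means you need to wait a minute and try again.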

8. Set a CNAME record pointing to the host

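If you want the site to be visible at a subdomain such as beta.felvin.com, set a CNAME record for it pointing to the load balancer hostname shown in the EXTERNAL-IP column of the service. Note that the apex domain felvin.com itself cannot carry a CNAME record; for the apex, use your DNS provider's alias feature (e.g. a Route 53 alias record) instead.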
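Once the record is saved, you can check from your laptop that it resolves to the load balancer. This sketch assumes dig is installed; the function names are made up, and the expected target is passed in by hand.

```shell
SITE_DOMAIN="beta.felvin.com"   # subdomain from the goal of this guide

strip_trailing_dot() {
  # dig prints fully qualified names with a trailing dot; remove it.
  sed 's/\.$//'
}

cname_target() {
  # Prints the CNAME target of the domain given as $1 (empty if none is set).
  dig +short "$1" CNAME | strip_trailing_dot
}

check_cname() {
  # $1 is the expected load balancer hostname from the service's EXTERNAL-IP.
  [ "$(cname_target "$SITE_DOMAIN")" = "$1" ]
}

printf 'abc.elb.amazonaws.com.\n' | strip_trailing_dot   # abc.elb.amazonaws.com
```

check_cname followed by the load balancer hostname returns success once DNS has caught up; new records can take a few minutes to become visible.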

9. Delete everything

Let’s clean up everything!
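A possible teardown order is sketched below. It simply mirrors the steps of this guide in reverse and is an assumption, not a verified Felvin procedure; the cluster and repository names are placeholders. The function is only defined here because everything in it is destructive.

```shell
CLUSTER_NAME="felvin-beta"    # hypothetical; use the name you created
REGION="ap-south-1"
REPOSITORY="felvin-web"       # hypothetical ECR repository name

cleanup() {
  # Delete the Service first so that AWS removes the load balancer it created.
  kubectl delete service felvin-search
  kubectl delete -f deploy/deployment-production.yaml
  kubectl delete secret google-keys
  # Removes the cluster together with its node group; takes several minutes.
  eksctl delete cluster --name "$CLUSTER_NAME" --region "$REGION"
  # --force also deletes the images stored in the repository.
  aws ecr delete-repository --repository-name "$REPOSITORY" --region "$REGION" --force
}
```

Remember to also remove the CNAME record from step 8, since it would otherwise keep pointing at a load balancer that no longer exists.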

FAQs

What happens when a new PR is merged or a new commit is pushed in felvin.com?

How to configure what EC2 instance size is used for the cluster nodes and how many of them?

AWS creates a “Node group” to manage the type and number of nodes used for the Kubernetes cluster. If you open the cluster page on AWS, https://console.aws.amazon.com/eks/home?region=us-east-1#/clusters/felvin-cluster (adjust the region and cluster name in the URL to match your cluster), you can see the node group. If you open the node group and click “Edit”, you can see the minimum and maximum number of nodes set in the config. These values bound how far the group can scale; note that nodes are only added automatically when an autoscaler such as the Kubernetes Cluster Autoscaler is running in the cluster.
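The same settings can be inspected and changed from the command line with eksctl. A minimal sketch; the cluster name, node group name and the node counts are placeholder assumptions:

```shell
CLUSTER_NAME="felvin-beta"      # hypothetical cluster name
NODEGROUP_NAME="felvin-nodes"   # hypothetical node group name
REGION="ap-south-1"

show_nodegroups() {
  # Shows instance type plus min, max and desired size for each node group.
  eksctl get nodegroup --cluster "$CLUSTER_NAME" --region "$REGION"
}

resize_nodegroup() {
  # Example sizes only: 3 nodes now, allowed to range between 2 and 4.
  eksctl scale nodegroup \
    --cluster "$CLUSTER_NAME" \
    --region "$REGION" \
    --name "$NODEGROUP_NAME" \
    --nodes 3 --nodes-min 2 --nodes-max 4
}
```

This changes the number of nodes only. Switching to a different EC2 instance size generally means creating a new node group and moving the workloads over.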

How to grant cluster access to more people so that they can run kubectl commands?

Update the names and/or roles in https://github.com/felvin-search/felvin.com/blob/master/deploy/aws-auth-kube-system.yaml and run kubectl apply -f deploy/aws-auth-kube-system.yaml
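Before and after editing that file, it can help to look at what the cluster currently has, and the new person needs a kubeconfig entry on their own machine. A sketch under assumptions: the cluster name is a placeholder and the function names are made up.

```shell
CLUSTER_NAME="felvin-beta"   # hypothetical cluster name
REGION="ap-south-1"

show_access_config() {
  # The aws-auth ConfigMap maps IAM users and roles to Kubernetes users and groups.
  kubectl get configmap aws-auth --namespace kube-system -o yaml
}

setup_kubeconfig() {
  # Run by the newly added person, using their own AWS credentials.
  aws eks update-kubeconfig --region "$REGION" --name "$CLUSTER_NAME"
}

check_access() {
  # Prints yes or no depending on whether the current user may list pods.
  kubectl auth can-i list pods
}
```

After the aws-auth change is applied, the new person runs setup_kubeconfig and then check_access; a "yes" means kubectl commands should work for them.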

Kubectl commands cheatsheet

Full cheatsheet: https://kubernetes.io/docs/reference/kubectl/cheatsheet

N-Shar-ma commented 2 years ago

@OrkoHunter I tried following along with the tutorial and was able to do so smoothly up until step 3. In step 4, I was able to get the api key (by clicking on the big blue 'Get a Key' button), but I don't see any way to get the context key. Also, I don't see any "search the whole web" toggle. Could you please elaborate on how to get the google search credentials properly? Maybe I need access to a company account rather than a personal account?

This is how the console looks for me right now:

image

N-Shar-ma commented 2 years ago

@OrkoHunter @hargup The previous issue has been resolved, I only had to realise that the cx parameter or the search engine id is what you're referring to as the context key. Now however, I'm having trouble getting the pods to run on deployment. I tried deleting and recreating the deployment but the error (CrashLoopBackOff) persists. This is how my k9s terminal looks:

image

And this is the change made in the deployment-production.yaml file as instructed in step 5:

image

OrkoHunter commented 2 years ago

Hey @N-Shar-ma, can you please check/share some events from the cluster, using kubectl get events --sort-by='.metadata.creationTimestamp' -A?

(Let's maybe switch to the private-maintainers channel on Discord for this, as it could get too secret-y?)

(Edit: Aah yes, the context key was the cx parameter. It's also available in the URL of your application in the Google Search console)

N-Shar-ma commented 2 years ago

@OrkoHunter I was able to follow along with this guide quite easily, even without prior experience with kubernetes or aws. Here's the deployment: https://beta.felvin.com/

Only suggestions would be to:

Do let me know when I should delete everything (step 9)

hargup commented 2 years ago

@N-Shar-ma The beta website looks great! You can delete the cluster now.

hargup commented 2 years ago

@N-Shar-ma can put the tutorial on docs.felvin.com. You can create a new section for "tutorials" under "team docs".

You can make the suggested changes in the new page, me and Himanshu can review that.

OrkoHunter commented 2 years ago

@N-Shar-ma Glad to hear about your experience! :)

hargup commented 2 years ago

This was done.