In this project, students will apply the skills they have acquired in the Establish a Foundation in Observability course to configure a monitoring software stack that collects and displays a variety of metrics for commonly used AWS resources, including EC2 and EKS. Additionally, students will establish and configure alerting rules and set parameters so that they are notified before failures occur within the aforementioned cloud resources.
Students will also have the opportunity to test and observe their own implementation of the monitoring software stack, applying and showcasing SRE methodologies and practices that can be transferred to real-world scenarios.
git clone <repo url>
Set your AWS console region to us-east-1.
Open the CloudShell by clicking the shell icon in the toolbar at the top, near the search box.
Copy the AMI to your account:
Restore image
aws ec2 create-restore-image-task --object-key ami-08dff635fabae32e7.bin --bucket udacity-srend --name "udacity-<your_name>"
Take note of the AMI ID the command just output. Copy the AMI to us-east-2:
aws ec2 copy-image --source-image-id <your-ami-id-from-above> --source-region us-east-1 --region us-east-2 --name "udacity-<your_name>"
Make note of the AMI IDs output by the two commands above. You'll need to put them in the ec2.tf file.
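The AMI IDs can also be captured in shell variables instead of being copied by hand. The following is a minimal sketch, assuming the AWS CLI's global `--query` and `--output text` options; the function names `restore_ami` and `copy_ami` are hypothetical, and nothing runs until the functions are called:

```shell
# Restore the AMI from the S3 object and print only the new AMI ID.
# $1 is the image name, e.g. udacity-<your_name>.
restore_ami() {
  aws ec2 create-restore-image-task \
    --object-key ami-08dff635fabae32e7.bin \
    --bucket udacity-srend \
    --name "$1" \
    --query ImageId --output text
}

# Copy an AMI from us-east-1 to us-east-2 and print the new AMI ID.
# $1 is the source AMI ID, $2 is the image name.
copy_ami() {
  aws ec2 copy-image \
    --source-image-id "$1" \
    --source-region us-east-1 \
    --region us-east-2 \
    --name "$2" \
    --query ImageId --output text
}
```

For example, `east1_ami=$(restore_ami "udacity-<your_name>")` followed by `east2_ami=$(copy_ami "$east1_ami" "udacity-<your_name>")` leaves both IDs ready to paste into ec2.tf.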
Create a private key pair for your EC2 instance called udacity
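One way to create the key pair from CloudShell is sketched below. It assumes the `aws ec2 create-key-pair` command; the function name, the region argument, and the output file name are hypothetical, so use the region your Terraform files deploy the instance to:

```shell
# Create an EC2 key pair and save the private key locally.
# $1 is the key name, $2 the region, $3 the output file.
create_key_pair() {
  aws ec2 create-key-pair \
    --key-name "$1" \
    --region "$2" \
    --query KeyMaterial --output text > "$3"
  # Restrict permissions so that ssh accepts the key file.
  chmod 400 "$3"
}
```

Calling `create_key_pair udacity <region> udacity.pem` writes the private key to udacity.pem in the current directory.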
Use the Terraform files to provision each of the resources in AWS; this will take a few minutes to complete. Once provisioning is complete, go to the AWS console and look for the newly created resources in the EKS and EC2 areas.
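The usual Terraform workflow for this step can be sketched as follows. The function name and the directory argument are hypothetical; pass the directory in the cloned repo that contains ec2.tf. The `-chdir` option assumes Terraform 0.14 or later:

```shell
# Provision the resources from the directory that holds the .tf files.
# $1 is that directory. Each step runs only if the previous one succeeded.
provision() {
  terraform -chdir="$1" init &&
  terraform -chdir="$1" plan &&
  terraform -chdir="$1" apply
}
```

`terraform apply` prompts for confirmation before it creates anything.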
SSH into the EC2 instance with username ubuntu and the udacity key created in a previous step.
Install the node exporter on the EC2 instance. Don't forget to allow traffic on port 9100.
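A possible installation is sketched below, to be run on the EC2 instance. The version number, the install path, the `node_exporter` service user, and the function names are all assumptions; check the node_exporter releases page for a current version. Port 9100 still has to be opened in the instance's security group separately.

```shell
# Version is an assumption; override it with a current release if needed.
NODE_EXPORTER_VERSION="${NODE_EXPORTER_VERSION:-1.3.1}"

# Print the download URL for the linux-amd64 release tarball.
node_exporter_url() {
  echo "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
}

# Print a minimal systemd unit; node_exporter listens on port 9100 by default.
node_exporter_unit() {
  cat <<'EOF'
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
EOF
}

# Download, install, and start node_exporter as a systemd service.
install_node_exporter() {
  curl -fsSL "$(node_exporter_url)" | tar -xz -C /tmp &&
  sudo cp "/tmp/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter" /usr/local/bin/ &&
  { id -u node_exporter >/dev/null 2>&1 ||
    sudo useradd --no-create-home --shell /bin/false node_exporter; } &&
  node_exporter_unit | sudo tee /etc/systemd/system/node_exporter.service >/dev/null &&
  sudo systemctl daemon-reload &&
  sudo systemctl enable --now node_exporter
}
```

Running `install_node_exporter` performs the whole installation; the two helper functions only print text and are safe to inspect first.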
public-ip, username, email
then open the collection runner, choose the collection and environment, then run the project. You should see successful responses for each of the API endpoints in the collection.
token variable, and paste it somewhere safe; you will need it later.
Here are two examples of successful responses for the /init and /authorize/user endpoints:
/init
{
  "dataset": {
    "created": "Day, DD MM YYYY HH:MM:SS TZD",
    "description": "initialize the DB",
    "id": 1,
    "location": "home",
    "name": "init db"
  },
  "status": {
    "message": "101: Created.",
    "records": 1,
    "success": true
  }
}
/authorize/user
{
  "dataset": {
    "created": "<date-time>",
    "email": "<email>",
    "id": 1,
    "role": 0,
    "token": "<token>",
    "username": "<username>"
  },
  "status": {
    "message": "101: Created.",
    "records": 1,
    "success": true
  }
}
From this point forward, you will not need to Initialize the Database or Register a User.
Update your kubeconfig file by running:
aws eks --region <region> update-kubeconfig --name <cluster-name>
e.g. aws eks --region us-east-2 update-kubeconfig --name udacity-cluster
Create a namespace called monitoring.
Edit the prometheus-additional.yaml file and set the targets accordingly for both prometheus and blackbox.
Edit values.yaml (near line 2310) so that it matches:
additionalScrapeConfigsSecret:
  enabled: true
  name: additional-scrape-configs
  key: prometheus-additional.yaml
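These values expect a secret named additional-scrape-configs, with the key prometheus-additional.yaml, to exist in the monitoring namespace. A minimal sketch of creating it with `kubectl create secret generic`; the function name is hypothetical:

```shell
# Create the secret that values.yaml refers to.
# $1 is the path to your edited prometheus-additional.yaml.
create_scrape_secret() {
  kubectl create secret generic additional-scrape-configs \
    --from-file=prometheus-additional.yaml="$1" \
    --namespace monitoring
}
```

The `key=path` form of `--from-file` keeps the secret key equal to prometheus-additional.yaml regardless of where the file is stored.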
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
-f "path\to\values.yaml" --namespace monitoring
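The flags above belong to a `helm install` command. A sketch of what the full command might look like, assuming the kube-prometheus-stack chart from the prometheus-community repo (its default Grafana admin password is prom-operator); the release name `prometheus` and the function name are hypothetical:

```shell
# Install the Prometheus stack into the monitoring namespace.
# $1 is the path to values.yaml; the release name "prometheus" is an assumption.
install_prometheus_stack() {
  helm install prometheus prometheus-community/kube-prometheus-stack \
    -f "$1" --namespace monitoring
}
```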
user: admin & password: prom-operator
instance:node_cpu:rate:sum query.
node_memory_MemAvailable_bytes in the prometheus query.
node_disk_io_now in the prometheus query.
instance:node_network_receive_bytes:rate:sum in the prometheus query.
Make sure you have the token from earlier (obtained from the API), then in blackbox-values.yaml add the following starting at line 112:
valid_status_codes:
  - 200
  # - 401
  # - 403
bearer_token: <YOUR_TOKEN>
Save it, then install blackbox in the Kubernetes cluster. You will need to include -f "path\to\blackbox-values.yaml" --namespace monitoring
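A sketch of the blackbox install, assuming the prometheus-blackbox-exporter chart from the prometheus-community repo; the release name `blackbox` and the function name are hypothetical:

```shell
# Install the blackbox exporter into the monitoring namespace.
# $1 is the path to blackbox-values.yaml; the release name "blackbox" is an assumption.
install_blackbox() {
  helm install blackbox prometheus-community/prometheus-blackbox-exporter \
    -f "$1" --namespace monitoring
}
```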
In Grafana, import dashboard 7587.
probe_http_status_code for the prometheus query.
sudo systemctl status node_exporter
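Besides the systemd status, it can help to confirm that the exporter actually serves metrics. A small sketch, assuming node_exporter's default /metrics endpoint on port 9100; the function name is hypothetical:

```shell
# Succeeds if node_exporter on the given host returns node_* metrics.
# $1 is the host, e.g. localhost on the instance or its public IP from elsewhere.
check_node_exporter() {
  curl -fsS "http://$1:9100/metrics" | grep -q '^node_'
}
```

For example, `check_node_exporter localhost && echo "node_exporter is serving metrics"` on the EC2 instance. A failure from outside the instance while the local check passes usually points at port 9100 being closed in the security group.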
Example_Flask_API Prometheus Stack