OCI Kubernetes Monitoring Solution is a turn-key Kubernetes monitoring and management package based on OCI Logging Analytics cloud service, OCI Monitoring, OCI Management Agent and Fluentd.
It enables DevOps, Cloud Admins, Developers, and Sysadmins to monitor and manage Kubernetes across their entire environment, using Logs, Metrics, and Object metadata.
It does extensive enrichment of logs, metrics and object information to enable cross correlation across entities from different tiers in OCI Logging Analytics. A collection of dashboards is provided to get users started quickly.
:stop_sign: Upgrading to a major version (like 2.x to 3.x)? See upgrade section below for details. :warning:
Prerequisites: OCI Dynamic Groups, User Group, and Policies.
Deployment Method | Supported Environments | Solution UI | Dashboards | Customisations | Comments |
---|---|---|---|---|---|
OCI Logging Analytics Connect Cluster | OKE\*\*\* | :heavy_check_mark: | Manual | Partial Control (Recommended) | Customisations are possible through Helm once the solution is deployed using the Logging Analytics Connect Cluster flow from the Console; this applies to both Automatic and Manual Deployment modes. We recommend Manual Deployment mode for OKE clusters with a Private API Server endpoint, because automatic deployment is not yet supported for such clusters. |
Helm | All\* | :heavy_check_mark:\*\* | Manual | Full Control (Recommended) | |
OCI Resource Manager | OKE | :heavy_check_mark:\*\* | :heavy_check_mark: | Partial Control | Customisations are possible through Helm once deployed using OCI Resource Manager. |
Terraform | OKE | :heavy_check_mark:\*\* | :heavy_check_mark: | Partial Control | Customisations are possible through Helm once deployed using Terraform. |
kubectl | All\* | :heavy_check_mark:\*\* | Manual | Full Control (Not recommended) | |
\* For some environments, modification of the configuration may be required.
\*\* The Solution UI experience, including Topology and other visualisations, is available to customers who deploy the solution using methods other than OCI Logging Analytics Connect Cluster only if they follow the additional steps mentioned in the individual sections.
\*\*\* Connect Cluster support for EKS and clusters other than OKE (a partially automated flow) will be available soon. Meanwhile, to try the Solution on EKS, use Helm or one of the other deployment methods.
This newly launched UI-based workflow from the Logging Analytics Console is the recommended way to start enabling the Kubernetes Monitoring Solution for your OKE clusters. It takes you through a guided flow to enable monitoring, and it supports both Automatic and Manual deployment modes for installing the helm charts onto your OKE clusters. OCI resources such as the Logging Analytics LogGroup, Entity, and Management Agent Install Key are created automatically, irrespective of the deployment mode you choose. The IAM Dynamic Group and Policies required for collecting logs, metrics, and object discovery data into OCI can optionally be enabled as part of this flow.
Customisations are possible through helm once the solution is deployed using the Logging Analytics Connect Cluster flow from the Console; this applies to both Automatic and Manual Deployment modes. We recommend Manual Deployment mode for OKE clusters with a Private API Server endpoint, because automatic deployment is not yet supported for such clusters.
Refer to this doc for complete instructions on using this approach.
:hourglass_flowing_sand: Connect Cluster support for EKS and clusters other than OKE (a partially automated flow) will be available soon. Meanwhile, to try the Solution on EKS, use helm or one of the other deployment methods.
Prepare the Entity metadata that represents the Kubernetes cluster's details, and save it as entity_metadata.json.
{"items":[{"name":"cluster","value":"<Cluster_Name>_<Cluster_Creation_Time>","type":"k8s_solution"},{"name":"cluster_date","value":"<Cluster_Creation_Time>","type":"k8s_solution"},{"name":"cluster_name","value":"<Cluster_Name>","type":"k8s_solution"},{"name":"cluster_ocid","value":"<Unique_Identifier_of_Cluster>","type":"k8s_solution"},{"name":"deployment_stack_ocid","value":"NA","type":"k8s_solution"},{"name":"deployment_status","value":"NA","type":"k8s_solution"},{"name":"k8s_version","value":"<Kubernetes_Version>","type":"k8s_solution"},{"name":"metrics_namespace","value":"mgmtagent_kubernetes_metrics","type":"k8s_solution"},{"name":"name","value":"<Cluster_Name>_<Cluster_Creation_Time>","type":"k8s_solution"},{"name":"onm_compartment","value":"<O&M_Compartment_OCID>","type":"k8s_solution"},{"name":"solution_type","value":"<Cluster_Type>","type":"k8s_solution"}]}
<O&M_Compartment_OCID> is the OCID of the compartment where the Logging Analytics LogGroup exists. Note that for the Logging Analytics Solution UI to work properly, you must keep all your OCI resources, such as the Logging Analytics LogGroup, Logging Analytics Entity, and Management Agent Install Key, under the same compartment.
Create a Logging Analytics Entity of type Kubernetes Cluster using the metadata created above.
oci log-analytics entity create --name <Cluster_Name>_<Cluster_Creation_Time> --namespace-name <Tenancy_Namespace> --compartment-id <O&M_Compartment_OCID> --entity-type-name omc_kubernetes_cluster --metadata file://entity_metadata.json
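The two steps above (fill in the metadata template, then create the entity) can be sketched as a small script. This is a minimal sketch, not an official helper: every variable value below is a hypothetical example, and the format of the creation time and the `OKE` value for `solution_type` are assumptions. The script only writes `entity_metadata.json` and prints the `oci` command so that it can be reviewed before being run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical example values; replace every one of them with your own.
CLUSTER_NAME="demo-cluster"
CLUSTER_CREATION_TIME="20230101"            # format is an assumption
CLUSTER_OCID="ocid1.cluster.oc1..example"
K8S_VERSION="v1.27.2"
ONM_COMPARTMENT_OCID="ocid1.compartment.oc1..example"
CLUSTER_TYPE="OKE"                          # value is an assumption
TENANCY_NAMESPACE="example-namespace"

ENTITY_NAME="${CLUSTER_NAME}_${CLUSTER_CREATION_TIME}"

# Emit one metadata item in the shape used by the template above.
item() {
  printf '{"name":"%s","value":"%s","type":"k8s_solution"}' "$1" "$2"
}

{
  printf '{"items":['
  item cluster "$ENTITY_NAME"; printf ','
  item cluster_date "$CLUSTER_CREATION_TIME"; printf ','
  item cluster_name "$CLUSTER_NAME"; printf ','
  item cluster_ocid "$CLUSTER_OCID"; printf ','
  item deployment_stack_ocid "NA"; printf ','
  item deployment_status "NA"; printf ','
  item k8s_version "$K8S_VERSION"; printf ','
  item metrics_namespace "mgmtagent_kubernetes_metrics"; printf ','
  item name "$ENTITY_NAME"; printf ','
  item onm_compartment "$ONM_COMPARTMENT_OCID"; printf ','
  item solution_type "$CLUSTER_TYPE"
  printf ']}\n'
} > entity_metadata.json

# Print (rather than run) the entity creation command for review.
echo "oci log-analytics entity create --name ${ENTITY_NAME}" \
     "--namespace-name ${TENANCY_NAMESPACE}" \
     "--compartment-id ${ONM_COMPARTMENT_OCID}" \
     "--entity-type-name omc_kubernetes_cluster" \
     "--metadata file://entity_metadata.json"
```

Values are inserted verbatim, so the sketch assumes they contain no characters that need JSON escaping (such as double quotes or backslashes).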
Create OCI Logging Analytics LogGroup(s) if not done already. Refer to Create Log Group for details.
Note that for the Logging Analytics Solution UI to work properly, you must keep all your OCI resources, such as the Logging Analytics LogGroup, Logging Analytics Entity, and Management Agent Install Key, under the same compartment.
Create override_values.yaml to override the minimum required variables in values.yaml.
global:
  # -- OCID for OKE cluster or a unique ID for other Kubernetes clusters.
  kubernetesClusterID:
  # -- Provide a unique name for the cluster. This would help in uniquely identifying the logs and metrics data at OCI Logging Analytics and OCI Monitoring respectively.
  kubernetesClusterName:
oci-onm-logan:
  ociLANamespace:
  ociLALogGroupID:
  ociLAClusterEntityID:
oci-onm-mgmt-agent:
  mgmtagent:
    installKeyFileContent:
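As an illustration of the file described above, the following sketch writes a filled-in override_values.yaml from shell variables. All values are hypothetical examples, and the install key content is deliberately left as a placeholder because its expected format is not described here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical example values; replace with your own.
CLUSTER_ID="ocid1.cluster.oc1..example"
CLUSTER_NAME="demo-cluster"
LA_NAMESPACE="example-namespace"
LA_LOG_GROUP_ID="ocid1.loganalyticsloggroup.oc1..example"
LA_ENTITY_ID="ocid1.loganalyticsentity.oc1..example"
INSTALL_KEY_FILE_CONTENT="<install-key-file-content>"   # placeholder on purpose

# Refuse to write the file if any required value is empty.
for var in CLUSTER_ID CLUSTER_NAME LA_NAMESPACE LA_LOG_GROUP_ID \
           LA_ENTITY_ID INSTALL_KEY_FILE_CONTENT; do
  if [ -z "${!var}" ]; then
    echo "${var} must not be empty" >&2
    exit 1
  fi
done

cat > override_values.yaml <<EOF
global:
  kubernetesClusterID: ${CLUSTER_ID}
  kubernetesClusterName: ${CLUSTER_NAME}
oci-onm-logan:
  ociLANamespace: ${LA_NAMESPACE}
  ociLALogGroupID: ${LA_LOG_GROUP_ID}
  ociLAClusterEntityID: ${LA_ENTITY_ID}
oci-onm-mgmt-agent:
  mgmtagent:
    installKeyFileContent: ${INSTALL_KEY_FILE_CONTENT}
EOF

echo "Wrote override_values.yaml"
```

Keeping the values in variables makes it easy to generate one override file per cluster without editing the chart's values.yaml.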
Use the following helm install command to install the chart. Provide a desired release name, the path to override_values.yaml, and the path to the helm chart (oci-onm chart).
helm install <release-name> --values <path-to-override-values.yaml> <path-to-helm-chart>
Refer to this for further details on helm install.
Use the following helm upgrade command if any further changes to override_values.yaml need to be applied or a new chart version needs to be deployed.
helm upgrade <release-name> --values <path-to-override-values.yaml> <path-to-helm-chart>
Refer to this for further details on helm upgrade.
Note: If you have lost the override_values.yaml that was used when installing the chart, or you need the default one that was used when installing through other approaches such as OCI Logging Analytics Connect Cluster or OCI Resource Manager, run the following command to regenerate it.
helm get values <release-name> > override_values.yaml
The default release name used by OCI Logging Analytics Connect Cluster is oci-kubernetes-monitoring.
Dashboards need to be imported manually. Below is an example of importing dashboards using OCI CLI.
Download and configure OCI CLI, or open Cloud Shell, where OCI CLI is pre-installed. Alternative methods such as REST API, SDK, or Terraform can also be used.
Find the OCID of the compartment, where the dashboards need to be imported.
Download the dashboard JSONs from here.
Replace all instances of the keyword "${compartment_ocid}" in the JSONs with the Compartment OCID identified in the previous step.
The following command is a quick reference that can be used in a Linux or Cloud Shell environment:
sed -i "s/\${compartment_ocid}/<Replace-with-Compartment-OCID>/g" *.json
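The replacement step above can also be followed by a check that no placeholder was missed. The sketch below is self-contained: it works on a throwaway sample file whose content and field name are hypothetical, and the compartment OCID is an example value. It uses `sed -i.bak`, which leaves a backup copy next to each file.

```shell
#!/usr/bin/env bash
set -euo pipefail

COMPARTMENT_OCID="ocid1.compartment.oc1..example"   # hypothetical value

# Work on a throwaway sample so the demonstration is self-contained.
WORKDIR="$(mktemp -d)"
printf '{"compartmentId":"${compartment_ocid}","name":"demo"}\n' \
  > "${WORKDIR}/cluster.json"

for f in "${WORKDIR}"/*.json; do
  # [$] matches a literal dollar sign; the original file is kept as <name>.bak.
  sed -i.bak 's/[$]{compartment_ocid}/'"${COMPARTMENT_OCID}"'/g' "$f"
done

# List any file that still contains the placeholder and fail if one is found.
if grep -l '[$]{compartment_ocid}' "${WORKDIR}"/*.json; then
  echo "placeholder still present in the files listed above" >&2
  exit 1
fi
echo "all placeholders replaced in ${WORKDIR}"
```

To apply the same check to the real dashboards, point `WORKDIR` at the directory holding the downloaded JSONs and drop the sample file creation.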
Run the following commands to import the dashboards.
oci management-dashboard dashboard import --from-json file://cluster.json
oci management-dashboard dashboard import --from-json file://node.json
oci management-dashboard dashboard import --from-json file://workload.json
oci management-dashboard dashboard import --from-json file://pod.json
oci management-dashboard dashboard import --from-json file://service-type-lb.json
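The five import commands above can be wrapped in a loop. The sketch below defaults to a dry run that only prints each command; the `DRY_RUN` variable and the `import_dashboards` function are hypothetical names introduced for illustration. Setting `DRY_RUN=false` runs the same `oci` command shown above for each file that exists in the current directory.

```shell
#!/usr/bin/env bash
set -euo pipefail

DASHBOARDS=(cluster node workload pod service-type-lb)
DRY_RUN="${DRY_RUN:-true}"   # hypothetical switch; "true" only prints commands

import_dashboards() {
  local name file
  for name in "${DASHBOARDS[@]}"; do
    file="${name}.json"
    if [ "$DRY_RUN" = "true" ]; then
      echo "oci management-dashboard dashboard import --from-json file://${file}"
    elif [ -f "$file" ]; then
      oci management-dashboard dashboard import --from-json "file://${file}"
    else
      echo "skipping ${file}: not found" >&2
    fi
  done
}

import_dashboards
```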
Use the following helm uninstall command to uninstall the chart. Provide the release name that was used when installing the chart.
helm uninstall <release-name>
Refer to this for further details on helm uninstall.
Launch the OCI Resource Manager Stack in the OCI Tenancy and Region of the OKE cluster that you want to monitor.
One of the major changes introduced in 3.0.0 is a refactoring of the helm chart, in which the major features of the solution were split into separate sub-charts. 2.x supports only logs and objects collection using Fluentd and OCI Logging Analytics; this is now moved into a separate chart, oci-onm-logan, which is included as a sub-chart of the main chart, oci-onm. This is a breaking change with respect to values.yaml and any customisations that you might have made on top of it. There is no breaking change with respect to the functionality offered in 2.x. For the full list of changes in 3.x, refer to the changelog.
You may fall into one of the categories below and may need to take action accordingly.
We recommend that you uninstall the release created using the 2.x chart and follow the installation instructions mentioned here to install the release using the 3.x chart.
Sample 2.x values:
image:
  url: <Container Image URL>
  imagePullPolicy: Always
ociLANamespace: <OCI LA Namespace>
ociLALogGroupID: ocid1.loganalyticsloggroup.oc1.phx.amaaaaaa......
kubernetesClusterID: ocid1.cluster.oc1.phx.aaaaaaaaa.......
kubernetesClusterName: <Cluster Name>
Equivalent 3.x values:
global:
  # -- OCID for OKE cluster or a unique ID for other Kubernetes clusters.
  kubernetesClusterID: ocid1.cluster.oc1.phx.aaaaaaaaa.......
  # -- Provide a unique name for the cluster. This would help in uniquely identifying the logs and metrics data at OCI Logging Analytics and OCI Monitoring respectively.
  kubernetesClusterName: <Cluster Name>
oci-onm-logan:
  # Go to OCI Logging Analytics Administration, click Service Details, and note the namespace value.
  ociLANamespace: <OCI LA Namespace>
  # OCI Logging Analytics Log Group OCID
  ociLALogGroupID: ocid1.loganalyticsloggroup.oc1.phx.amaaaaaa......
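The move from the flat 2.x layout to the nested 3.x layout can be sketched as a small conversion script. This is only an illustration under strong assumptions: the 2.x file is a hypothetical sample, `get_value` is a hypothetical helper that reads simple top-level `key: value` lines (no quoting, comments, or nesting), and only the four keys that appear in both samples above are mapped. Other settings, such as the image settings or the additional 3.x keys like ociLAClusterEntityID, still need to be reviewed by hand.

```shell
#!/usr/bin/env bash
set -euo pipefail

WORKDIR="$(mktemp -d)"
OLD="${WORKDIR}/values-2x.yaml"
NEW="${WORKDIR}/override_values.yaml"

# Hypothetical 2.x file in the flat shape shown above.
cat > "$OLD" <<'EOF'
image:
  url: example.registry/example-image:tag
  imagePullPolicy: Always
ociLANamespace: example-namespace
ociLALogGroupID: ocid1.loganalyticsloggroup.oc1..example
kubernetesClusterID: ocid1.cluster.oc1..example
kubernetesClusterName: demo-cluster
EOF

# Print the value of a top-level "key: value" line (assumes the key is unique).
get_value() {
  sed -n "s/^$1:[[:space:]]*//p" "$2"
}

# Re-emit the same values under the 3.x hierarchy.
cat > "$NEW" <<EOF
global:
  kubernetesClusterID: $(get_value kubernetesClusterID "$OLD")
  kubernetesClusterName: $(get_value kubernetesClusterName "$OLD")
oci-onm-logan:
  ociLANamespace: $(get_value ociLANamespace "$OLD")
  ociLALogGroupID: $(get_value ociLALogGroupID "$OLD")
EOF

cat "$NEW"
```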
If you have modified the values.yaml provided in the helm chart directly, we recommend that you identify all the changes, move them to override_values.yaml, and follow the instructions in the install or upgrade sections. We recommend using override_values.yaml to update the values of any variables or to incorporate any customisations on top of the existing values.yaml.
If you are already using a separate values.yaml for your customisations, you still need to compare the 2.x and 3.x variable hierarchies and make the necessary changes.
Copyright (c) 2023, Oracle and/or its affiliates. Licensed under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl.