ibm-cloud-docs / monitoring

Documentation repository for monitoring

LogDNA/Sysdig integration to cluster #50

Open testrashmi opened 8 months ago

testrashmi commented 8 months ago

Hello Team,

We reached out here too: https://ibm-cloudplatform.slack.com/archives/C03UVUZFK/p1698811043493859

This feedback came from a White Gloves Premium customer: CS3633719

It concerns the integration of LogDNA and Sysdig from the Kubernetes/OpenShift cluster console page:

image

The Cloud Docs do not mention the following details:

  1. When Connect is selected: the LogDNA or Sysdig agents spin up automatically. The agents do not have to be installed explicitly on the host machines.

  2. When Disconnect is selected: the LogDNA or Sysdig agents terminate. The namespace is not deleted automatically; the user has to delete it.

  3. The "logdna-agent" already exists error still appears when executing the commands, even after the agents are terminated.

     Example:

     oc adm new-project --node-selector='' ibm-observe

     Install the OpenShift DaemonSet:

     oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3/agent-resources-openshift.yaml -n ibm-observe

     or

     oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3/agent-resources-openshift-private.yaml -n ibm-observe

     Errors like these appear:

     Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": priorityclasses.scheduling.k8s.io "logdna-agent-ds-priority" already exists
     Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": clusterroles.rbac.authorization.k8s.io "logdna-agent" already exists
     Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": clusterrolebindings.rbac.authorization.k8s.io "logdna-agent" already exists


How to overcome this:

Replace create with delete in the command above:

Example: oc delete -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml -n ibm-observe

Warning: deleting cluster-scoped resources, not scoped to the provided namespace
priorityclass.scheduling.k8s.io "logdna-agent-ds-priority" deleted
role.rbac.authorization.k8s.io "logdna-agent" deleted
rolebinding.rbac.authorization.k8s.io "logdna-agent" deleted
clusterrole.rbac.authorization.k8s.io "logdna-agent" deleted
clusterrolebinding.rbac.authorization.k8s.io "logdna-agent" deleted
daemonset.apps "logdna-agent" deleted

On executing this command again:

oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml -n ibm-observe

The error no longer appears.
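The delete-then-create workaround could be wrapped in a small shell function. This is only a sketch, assuming bash and an `oc` session that is already logged in to the cluster. The function name `reinstall_logdna_agent` is hypothetical and not part of any IBM tooling. It passes `--ignore-not-found` to the delete step so that a manifest with nothing left to delete does not stop the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove leftover agent resources, then create them again.
# $1 = URL of the agent manifest, $2 = namespace (defaults to ibm-observe).
reinstall_logdna_agent() {
  local manifest_url="$1"
  local namespace="${2:-ibm-observe}"

  if [ -z "$manifest_url" ]; then
    echo "usage: reinstall_logdna_agent <manifest-url> [namespace]" >&2
    return 2
  fi

  # Delete everything the manifest defines, including cluster-scoped leftovers
  # such as the PriorityClass, ClusterRole, and ClusterRoleBinding.
  # Objects that are already gone are skipped instead of causing an error.
  oc delete -f "$manifest_url" -n "$namespace" --ignore-not-found || return 1

  # Create the resources again now that nothing conflicts.
  oc create -f "$manifest_url" -n "$namespace"
}
```

It would be called with the same manifest URL used in the commands above, for example `reinstall_logdna_agent <manifest-url> ibm-observe`.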


testrashmi commented 8 months ago

More detailed steps are explained here:

Prerequisites:

The LogDNA and Sysdig instances are already provisioned.

The OpenShift or Kubernetes cluster is provisioned.

Logging and Monitoring are enabled while provisioning the cluster.

image

After the cluster is provisioned, the Integration section of the console page looks like this:

image

Case 1: The screen above shows that the Logging and Monitoring instances are already integrated.

Let’s check whether the LogDNA and Sysdig agents are running.

From the CLI or Cloud Shell:

kubectl get pods -A

The agents are shown in the ibm-observe namespace.

image

kubectl get pods -n ibm-observe

The agents are shown as running.
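The same check can be scripted instead of read from the screen. This is a minimal sketch, assuming bash and a configured `kubectl`. The function name `count_running_pods` is hypothetical. It relies on the default `kubectl get pods` column layout, where the third column is the pod status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many pods in a namespace are in the Running state.
# $1 = namespace (defaults to ibm-observe).
count_running_pods() {
  local namespace="${1:-ibm-observe}"

  # With --no-headers, each output line is: NAME READY STATUS RESTARTS AGE.
  # Count the lines whose STATUS column is exactly "Running".
  kubectl get pods -n "$namespace" --no-headers 2>/dev/null |
    awk '$3 == "Running" { n++ } END { print n + 0 }'
}
```

A result of 0 after a disconnect would match the behavior described in this issue, where the agents terminate and are deleted.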

image

Case 2: Disconnect logging and monitoring.

Under Integration on the OpenShift cluster console page:

Select Disconnect for logging

image

From the CLI or Cloud Shell: after the disconnect, the LogDNA agents terminate and are deleted.

image image

Select Disconnect for Monitoring

image

The Integration section of the cluster console page looks like this:

image

After the disconnect, the Sysdig agents terminate and are deleted.

The agents for both LogDNA and Sysdig are removed.

image

kubectl get ns

The leftover ibm-observe namespace has no adverse effect, but it can be deleted.

image

kubectl delete ns ibm-observe

image image

No traces of the ibm-observe namespace or the agents remain.
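A guarded version of this cleanup might look like the sketch below, assuming bash and a configured `kubectl`. The function name `delete_namespace_if_empty` is hypothetical. It refuses to delete the namespace while any pods still exist in it, which avoids removing agents that are still connected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the namespace only when no pods are left in it.
# $1 = namespace (defaults to ibm-observe).
delete_namespace_if_empty() {
  local namespace="${1:-ibm-observe}"
  local remaining

  # Count the pods that still exist in the namespace, whatever their state.
  remaining=$(kubectl get pods -n "$namespace" --no-headers 2>/dev/null |
    awk 'END { print NR }')

  if [ "$remaining" -gt 0 ]; then
    echo "$namespace still has $remaining pod(s); not deleting" >&2
    return 1
  fi

  # --ignore-not-found makes the call a no-op when the namespace is already gone.
  kubectl delete ns "$namespace" --ignore-not-found
}
```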

Case 3: Adding agents manually from Logging sources and clicking Connect [with existing instance] under Integration on the cluster console page.

Prerequisite: The namespace 'ibm-observe' does not exist, and no pods are running in it.

Add the namespace "ibm-observe" explicitly.

image

oc adm new-project --node-selector='' ibm-observe

oc create serviceaccount logdna-agent -n ibm-observe
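Both commands fail if the project or service account already exists, so a script may want to check first. This is a minimal sketch, assuming bash and a logged-in `oc`. The function name `ensure_observe_project` is hypothetical. It relies on `oc get` exiting non-zero when the object is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the project and the agent service account
# only if they are missing. $1 = namespace (defaults to ibm-observe).
ensure_observe_project() {
  local namespace="${1:-ibm-observe}"

  # `oc get` exits non-zero when the namespace does not exist.
  if ! oc get namespace "$namespace" >/dev/null 2>&1; then
    oc adm new-project --node-selector='' "$namespace" || return 1
  fi

  # Same check for the service account used by the LogDNA agent.
  if ! oc get serviceaccount logdna-agent -n "$namespace" >/dev/null 2>&1; then
    oc create serviceaccount logdna-agent -n "$namespace" || return 1
  fi
}
```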

image

No pods are running in that namespace because the cluster is not connected to LogDNA or Sysdig.

image image

Now explicitly integrate with LogDNA.

Under Integration: Logging, click Connect [with existing instance].

image image

Once integrated with LogDNA, the agents come up.

image

Disconnect from the LogDNA instance.

image image

The agents terminate.

image image image image image

Case 4: The "logdna-agent" already exists error still appears when executing the commands, even after the agents are terminated by the steps in case 1, 2, or 3.

Steps to reproduce:

Navigate here:

image

Execute: oc adm new-project --node-selector='' ibm-observe

Install the OpenShift DaemonSet:

oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3/agent-resources-openshift.yaml -n ibm-observe

or

oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3/agent-resources-openshift-private.yaml -n ibm-observe

image

Errors like these appear:

Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": priorityclasses.scheduling.k8s.io "logdna-agent-ds-priority" already exists
Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": clusterroles.rbac.authorization.k8s.io "logdna-agent" already exists
Error from server (AlreadyExists): error when creating "https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml": clusterrolebindings.rbac.authorization.k8s.io "logdna-agent" already exists
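All three errors name cluster-scoped resources, which is why they outlive the agents and the namespace. To see at a glance which objects are left over, the error output can be filtered. This is a sketch, assuming bash and that each error arrives on its own line, as `oc` normally prints it. The function name `list_already_existing` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `oc create` output on stdin and print one
# "<resource-type> <name>" pair for every AlreadyExists error line.
list_already_existing() {
  # Capture the resource type and the quoted object name that directly
  # precede the words "already exists"; all other lines are dropped.
  sed -n 's/.*: \([a-z0-9.]*\) "\([^"]*\)" already exists.*/\1 \2/p'
}
```

It would be used by piping both output streams of the create command into it, for example `oc create -f <manifest-url> -n ibm-observe 2>&1 | list_already_existing`.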

How to overcome this:

Replace create with delete in the command above:

Example: oc delete -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml -n ibm-observe

Warning: deleting cluster-scoped resources, not scoped to the provided namespace
priorityclass.scheduling.k8s.io "logdna-agent-ds-priority" deleted
role.rbac.authorization.k8s.io "logdna-agent" deleted
rolebinding.rbac.authorization.k8s.io "logdna-agent" deleted
clusterrole.rbac.authorization.k8s.io "logdna-agent" deleted
clusterrolebinding.rbac.authorization.k8s.io "logdna-agent" deleted
daemonset.apps "logdna-agent" deleted

On executing this command again:

oc create -f https://assets.au-syd.logging.cloud.ibm.com/clients/logdna-agent/3.8.2/agent-resources-openshift-private.yaml -n ibm-observe

The error no longer appears.

testrashmi commented 8 months ago

logdna sysdig agent.docx

vanadiscrawford commented 8 months ago

I have opened an issue in our Observability Docs repo to look into this: https://github.ibm.com/Observability/docs/issues/420