RamenDR / ramen

Apache License 2.0

Add simple gather tool #1345

Closed nirs closed 3 months ago

nirs commented 4 months ago

The tool gathers all api resources and pod container logs from specified clusters.

For every api resource, the tool gathers all resources and dumps them into:

If a _do_kind() method exists for a resource kind, it is called to do additional processing.
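The per-kind hook lookup described above could be sketched as follows. This is only an illustration of the dispatch pattern, not the tool's actual code: the `Gatherer` class, the `gather_resource()` method, and the kind name `pods` are hypothetical names chosen for the example.

```python
class Gatherer:
    """Hypothetical gatherer used only to illustrate _do_kind() dispatch."""

    def __init__(self):
        # Records which extra processing steps ran (illustration only).
        self.extra = []

    def gather_resource(self, kind, name):
        # Default handling for every resource: dump it as yaml.
        dumped = f"{kind}/{name}.yaml"
        # Look up an optional per-kind hook, e.g. _do_pods() for kind "pods".
        hook = getattr(self, f"_do_{kind}", None)
        if callable(hook):
            hook(name)
        return dumped

    def _do_pods(self, name):
        # Placeholder for extra processing such as collecting container logs.
        self.extra.append(("pods", name))
```

With this shape, supporting extra processing for another kind only requires adding a method with the matching name; kinds without a hook get the default dump and nothing else.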

We have special processing for:

Extra data collected for a resource is stored in a directory near the resource yaml:

Example pod data:

gather/
  dr1/
    namespaces/
      rook-ceph/
        pods/
          rook-ceph-rbd-mirror-a-58b8454f67-9lx5c/
            rbd-mirror/
              current.log
              previous.log
            log-collector/
              current.log
              previous.log
          rook-ceph-rbd-mirror-a-58b8454f67-9lx5c.yaml

Example node data:

gather/
  dr1/
    cluster/
      nodes/
        dr1/
          rook-ceph/
            log/
              1319a92b-2ab0-4e58-8d74-c81e5edab5c1-client.rbd-mirror-peer.log
              1319a92b-2ab0-4e58-8d74-c81e5edab5c1-client.rbd-mirror-peer.log.1.gz
              ceph-client.ceph-exporter.log
              ceph-client.rbd-mirror.a.log
              ceph-client.rbd-mirror.a.log.1.gz
        dr1.yaml
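The layout in the pod and node examples above can be expressed as a small path helper. This is a sketch that assumes namespaced resources go under `namespaces/<namespace>/` and cluster scoped resources under `cluster/`, as the two trees suggest; `resource_paths()` and `container_log_paths()` are hypothetical helpers, not functions from the tool.

```python
from pathlib import PurePosixPath


def resource_paths(root, cluster, namespace, kind, name):
    """Return (yaml_path, extra_dir) for a resource.

    The extra data directory sits next to the resource yaml and shares
    its name without the .yaml suffix.
    """
    base = PurePosixPath(root, cluster)
    if namespace:
        base = base / "namespaces" / namespace
    else:
        # Assumption: cluster scoped resources (e.g. nodes) live here.
        base = base / "cluster"
    kind_dir = base / kind
    return kind_dir / f"{name}.yaml", kind_dir / name


def container_log_paths(extra_dir, container):
    """Return (current, previous) log paths for one container of a pod."""
    container_dir = extra_dir / container
    return container_dir / "current.log", container_dir / "previous.log"
```

For example, `resource_paths("gather", "dr1", None, "nodes", "dr1")` yields `gather/dr1/cluster/nodes/dr1.yaml` and the directory `gather/dr1/cluster/nodes/dr1`, matching the node example.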

Command output is stored in a special "commands" directory under the cluster directory:

gather/
  dr1/
    commands/
      ceph-status
      ceph-osd-blocklist-ls
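Storing command output this way could look roughly like the sketch below. The helper `save_command_output()` is hypothetical, and the command passed to it is the caller's choice; how the tool actually runs the ceph commands is not shown in this description.

```python
import subprocess
import sys
from pathlib import Path


def save_command_output(cluster_dir, name, argv):
    """Run argv and store its combined output in <cluster_dir>/commands/<name>."""
    commands_dir = Path(cluster_dir) / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    out = commands_dir / name
    with open(out, "wb") as f:
        # Keep stderr with stdout so failures are visible in the gathered file.
        # check=False: a failing command should not abort the whole gather.
        subprocess.run(argv, stdout=f, stderr=subprocess.STDOUT, check=False)
    return out


if __name__ == "__main__":
    # Stand-in command for demonstration; a real run would pass the command
    # whose output should end up in e.g. gather/dr1/commands/ceph-status.
    path = save_command_output("gather/dr1", "example", [sys.executable, "--version"])
    print(path)
```

Writing each command to its own file named after the command keeps the output easy to find and easy to compare between two gathers.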

Example usage - entire env

Example run on a regional DR environment with one protected application running for 90 minutes:

$ scripts/gather dr1
2024-04-20 19:48:12,929 INFO    [dr1] Gathering data from cluster dr1
2024-04-20 19:48:25,343 INFO    [dr1] Gathered data in 12.413 seconds

$ scripts/gather dr2
2024-04-20 19:48:38,688 INFO    [dr2] Gathering data from cluster dr2
2024-04-20 19:48:51,138 INFO    [dr2] Gathered data in 12.450 seconds

$ scripts/gather hub
2024-04-20 19:48:56,486 INFO    [hub] Gathering data from cluster hub
2024-04-20 19:49:05,338 INFO    [hub] Gathered data in 8.852 seconds

$ du -sh gather/*
41M     gather/dr1
40M     gather/dr2
16M     gather/hub

$ tar czf gather.tar.gz gather

$ du -h gather.tar.gz
12M     gather.tar.gz

Example usage - single namespace

$ scripts/gather dr1 -n deployment-rbd -o gather-deployment-rbd.1
2024-04-20 21:07:41,712 INFO    [dr1] Gathering data from cluster dr1
2024-04-20 21:07:43,695 INFO    [dr1] Gathered data in 1.984 seconds

$ scripts/gather dr1 -n deployment-rbd -o gather-deployment-rbd.2
2024-04-20 21:45:39,691 INFO    [dr1] Gathering data from cluster dr1
2024-04-20 21:45:41,865 INFO    [dr1] Gathered data in 2.174 seconds

$ diff -ur gather-deployment-rbd.1 gather-deployment-rbd.2
...
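When the full `diff -ur` output is too noisy, a short script can list only which files changed between two gathers. This is a sketch using the standard library; `changed_files()` is a hypothetical helper and not part of the tool.

```python
import filecmp
import os


def changed_files(old, new):
    """Return sorted relative paths that differ or exist in only one directory."""

    def files(root):
        found = set()
        for dirpath, _, filenames in os.walk(root):
            for filename in filenames:
                full = os.path.join(dirpath, filename)
                found.add(os.path.relpath(full, root))
        return found

    old_files, new_files = files(old), files(new)
    # Files present in only one of the two gathers.
    changed = old_files ^ new_files
    for rel in old_files & new_files:
        # shallow=False compares file contents, not just size and mtime.
        same = filecmp.cmp(
            os.path.join(old, rel), os.path.join(new, rel), shallow=False
        )
        if not same:
            changed.add(rel)
    return sorted(changed)


if __name__ == "__main__":
    for path in changed_files("gather-deployment-rbd.1", "gather-deployment-rbd.2"):
        print(path)
```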

Example tarballs

Status

Fixes #1283

nirs commented 3 months ago

Replaced with a kubectl plugin.