scality / metalk8s

An opinionated Kubernetes distribution with a focus on long-term on-prem deployments
Apache License 2.0

Extend `metalk8s-utils` image content #2156

Open NicolasT opened 4 years ago

NicolasT commented 4 years ago

If, thanks to #2146, the metalk8s-utils image starts to get used more often, we must ensure it contains all the tools people use to debug/troubleshoot a system. We currently only include a couple:

I bet there's a bunch of other useful tools we could/should add. This ticket is meant to collect those, so please comment.

NicolasT commented 4 years ago

Some I was thinking of (in no particular order):

slaperche-scality commented 4 years ago

I would add:

dfc may also be a nice alternative to df (color, export to CSV, filter by FS type, …).

Maybe also a top-like tool for the network (jnettop or nethogs come to mind).

tshark may also be useful if one wants to do more advanced analysis than what's supported by tcpdump.

Dunno if nmap would make sense as a troubleshooting tool?

gdemonet commented 4 years ago

There was a talk at KubeCon NA '19 about how to use a sidecar container for debugging and profiling. It included a bunch of useful tools, so maybe it would be worth considering as well: https://kccncna19.sched.com/event/UaXU/debugging-live-applications-the-kubernetes-way-from-a-sidecar-joe-elliott-grafana-labs

gdemonet commented 4 years ago

@NicolasT you suggested kubectl, salt and the like, which we already have available in our repos. Is it really necessary to pre-install them in this image (whose size would thus grow), or could we simply set up our internal repositories in the image and let users install whatever they want, or even build other images on top of this one?

slaperche-scality commented 4 years ago

Preinstalling them would make it possible to use them even if our repos are down or broken for some reason, no?

NicolasT commented 4 years ago

> There was a talk at KubeCon NA '19 about how to use a sidecar container for debugging and profiling. It included a bunch of useful tools, so maybe it would be worth considering as well: https://kccncna19.sched.com/event/UaXU/debugging-live-applications-the-kubernetes-way-from-a-sidecar-joe-elliott-grafana-labs

The intent is (for now) not to act as a debug container / detachable sidecar (we don't support that yet). However, that could be useful over time.

NicolasT commented 4 years ago

> @NicolasT you suggested kubectl, salt and the like, which we already have available in our repos. Is it really necessary to pre-install them in this image (whose size would thus grow), or could we simply set up our internal repositories in the image and let users install whatever they want, or even build other images on top of this one?

The idea is to come up with a couple of Pod manifests that could be used to deploy a container with this image and get a shell, with various degrees of 'host-level access'.

As such, we could create Pod templates to e.g. run on the salt-master node and expose /var/run/salt from the host in the container so the salt tools work (and the same for salt-call on non-salt-master nodes), expose /etc/kubernetes/admin.conf into a container, etc.

Then also, run such a container as privileged, expose the host FS into the container, etc.

So indeed, some of the tools could be installed on the host. However, for 'ease of use', it may be useful to have a container which includes all those tools, and have a way to deploy it on some host, then troubleshoot things using those tools from inside the container, but as if you're on the host (to a large extent).
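To make the 'various degrees of host-level access' idea a bit more concrete, here is a rough sketch of how such Pod manifests could be generated. It is only an illustration under assumptions: the image reference, Pod names, node names, the /host mount point and the helper function are all hypothetical, and only the /var/run/salt and /etc/kubernetes/admin.conf paths come from the discussion above.

```python
import json

# Hypothetical image reference: the real registry path and tag are not given in this thread.
IMAGE = "metalk8s-utils:latest"


def debug_pod(name, node, host_paths=(), privileged=False, host_root=False):
    """Build a Pod manifest (as a dict) that runs the utils image on a given node.

    host_paths is a list of (volume_name, path, hostPath_type) tuples; each host
    path is mounted at the same path inside the container.
    """
    volumes, mounts = [], []
    for vol_name, path, path_type in host_paths:
        volumes.append({"name": vol_name, "hostPath": {"path": path, "type": path_type}})
        mounts.append({"name": vol_name, "mountPath": path})
    if host_root:
        # Expose the whole host filesystem under /host (an arbitrary mount point).
        volumes.append({"name": "host-root", "hostPath": {"path": "/", "type": "Directory"}})
        mounts.append({"name": "host-root", "mountPath": "/host"})
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeName": node,  # pin the Pod to the node being troubleshot
            "restartPolicy": "Never",
            "containers": [{
                "name": "utils",
                "image": IMAGE,
                # Keep the container alive so one can `kubectl exec` into it.
                "command": ["sleep", "infinity"],
                "securityContext": {"privileged": privileged},
                "volumeMounts": mounts,
            }],
            "volumes": volumes,
        },
    }


if __name__ == "__main__":
    # Lower access level: only the Salt runtime directory and the admin kubeconfig.
    # "bootstrap" is a placeholder node name.
    pod = debug_pod(
        "utils-salt-master",
        "bootstrap",
        host_paths=[
            ("salt-run", "/var/run/salt", "Directory"),
            ("admin-conf", "/etc/kubernetes/admin.conf", "File"),
        ],
    )
    print(json.dumps(pod, indent=2))
```

Since kubectl accepts JSON manifests, the output could be piped into `kubectl apply -f -`, followed by `kubectl exec -it utils-salt-master -- bash` to get a shell. Calling `debug_pod(..., privileged=True, host_root=True)` would give the 'almost on the host' variant.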

NicolasT commented 4 years ago

One more: nsenter

NicolasT commented 4 years ago

https://github.com/containernetworking/cni/tree/master/cnitool could be useful as well.

NicolasT commented 4 years ago

https://github.com/microsoft/ethr which can, unlike iperf(3), actually fill a 25Gbit/s+ pipe.

NicolasT commented 4 years ago

less, anyone?

NicolasT commented 4 years ago

Re-opening since not everything listed here is included through #2374.

NicolasT commented 3 years ago

One to add, since the image is based on CentOS 7: gdisk.