Closed danwinship closed 9 years ago
Should probably have `ovs-ofctl -O OpenFlow13 show br0` too, so that we get the real port numbers that match up with the ones in the flow tables.
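To illustrate the point about matching port numbers: a sketch of pulling the OpenFlow port-number-to-interface mapping out of `ovs-ofctl show` output, so that a flow-table entry like `in_port=3` can be tied back to a veth device. The sample text below only stands in for the command's real output (port names and addresses are made up); the parsing relies on the ` N(ifname):` line format.

```shell
#!/bin/sh
# Sketch: extract "port-number interface-name" pairs from
# 'ovs-ofctl -O OpenFlow13 show br0'-style output. The sample below is
# illustrative stand-in data, not captured from a real switch.
sample_show_output='
 1(vxlan0): addr:aa:bb:cc:dd:ee:01
 2(tun0): addr:aa:bb:cc:dd:ee:02
 3(veth1a2b3c4): addr:aa:bb:cc:dd:ee:03
'

# Split on parentheses; field 1 is the port number, field 2 the interface.
printf '%s\n' "$sample_show_output" |
    awk -F'[()]' '/^ *[0-9]+\(/ { gsub(/ /, "", $1); print $1, $2 }'
```

On a live node the pipeline would read from `ovs-ofctl -O OpenFlow13 show br0` instead of the heredoc-style sample.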
Also let's do `systemctl status openshift-[master|node]` too; that should get us the command line openshift was launched with and any very recent log messages.
And we might as well grab `/etc/sysconfig/network-scripts/ifcfg-*`, `nmcli dev`, `nmcli con`, and NM journal output too, just to isolate any network setup problems that `ip a` and `ip r` don't show.
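A minimal sketch of collecting that network-setup information into one directory, with guards so that it degrades gracefully on hosts where a tool is missing. The output directory name and file names are my assumptions, not the actual script's layout:

```shell
#!/bin/sh
# Sketch: gather NetworkManager / ifcfg state for later inspection.
# OUTDIR and the output file names are illustrative assumptions.
OUTDIR="${OUTDIR:-/tmp/network-debug}"
mkdir -p "$OUTDIR"

# Copy interface config files if any exist on this host.
cp /etc/sysconfig/network-scripts/ifcfg-* "$OUTDIR/" 2>/dev/null || true

# Capture NetworkManager device/connection state; skip if nmcli is absent.
if command -v nmcli >/dev/null 2>&1; then
    nmcli -f all dev > "$OUTDIR/nmcli-dev" 2>&1
    nmcli -f all con > "$OUTDIR/nmcli-con" 2>&1
fi

# NetworkManager journal output for the current boot, if journalctl exists.
if command -v journalctl >/dev/null 2>&1; then
    journalctl --boot --unit NetworkManager > "$OUTDIR/nm-journal" 2>&1 || true
fi
```

Running it as root on the master and each node would leave one directory per host that can be tarred up alongside the rest of the debug output.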
I think we must decide on the operational method of this script (ssh to master/nodes, etc.). All other improvements can come as further PRs.
I had one proposal:
The above can be done all at once when passwordless ssh is allowed between master and nodes; otherwise it will need manual runs.
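The fan-out under discussion could look something like the sketch below: the master sshes to each node in batch mode, so the "passwordless ssh required" assumption fails fast and loudly instead of prompting. The node names, ssh options, and the remote collection command (`openshift-sdn-debug-node`) are placeholders, not the script's real interface.

```shell
#!/bin/sh
# Sketch: run a per-node collection command from the master over
# passwordless ssh. SSH and REMOTE_CMD are illustrative placeholders.
SSH="${SSH:-ssh -o BatchMode=yes -o ConnectTimeout=5}"
REMOTE_CMD="${REMOTE_CMD:-openshift-sdn-debug-node}"   # hypothetical helper

collect_from() {
    # BatchMode=yes makes ssh fail immediately rather than prompt for a
    # password, which is the manual-run fallback case mentioned above.
    if $SSH "root@$1" "$REMOTE_CMD"; then
        echo "collected from $1"
    else
        echo "could not reach $1; run the node collection there by hand" >&2
    fi
}

for node in "$@"; do
    collect_from "$node"
done
```

Usage would be something like `sh fanout.sh node1.example.com node2.example.com`; nodes that refuse the key-based login get reported on stderr and can be handled manually.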
Pushed a new version:

- `ovs-ofctl -O OpenFlow13 show br0`, as suggested by dcbw
- `systemctl show openshift-{master,node}`, as sort-of-suggested by dcbw; as opposed to `systemctl status`, this includes more data and is more machine-parseable. It doesn't include journal output, but that's OK because we already have that separately.
- `/etc/sysconfig/network-scripts/ifcfg-*`, `nmcli -f all dev`, and `nmcli -f all con`, as suggested by dcbw (but with the addition of `-f all`). This doesn't add NM journal output since we already have that in the main journal file.

Sample output at http://people.redhat.com/dwinship/openshift-sdn-debug-2015-09-18.tgz
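The "more machine-parseable" point is that `systemctl show` emits `KEY=VALUE` lines, so individual properties can be pulled out mechanically. A small sketch, using a simplified stand-in for real `systemctl show openshift-master` output (the property values below are made up):

```shell
#!/bin/sh
# Sketch: extract a single property from 'systemctl show'-style KEY=VALUE
# output. 'sample' is illustrative stand-in data, not captured output.
sample='Id=openshift-master.service
ActiveState=active
ExecStart={ path=/usr/bin/openshift ; argv[]=/usr/bin/openshift start master }'

get_prop() {
    printf '%s\n' "$sample" | sed -n "s/^$1=//p"
}

get_prop ActiveState   # prints: active
```

On a live system, `systemctl show -p ActiveState openshift-master` would do the same selection directly.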
Looks excellent.
This is still slightly a work in progress, but it's basically working (unless people hate what it does and want a total rewrite...).
This adds a script which you can run on the OpenShift master, which will gather data there, on each node, and in each running pod, which can then be sent to a human for debugging purposes. (Automatically diagnosing problems comes next.) Currently this includes:

- `journalctl --unit openshift-master.service` on the master and `journalctl --unit openshift-node.service` on each node
- `journalctl --boot` on the master and each node. (FIXME: too invasive? Might make admins nervous...)
- `ip a` and `ip r` on the master, each node, and inside each pod
- `iptables-save` on the master and each node
- `/etc/hosts` from the master and each node
- `oc get nodes -o json` and `oc get pods --all-namespaces -o json`
- `brctl show` on each node
- `node-config.yaml` from each node
- `ovs-ofctl -O OpenFlow13 dump-flows br0` on each node
- `ovs-appctl ofproto/trace` outputs on each node, showing traces of up to four different pairs (send/receive) of pod traffic (as many of "packets between local pods in the same namespace", "packets between local pods in different namespaces", "packets between local and remote pods in the same namespace", and "packets between local and remote pods in different namespaces" as it's possible to show given the currently running pods).

One catch is that it requires that root@master be able to ssh to root on each node without needing a password. Alternatively, maybe it would make more sense to have the script run from an outside machine that can ssh to root on the master and each node, rather than running it from the master?
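The four-way pair selection for the `ofproto/trace` step can be sketched as follows: given the running pods (name, node, namespace), pick at most one sender/receiver pair for each category relative to a chosen local node. The pod list here is made-up sample data, and this is my reconstruction of the selection logic, not the script's actual code:

```shell
#!/bin/sh
# Sketch: classify pod pairs into the four trace categories
# (local/remote peer x same/different namespace) and keep the first
# pair found in each. Sample pod data is illustrative.
LOCAL=node1
pods='podA node1 ns1
podB node1 ns1
podC node1 ns2
podD node2 ns1
podE node2 ns3'

printf '%s\n' "$pods" | awk -v local="$LOCAL" '
{ name[NR] = $1; node[NR] = $2; ns[NR] = $3 }
END {
    for (i = 1; i <= NR; i++) {
        if (node[i] != local) continue          # sender must be local
        for (j = 1; j <= NR; j++) {
            if (i == j) continue
            peer = (node[j] == local) ? "local" : "remote"
            same = (ns[j] == ns[i]) ? "same-ns" : "diff-ns"
            key = peer "/" same
            if (!(key in pair)) pair[key] = name[i] " -> " name[j]
        }
    }
    for (k in pair) print k ": " pair[k]
}' | sort
```

With the sample data this yields one pair per category; with fewer pods (e.g. no remote pods), only the representable categories appear, matching the "as many as it's possible to show" behavior described above.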
(To test in the vagrant setup, as root:
)