juju / juju-crashdump

Script to assist in gathering logs and other debugging info from a Juju model
MIT License

juju crashdump sometimes tries to use the wrong ip #60

Open jhobbs opened 4 years ago

jhobbs commented 4 years ago

Not all of a machine's IP addresses are routable - sometimes juju-crashdump picks the "internal" one. Like juju itself, it should try them all until one works.
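A minimal sketch of the suggested behavior (the helper names here are hypothetical, not juju-crashdump's actual API): probe each candidate address and keep the first one that answers on the SSH port, instead of assuming the first listed address is routable.

```python
import socket


def is_ssh_reachable(addr, port=22, timeout=5):
    """Return True if a TCP connection to addr:port succeeds within timeout."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(addresses, probe=is_ssh_reachable):
    """Return the first address the probe accepts, or None if none respond."""
    for addr in addresses:
        if probe(addr):
            return addr
    return None
```

A helper like `first_reachable` could then be called with all of a machine's known addresses before building the ssh/scp command lines, rather than hard-coding a single address.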

dasm commented 3 years ago

I just hit exactly the same problem. I noticed one of my machines in an error state, so I wanted to investigate what had happened.

ubuntu@dasm-bastion:~/.local/share/juju$ juju status
Model         Controller                   Cloud/Region             Version  SLA          Timestamp
dasm  dasm-serverstack  serverstack/serverstack  2.8.9    unsupported  15:44:35Z

App                     Version  Status  Scale  Charm                   Store       Rev  OS      Notes
ceph-mon                15.2.8   active      3  ceph-mon                jujucharms   53  ubuntu
ceph-osd                15.2.8   active      3  ceph-osd                jujucharms  308  ubuntu
ceph-radosgw            15.2.8   active      1  ceph-radosgw            jujucharms  294  ubuntu
cinder                  17.0.1   active      1  cinder                  jujucharms  308  ubuntu  exposed
cinder-ceph             17.0.1   active      1  cinder-ceph             jujucharms  260  ubuntu
cinder-mysql-router     8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
dashboard-mysql-router  8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
glance                  21.0.0   active      1  glance                  jujucharms  303  ubuntu  exposed
glance-mysql-router     8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
keystone                18.0.0   active      1  keystone                jujucharms  321  ubuntu  exposed
keystone-mysql-router   8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
mysql-innodb-cluster    8.0.23   active      3  mysql-innodb-cluster    jujucharms    5  ubuntu
neutron-api             17.0.0   active      1  neutron-api             jujucharms  292  ubuntu  exposed
neutron-api-plugin-ovn  17.0.0   active      1  neutron-api-plugin-ovn  jujucharms    4  ubuntu
neutron-mysql-router    8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
nova-cloud-controller   22.0.1   active      1  nova-cloud-controller   jujucharms  352  ubuntu  exposed
nova-compute            22.0.1   active      3  nova-compute            jujucharms  325  ubuntu
nova-mysql-router       8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
ntp                     3.5      active      3  ntp                     jujucharms   44  ubuntu
openstack-dashboard     18.6.1   active      1  openstack-dashboard     jujucharms  311  ubuntu  exposed
ovn-central             20.03.1  active      3  ovn-central             jujucharms    5  ubuntu
ovn-chassis             20.03.1  active      3  ovn-chassis             jujucharms   10  ubuntu
placement               4.0.0    active      1  placement               jujucharms   17  ubuntu
placement-mysql-router  8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
rabbitmq-server         3.8.2    active      1  rabbitmq-server         jujucharms  108  ubuntu
vault                   1.5.4    error       1  vault                   jujucharms   44  ubuntu
vault-mysql-router      8.0.23   active      1  mysql-router            jujucharms    6  ubuntu

Unit                         Workload  Agent  Machine  Public address  Ports              Message
ceph-mon/0                   active    idle   0        10.5.0.4                           Unit is ready and clustered
ceph-mon/1                   active    idle   1        10.5.0.5                           Unit is ready and clustered
ceph-mon/2*                  active    idle   2        10.5.0.15                          Unit is ready and clustered
ceph-osd/0                   active    idle   3        10.5.0.46                          Unit is ready (1 OSD)
ceph-osd/1*                  active    idle   4        10.5.0.27                          Unit is ready (1 OSD)
ceph-osd/2                   active    idle   5        10.5.0.12                          Unit is ready (1 OSD)
ceph-radosgw/0*              active    idle   6        10.5.0.10       80/tcp             Unit is ready
cinder/0*                    active    idle   7        10.5.0.38       8776/tcp           Unit is ready
  cinder-ceph/0*             active    idle            10.5.0.38                          Unit is ready
  cinder-mysql-router/0*     active    idle            10.5.0.38                          Unit is ready
glance/0*                    active    idle   8        10.5.0.30       9292/tcp           Unit is ready
  glance-mysql-router/0*     active    idle            10.5.0.30                          Unit is ready
keystone/0*                  active    idle   9        10.5.0.28       5000/tcp           Unit is ready
  keystone-mysql-router/0*   active    idle            10.5.0.28                          Unit is ready                                                                                                       
mysql-innodb-cluster/0       active    idle   10       10.5.0.21                          Unit is ready: Mode: R/W
mysql-innodb-cluster/1       active    idle   11       10.5.0.44                          Unit is ready: Mode: R/O
mysql-innodb-cluster/2*      active    idle   12       10.5.0.7                           Unit is ready: Mode: R/O
neutron-api/0*               active    idle   13       10.5.0.9        9696/tcp           Unit is ready
  neutron-api-plugin-ovn/0*  active    idle            10.5.0.9                           Unit is ready  
  neutron-mysql-router/0*    active    idle            10.5.0.9                           Unit is ready
nova-cloud-controller/0*     active    idle   14       10.5.0.18       8774/tcp,8775/tcp  Unit is ready
  nova-mysql-router/0*       active    idle            10.5.0.18                          Unit is ready             
nova-compute/0*              active    idle   15       10.5.0.17                          Unit is ready                   
  ntp/0*                     active    idle            10.5.0.17       123/udp            chrony: Ready
  ovn-chassis/0*             active    idle            10.5.0.17                          Unit is ready
nova-compute/1               active    idle   16       10.5.0.6                           Unit is ready
  ntp/1                      active    idle            10.5.0.6        123/udp            chrony: Ready
  ovn-chassis/1              active    idle            10.5.0.6                           Unit is ready
nova-compute/2               active    idle   17       10.5.0.11                          Unit is ready
  ntp/2                      active    idle            10.5.0.11       123/udp            chrony: Ready
  ovn-chassis/2              active    idle            10.5.0.11                          Unit is ready
openstack-dashboard/0*       active    idle   18       10.5.0.20       80/tcp,443/tcp     Unit is ready
  dashboard-mysql-router/0*  active    idle            10.5.0.20                          Unit is ready 
ovn-central/0*               active    idle   19       10.5.0.45       6641/tcp,6642/tcp  Unit is ready (leader: ovnnb_db, ovnsb_db)
ovn-central/1                active    idle   20       10.5.0.14       6641/tcp,6642/tcp  Unit is ready
ovn-central/2                active    idle   21       10.5.0.24       6641/tcp,6642/tcp  Unit is ready (northd: active)
placement/0*                 active    idle   22       10.5.0.52       8778/tcp           Unit is ready 
  placement-mysql-router/0*  active    idle            10.5.0.52                          Unit is ready
rabbitmq-server/0*           active    idle   23       10.5.0.22       5672/tcp           Unit is ready 
vault/0*                     error     idle   24       10.5.0.8        8200/tcp           hook failed: "update-status"
  vault-mysql-router/0*      active    idle            10.5.0.8                           Unit is ready

Machine  State    DNS        Inst id                               Series  AZ    Message       
0        started  10.5.0.4   af868f48-6848-4708-9d9e-f3f9074758d9  focal   nova  ACTIVE        
1        started  10.5.0.5   8013182f-2505-477e-8b43-a3ea656cc728  focal   nova  ACTIVE                 
2        started  10.5.0.15  41127c94-16b8-455a-9500-6922da577990  focal   nova  ACTIVE        
3        started  10.5.0.46  f5405e8f-fb35-4b0d-aa96-4e41d1152904  focal   nova  ACTIVE        
4        started  10.5.0.27  d9e4d129-0a9d-48a1-bf51-17dd00488410  focal   nova  ACTIVE        
5        started  10.5.0.12  aa9c16e9-abfb-4d40-9e0d-061f3bf7140c  focal   nova  ACTIVE                 
6        started  10.5.0.10  42d02804-2ee9-4066-a12d-9ea2e733c3bb  focal   nova  ACTIVE        
7        started  10.5.0.38  52246ec1-5705-4f64-9758-2bf27675a09a  focal   nova  ACTIVE        
8        started  10.5.0.30  16ceb443-623b-41c9-ac57-7bbe8a8eaab6  focal   nova  ACTIVE        
9        started  10.5.0.28  3533b6c9-95b7-4314-953f-3d37e83e66b7  focal   nova  ACTIVE        
10       started  10.5.0.21  d768edb6-a506-4f46-ac3b-7a263b9cd9ec  focal   nova  ACTIVE        
11       started  10.5.0.44  2e6b2b11-eb76-4a85-be96-39cda8e775db  focal   nova  ACTIVE        
12       started  10.5.0.7   0442f43e-f1f8-40a5-a9f4-0fd15eda99a4  focal   nova  ACTIVE        
13       started  10.5.0.9   47c5b116-4121-4d37-bdc2-bc5024db1452  focal   nova  ACTIVE
14       started  10.5.0.18  457cc00f-f6f7-4d24-9721-9f6eb8202e73  focal   nova  ACTIVE          
15       started  10.5.0.17  4718bf65-c570-448f-a119-8ab3f9d4f697  focal   nova  ACTIVE                              
16       started  10.5.0.6   8d6e9f88-1756-4ea4-9afc-5f9328e6256e  focal   nova  ACTIVE                              
17       started  10.5.0.11  f20e619c-86d2-456c-9205-eae0c3e9d9f4  focal   nova  ACTIVE                              
18       started  10.5.0.20  8626ab19-03d2-40f9-ab01-6763043b7f85  focal   nova  ACTIVE                        
19       started  10.5.0.45  217f5495-b46e-4315-9f88-b13503eb0009  focal   nova  ACTIVE                        
20       started  10.5.0.14  ec3c8ae9-b4df-4437-9295-5b775019a16a  focal   nova  ACTIVE                        
21       started  10.5.0.24  2c421863-5626-4e38-9326-63d498d927ce  focal   nova  ACTIVE                
22       started  10.5.0.52  4d623f88-e57d-403f-bd6b-a13cb9fe885a  focal   nova  ACTIVE                
23       started  10.5.0.22  adf28f27-1881-400a-b221-16d5b8efcd8b  focal   nova  ACTIVE                
24       started  10.5.0.8   0e8ce15e-082a-4620-b599-ce05f2e6c02d  focal   nova  ACTIVE                

juju-crashdump tried to access data from 10.5.0.42, an address that doesn't exist in my pool of machines (no machine in the status output above has it).

ubuntu@dasm-bastion:~$ juju crashdump -s                                    
2021-03-22 15:36:39,700 - juju-crashdump started.                                      
2021-03-22 15:36:56,854 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa ubuntu@10.5.0.42 sudo 'mkdir -p /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output;sud
o netstat -taupn | grep LISTEN 2>/dev/null | sudo tee /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output/listening.txt || true'" failed
2021-03-22 15:36:59,925 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa ubuntu@10.5.0.42 sudo 'mkdir -p /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output;sud
o ps aux | sudo tee /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output/psaux.txt || true'" failed
2021-03-22 15:37:05,557 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa ubuntu@10.5.0.42 sudo 'sudo find /etc/alternatives /etc/ceilometer /etc/ceph /etc/cinder
 /etc/cloud /etc/glance /etc/gnocchi /etc/keystone /etc/netplan /etc/network /etc/neutron /etc/nova /etc/quantum /etc/swift /etc/udev/rules.d /lib/udev/rules.d /opt/nedge/var/log /run/cloud-init /usr/share/
lxc/config /var/lib/charm /var/lib/libvirt/filesystems/plumgrid-data/log /var/lib/libvirt/filesystems/plumgrid/var/log /var/lib/cloud/seed /var/log /var/snap/simplestreams/common/sstream-mirror-glance.log /
var/crash /var/snap/juju-db/common/logs/ /var/lib/mysql/*-mysql-router /tmp/juju-exec*/script.sh /var/lib/lxd/containers/*/rootfs/etc/alternatives /var/lib/lxd/containers/*/rootfs/etc/ceilometer /var/lib/lx
d/containers/*/rootfs/etc/ceph /var/lib/lxd/containers/*/rootfs/etc/cinder /var/lib/lxd/containers/*/rootfs/etc/cloud /var/lib/lxd/containers/*/rootfs/etc/glance /var/lib/lxd/containers/*/rootfs/etc/gnocchi
 /var/lib/lxd/containers/*/rootfs/etc/keystone /var/lib/lxd/containers/*/rootfs/etc/netplan /var/lib/lxd/containers/*/rootfs/etc/network /var/lib/lxd/containers/*/rootfs/etc/neutron /var/lib/lxd/containers/
*/rootfs/etc/nova /var/lib/lxd/containers/*/rootfs/etc/quantum /var/lib/lxd/containers/*/rootfs/etc/swift /var/lib/lxd/containers/*/rootfs/etc/udev/rules.d /var/lib/lxd/containers/*/rootfs/lib/udev/rules.d
/var/lib/lxd/containers/*/rootfs/opt/nedge/var/log /var/lib/lxd/containers/*/rootfs/run/cloud-init /var/lib/lxd/containers/*/rootfs/usr/share/lxc/config /var/lib/lxd/containers/*/rootfs/var/lib/charm /var/l
ib/lxd/containers/*/rootfs/var/lib/libvirt/filesystems/plumgrid-data/log /var/lib/lxd/containers/*/rootfs/var/lib/libvirt/filesystems/plumgrid/var/log /var/lib/lxd/containers/*/rootfs/var/lib/cloud/seed /va
r/lib/lxd/containers/*/rootfs/var/log /var/lib/lxd/containers/*/rootfs/var/snap/simplestreams/common/sstream-mirror-glance.log /var/lib/lxd/containers/*/rootfs/var/crash /var/lib/lxd/containers/*/rootfs/var
/snap/juju-db/common/logs/ /var/lib/lxd/containers/*/rootfs/var/lib/mysql/*-mysql-router /var/lib/lxd/containers/*/rootfs/tmp/juju-exec*/script.sh -mount -type f -size -5000000c -o -size 5000000c 2>/dev/nul
l | sudo tar -pcf /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar --files-from - 2>/dev/null;sudo tar --append -f /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8
786-ea3743c76897/cmd_output . || true;sudo tar --append -f /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8786-ea3743c76897/ journalctl || true;sudo tar --append -f /tmp/
juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8786-ea3743c76897/addon_output . || true'" failed
2021-03-22 15:37:29,334 - Command "scp -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa ubuntu@10.5.0.42:/tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar 20f8e194-7d03-40c1-a727-fa1
29df68be9.tar" failed                            
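One way to check which addresses juju actually knows for each machine is to read `juju status --format=json` and collect the per-machine address lists. A sketch, assuming the standard status JSON layout where each entry under "machines" carries an "ip-addresses" list:

```python
import json


def machine_addresses(status):
    """Map machine id -> list of all known IP addresses from juju status JSON."""
    return {
        mid: machine.get("ip-addresses", [])
        for mid, machine in status.get("machines", {}).items()
    }


# Example with a fragment shaped like `juju status --format=json` output:
sample = json.loads('{"machines": {"0": {"ip-addresses": ["10.5.0.4"]}}}')
print(machine_addresses(sample))
```

Feeding the full list into a reachability check (rather than one address) would have revealed whether 10.5.0.42 was ever associated with any machine at the time of the run.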

When I tried to gather extra logs to analyze its behavior, I reran the command with debug logging (juju crashdump -l debug -s), but this time there was no problem and everything worked.

ubuntu@dasm-bastion:~/.local/share/juju$ juju crashdump -l debug -s                                                                                                                                
2021-03-22 15:41:26,123 - juju-crashdump started.                                                                                                                                                             
2021-03-22 15:41:26,125 - Calling juju version                                                                                                                                                                
2021-03-22 15:41:26,233 - Returned from juju version                                                                                                                                                          
2021-03-22 15:41:26,234 - Calling juju switch                                                                                                                                                                 
2021-03-22 15:41:26,325 - Returned from juju switch                                                                                                                                                           
2021-03-22 15:41:26,326 - Calling juju  status --format=yaml                                                                                                                                                  
2021-03-22 15:41:27,240 - Returned from juju  status --format=yaml                                                                                                                                            
2021-03-22 15:41:27,241 - Calling juju  status --format=tabular --relations --storage                                                                                                                         
2021-03-22 15:41:28,113 - Returned from juju  status --format=tabular --relations --storage                                                                                                                   
2021-03-22 15:41:28,113 - Calling juju debug-log --date --replay --no-tail                                                                                                                                    
2021-03-22 15:41:35,741 - Returned from juju debug-log --date --replay --no-tail                                                                                                                              
2021-03-22 15:41:35,745 - Calling juju model-config --format=yaml                                                                                                                                             
2021-03-22 15:41:35,946 - Returned from juju model-config --format=yaml                                                                                                                                       
2021-03-22 15:41:35,946 - Calling juju storage --format=yaml                                                                                                                                                  
2021-03-22 15:41:36,586 - Returned from juju storage --format=yaml                                                                                                                                            
2021-03-22 15:41:36,587 - Calling juju storage-pools --format=yaml                                                                                                                                            
2021-03-22 15:41:37,188 - Returned from juju storage-pools --format=yaml                                                                                                                                      
[...]
2021-03-22 15:44:14,377 - Returned from tar -pacf juju-crashdump-ff274cbe-b25a-40d4-aebc-05cf32c30796.tar.xz * 2>/dev/null                                                                                    
2021-03-22 15:44:14,784 - juju-crashdump finished.