F5Networks / f5-google-gdm-templates

Google Deployment Templates for quickly deploying BIG-IP services in Google Cloud Platform

Cannot connect to the external IP on management interface (NIC1) after a BIG-IP reboot #33

Closed irgoncalves closed 4 years ago

irgoncalves commented 4 years ago

After rebooting a BIG-IP, one can no longer connect to or ping the management interface (NIC1) on the external mgmt IP address.

Template used: Version 3.2.0 https://github.com/F5Networks/f5-google-gdm-templates/tree/master/supported/failover/same-net/via-api/3nic/existing-stack/byol

Severity: 2

shyawnkarim commented 4 years ago

Thanks for reporting this issue to us. I am getting the same behavior after rebooting as well.

I've gone ahead and filed a Jira bug (#1747) with our engineering team to get this fixed.

JeffGiroux commented 4 years ago

I'm running into this issue as well.

  1. start with known working deployment
  2. validate ssh works, test failover and other stuff you want to test
  3. shutdown instances
  4. start up instances
  5. cannot SSH

workaround

I can log in to an existing box on the same network as my NIC1, then SSH to the F5 mgmt NIC1 private IP, and that works. The public IP mapping, or something else I haven't figured out yet, is not working post reboot.

JeffGiroux commented 4 years ago

More troubleshooting... After performing the workaround I was on the box, but only by accessing the private mgmt IP from a VM on the same net. Public IP access didn't work. I thought it was due to the ephemeral public IP (pip) assignment, so I made a new static public IP, but that didn't resolve the issue.

So...now I'm on the local CLI via the private mgmt IP. I start tcpdump and then attempt an SSH to the public IP. All I get is SYN, SYN, SYN.

02:11:24.603585 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 798191662 ecr 0,sackOK,eol], length 0
02:11:25.609220 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 798192662 ecr 0,sackOK,eol], length 0
02:11:26.609347 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 798193662 ecr 0,sackOK,eol], length 0
02:11:27.613526 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 798194662 ecr 0,sackOK,eol], length 0

JeffGiroux commented 4 years ago

A working (not rebooted yet) F5 and its tcpdump for comparison...

my home IP = x.x.x.x (scrubbed)
mgmt IP = 10.1.1.27

Complete SYN-SYNACK-ACK

02:19:42.415203 IP x.x.x.x.52845 > 10.1.1.27.22: Flags [S], seq 3608111598, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 798686373 ecr 0,sackOK,eol], length 0
02:19:42.415303 IP 10.1.1.27.22 > x.x.x.x.52845: Flags [S.], seq 937642139, ack 3608111599, win 28160, options [mss 1420,sackOK,TS val 1820951 ecr 798686373,nop,wscale 7], length 0
02:19:42.463730 IP x.x.x.x.52845 > 10.1.1.27.22: Flags [.], ack 1, win 2068, options [nop,nop,TS val 798686420 ecr 1820951], length 0
02:19:42.468571 IP x.x.x.x.52845 > 10.1.1.27.22: Flags [P.], seq 1:22, ack 1, win 2068, options [nop,nop,TS val 798686420 ecr 1820951], length 21
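
For anyone comparing captures like these, the difference can be checked mechanically. The following is a minimal sketch, assuming tcpdump's default one-line text output. The function name `handshake_state` and its return labels are hypothetical, and the sample lines are trimmed versions of the packets posted in this thread (options removed).

```python
import re

# Matches the "Flags [...]" field in tcpdump's default text output.
FLAGS_RE = re.compile(r"Flags \[([^\]]+)\]")


def handshake_state(lines):
    """Classify one TCP connection attempt from its sequence of flags.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    flags = []
    for line in lines:
        match = FLAGS_RE.search(line)
        if match:
            flags.append(match.group(1))
    if not flags:
        return "no packets"
    if "S." in flags:
        # A SYN-ACK was seen; a later ACK (bare or with data) completes
        # the three-way handshake.
        later = flags[flags.index("S.") + 1:]
        if any(f in (".", "P.") for f in later):
            return "established"
        return "syn-ack only"
    if all(f == "S" for f in flags):
        return "syn only"
    return "other"


# Trimmed from the capture on the rebooted box (10.1.1.20).
REBOOTED = [
    "02:11:24.603585 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, length 0",
    "02:11:25.609220 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, length 0",
    "02:11:26.609347 IP x.x.x.x.52753 > 10.1.1.20.22: Flags [S], seq 2288655172, length 0",
]

# Trimmed from the capture on the working box (10.1.1.27).
WORKING = [
    "02:19:42.415203 IP x.x.x.x.52845 > 10.1.1.27.22: Flags [S], seq 3608111598, length 0",
    "02:19:42.415303 IP 10.1.1.27.22 > x.x.x.x.52845: Flags [S.], seq 937642139, ack 3608111599, length 0",
    "02:19:42.463730 IP x.x.x.x.52845 > 10.1.1.27.22: Flags [.], ack 1, length 0",
]

print(handshake_state(REBOOTED))
print(handshake_state(WORKING))
```

A "syn only" result means the SYNs arrive on the interface being captured but no reply leaves through it, which is what the rebooted box shows.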

JeffGiroux commented 4 years ago

Final bit of troubleshooting...looks like the default mgmt route disappeared. Hope that helps you in your digging.

[admin@bigip1-jg-f5-api-ha2:Standby:In Sync] ~ # tmsh list sys management-route
sys management-route dhclient_route2 {
    description configured-by-dhcp
    gateway 10.1.1.1
    network 10.1.1.0/24
}
sys management-route dhclient_route1 {
    description configured-by-dhcp
    network 10.1.1.1/32
    type interface
}
sys management-route default {
    gateway 10.1.1.1
    network default
}

But...notice it's missing from the netstat output below. The only default route (0.0.0.0) goes via 10.1.10.1 on the external interface.

[admin@bigip1-jg-f5-api-ha2:Standby:In Sync] ~ # netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.1.10.1       0.0.0.0         UG        0 0          0 external
10.1.1.0        10.1.1.1        255.255.255.0   UG        0 0          0 mgmt
10.1.1.1        0.0.0.0         255.255.255.255 UH        0 0          0 mgmt
10.1.10.0       10.1.10.1       255.255.255.0   UG        0 0          0 external
10.1.10.1       0.0.0.0         255.255.255.255 UH        0 0          0 external
10.1.20.0       10.1.20.1       255.255.255.0   UG        0 0          0 internal
10.1.20.1       0.0.0.0         255.255.255.255 UH        0 0          0 internal
127.1.1.0       0.0.0.0         255.255.255.0   U         0 0          0 tmm
127.7.0.0       127.1.1.253     255.255.0.0     UG        0 0          0 tmm
127.20.0.0      0.0.0.0         255.255.0.0     U         0 0          0 tmm_bp
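
A quick way to spot this condition from a saved `netstat -rn` dump is to list which interfaces carry a default route. This is a rough sketch, assuming the eight-column Linux `netstat -rn` layout shown in this thread. The function name `default_route_ifaces` is hypothetical, and the sample is the first few rows of the output above.

```python
def default_route_ifaces(netstat_output):
    """Return the interfaces that carry a default (0.0.0.0/0) route.

    Hypothetical helper; assumes the columns
    Destination Gateway Genmask Flags MSS Window irtt Iface.
    """
    ifaces = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the prompt, title and header lines; keep only route rows
        # whose destination and netmask are both 0.0.0.0.
        if len(fields) == 8 and fields[0] == "0.0.0.0" and fields[2] == "0.0.0.0":
            ifaces.append(fields[7])
    return ifaces


# First rows of the netstat output posted above.
SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.1.10.1       0.0.0.0         UG        0 0          0 external
10.1.1.0        10.1.1.1        255.255.255.0   UG        0 0          0 mgmt
10.1.1.1        0.0.0.0         255.255.255.255 UH        0 0          0 mgmt
"""

print(default_route_ifaces(SAMPLE))  # ['external']
```

On the rebooted box the result does not include `mgmt`, even though `tmsh list sys management-route` still shows a default management route with gateway 10.1.1.1.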

shyawnkarim commented 4 years ago

Closing. This issue was fixed with release 3.3.0.

Goobaroo commented 3 years ago

This still happens in release 3.7.0