F5Networks / f5-google-gdm-templates

Google Deployment Templates for quickly deploying BIG-IP services in Google Cloud Platform

Management applications (ssh/https) start up on Nic0 (data) and not Nic1 (mgmt) in 3.3.0 template #42

Closed: bitraider5813 closed this issue 4 years ago

bitraider5813 commented 4 years ago

Do you already have an issue opened with F5 support? No

GitHub Issues are consistently monitored by F5 staff, but should be considered best effort only, and you should not expect to receive the same level of response as provided by F5 Support. Please open a case with F5 if this is a critical issue.

Description

Management applications (ssh/https) start up on Nic0 (data) and not Nic1 (mgmt) when using template version 3.3.0.

Attempted using these two images, with the same issue:

f5-bigip-15-0-1-0-0-11-byol-all-modules-2boot-loc-190803012348
f5-bigip-15-0-1-1-0-0-3-byol-all-modules-2boot-loc-191118
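One way to confirm the symptom from a host that can reach both networks is to probe the ssh and https ports on each interface address. The following is a minimal sketch, not part of the template: the IP addresses are hypothetical placeholders, the helper names `port_open` and `probe` are made up for illustration, and it assumes the management services use the default ports 22 and 443.

```python
import socket

# Hypothetical addresses: replace with the instance's real nic0 and nic1 IPs.
INTERFACES = {"nic0 (data)": "10.0.0.2", "nic1 (mgmt)": "10.0.1.2"}
# Assumed default ports for the management services.
PORTS = {"ssh": 22, "https": 443}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(interfaces=INTERFACES, ports=PORTS, timeout=3.0):
    """Map each (interface label, service) pair to whether its port accepts connections."""
    return {
        (label, service): port_open(host, port, timeout)
        for label, host in interfaces.items()
        for service, port in ports.items()
    }
```

Calling `probe()` and printing the result shows, per interface, whether ssh and https answer. If the services answer on the nic0 address but not on the nic1 address, that matches the behavior described here.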

Template

For bugs, enter the template with which you are experiencing issues below.

3.3.0

Severity Level

For bugs, enter the bug severity level. Do not set any labels.

Severity: 1

Severity level definitions:

  1. Severity 1 (Critical) : Defect is causing systems to be offline and/or nonfunctional. Immediate attention is required.
  2. Severity 2 (High) : Defect is causing major obstruction of system operations.
  3. Severity 3 (Medium) : Defect is causing intermittent errors in system operations.
  4. Severity 4 (Low) : Defect is causing infrequent interruptions in system operations.
  5. Severity 5 (Trivial) : Defect is not causing any interruptions to system operations, but is nonetheless a bug.

JeffGiroux commented 4 years ago

Try instance size n1-standard-8 instead of n1-standard-4.

Also provide some logs from /var/log/cloud/gcp
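If it helps, the log directory can be bundled into a single zip for attaching to the issue. This is only a sketch: it assumes Python 3 is available wherever the logs are (otherwise copy the directory off the instance first), and the function name `archive_logs` and the default archive name are hypothetical.

```python
import shutil
from pathlib import Path


def archive_logs(log_dir="/var/log/cloud/gcp", archive_base="cloud-logs"):
    """Zip the contents of log_dir and return the path of the archive written."""
    src = Path(log_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"log directory not found: {src}")
    # make_archive appends ".zip" to archive_base and returns the full path.
    return shutil.make_archive(archive_base, "zip", root_dir=str(src))
```

Calling `archive_logs()` with the defaults writes `cloud-logs.zip` in the current working directory.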

xags commented 4 years ago

Same problem; see the attached logs: cloud-logs.zip

xags commented 4 years ago

Increasing the instance to n1-standard-8 does solve the problem. Why is this necessary?

f5-applebaum commented 4 years ago

Unfortunately, this is due to a bug (bz742628-2: tmsh startup is slowing down with more libclischema libraries) that should be fixed in upcoming releases of VE (15.0.1.2, targeted for around the end of March). It only affects CPU load on initial startup, but that obviously affects the initial orchestration here, and the symptoms are more pronounced with less compute.

asaphef commented 4 years ago

Is there a workaround in case using a bigger instance doesn't work? After the reboot in collect-interface.sh, we cannot connect to the instance via either interface.

gwolfis commented 4 years ago

I just ran into the same issue and can confirm that changing the instance from n1-standard-4 to n1-standard-8 does not work around it.

irgoncalves commented 4 years ago

This also affects BIG-IP 14.x. n1-standard-8 seems to work fine; however, sometimes after a reboot I see the VM in a stuck state (with CPU lock-up messages). I powered it off and started it again, and it worked without problems. Update: the need to power off and power on seemed to be related to GCP. I could not reproduce the issue in other regions, and after one day the issue was completely gone in the region where I originally deployed it.

shyawnkarim commented 4 years ago

Is anyone else still experiencing issues with this?

shyawnkarim commented 4 years ago

Closing.

If anyone is still experiencing trouble with this, please feel free to reopen this issue.