cloud-native-toolkit / terraform-gitops-ibm-portworx

Module to populate a gitops repository with the resources required to provision Portworx in an OpenShift cluster
Apache License 2.0

Portworx deployments are failing #26

Closed triceam closed 1 year ago

triceam commented 1 year ago

Portworx deployments on IBM Cloud are failing due to a recent change by Portworx. The failure can be reproduced both with our automation and when the cluster and Portworx deployment are created manually. Deploying Portworx onto a brand new cluster results in:

  1. Portworx never becomes available
  2. The cluster workers end up in a "Disk Pressure" state

To recreate:

  1. Create an OpenShift 4.10 VPC cluster. (I have tried both RHEL7 and RHEL8 workers; the result is the same in both cases.)
  2. Create a Portworx Enterprise service instance pointed at this cluster (use either this module, or create it manually from the IBM Cloud catalog). Select the internal KVDB database option.
  3. The Portworx service shows that it deployed successfully. (screenshot)
  4. However, the Portworx deployment inside the cluster never becomes healthy. None of the pods in the portworx and portworx-api DaemonSets enter a healthy state, and both events and logs show failures. It looks like the Portworx deployment tries to self-update and fails. This also consumes disk space on the cluster and sends the workers into a "Disk Pressure" warning state. (screenshots)
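
For anyone trying to confirm the same two symptoms without screenshots, here is a minimal Python sketch that checks them from `kubectl ... -o json` output. It relies only on standard Kubernetes status fields (the `DiskPressure` node condition, and `desiredNumberScheduled` / `numberReady` on DaemonSets). The helper names and the `kube-system` namespace are assumptions for illustration, not something stated in this issue; adjust them to wherever Portworx is installed on your cluster.

```python
import json
import subprocess


def nodes_with_disk_pressure(node_list):
    """Return names of nodes whose DiskPressure condition is "True"."""
    flagged = []
    for node in node_list.get("items", []):
        for cond in node.get("status", {}).get("conditions", []):
            if cond.get("type") == "DiskPressure" and cond.get("status") == "True":
                flagged.append(node["metadata"]["name"])
    return flagged


def unready_daemonsets(ds_list):
    """Return (name, ready, desired) for DaemonSets with fewer ready pods than desired."""
    result = []
    for ds in ds_list.get("items", []):
        status = ds.get("status", {})
        desired = status.get("desiredNumberScheduled", 0)
        ready = status.get("numberReady", 0)
        if ready < desired:
            result.append((ds["metadata"]["name"], ready, desired))
    return result


def kubectl_json(*args):
    """Run kubectl with -o json and parse the result (needs cluster access)."""
    proc = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(proc.stdout)


def main():
    # Assumption: the portworx and portworx-api DaemonSets live in kube-system.
    print("Nodes under disk pressure:", nodes_with_disk_pressure(kubectl_json("get", "nodes")))
    print("Unready DaemonSets:", unready_daemonsets(kubectl_json("get", "daemonsets", "-n", "kube-system")))
```

Calling `main()` from a shell that is logged in to the cluster prints both lists; the two parsing functions can also be fed saved JSON if you only have a dump of the cluster state.
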
triceam commented 1 year ago

Support case created: https://cloud.ibm.com/unifiedsupport/cases?number=CS3084464

triceam commented 1 year ago

The disk pressure status appears to be related to a regression in ROKS, which is separate from the Portworx issue. Details/discussion on the disk pressure condition in Slack at: https://ibm-cloudplatform.slack.com/archives/CJH0UPN2D/p1666900631736399?thread_ts=1666896501.625019&cid=CJH0UPN2D

triceam commented 1 year ago

Update:

triceam commented 1 year ago

I started a thread on the Portworx forums here: https://forums.portworx.com/t/portworx-failing-on-openshift-4-10-on-ibm-cloud/1478

triceam commented 1 year ago

I also requested an account on https://pure1.purestorage.com/support so I can create support tickets for Portworx/Pure Storage, but it has to be "approved" before I can submit a ticket.

triceam commented 1 year ago

Response on the forums:

This implies the kernel is not compatible with the Portworx version. I am assuming you are trying to install Portworx 2.11.4 with Cloud Drives option. We will get PWX version updated (within 48 hours).
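
Since the response points at kernel compatibility, it can help to record which kernel each worker is actually running before comparing against what a given Portworx release supports. The sketch below is only an illustration: it groups nodes by the standard `status.nodeInfo.kernelVersion` field from `kubectl get nodes -o json` output. The function name and the sample node data are hypothetical, and it does not encode any Portworx compatibility matrix.

```python
from collections import defaultdict


def kernel_versions_by_node(node_list):
    """Group node names by the kernel version reported in status.nodeInfo."""
    versions = defaultdict(list)
    for node in node_list.get("items", []):
        info = node.get("status", {}).get("nodeInfo", {})
        kernel = info.get("kernelVersion", "unknown")
        versions[kernel].append(node["metadata"]["name"])
    return dict(versions)


# Hypothetical sample in the shape of `kubectl get nodes -o json` output.
sample_nodes = {"items": [
    {"metadata": {"name": "worker-a"}, "status": {"nodeInfo": {"kernelVersion": "kernel-x"}}},
    {"metadata": {"name": "worker-b"}, "status": {"nodeInfo": {"kernelVersion": "kernel-x"}}},
    {"metadata": {"name": "worker-c"}, "status": {"nodeInfo": {"kernelVersion": "kernel-y"}}},
]}

for kernel, names in kernel_versions_by_node(sample_nodes).items():
    print(kernel, names)
```

The resulting kernel strings are what you would quote in a forum post or support case when asking whether a specific Portworx version supports them.
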

triceam commented 1 year ago

Just confirmed Portworx is working again on IBM Cloud, based on the update in the Portworx forums thread above.