TestRail 2.6.0 milestone images are uploaded. 1 test failure because of 3 low priority/severity tickets - PB-4363, PB-4806, PB-4805
Compatibility Matrix for IKS/ROKS:
IKS: 1.27.0, 1.28.4
ROKS: 4.12.44, 4.13.13
@trenukarya-px , the following are the supported IKS and ROKS versions:
IKS versions: 1.26, 1.27, 1.28
ROKS versions: 4.12, 4.13, 4.14
Can you please let us know the plan to support the remaining IKS/ROKS versions, i.e. IKS 1.26 and ROKS 4.14?
If a user is on these IKS/ROKS versions, how will they use PX-Backup? Will the catalog allow px-backup instance creation or not?
cc @ambiknai
Also, PX-Backup 2.5.1 was supported on these versions. As 2.6.0 is not supported on them, how can a user deploy px-backup on those clusters?
@ambiknai We got a note for only these validations from Vipin Panavil Kallat as an IBM request; hence we considered these for 2.6.0. We can consider 4.14 in the next version of PX-Backup. Also, we support only N-2 versions for any K8s flavor. Customers have to upgrade their K8s versions to use PX-Backup 2.6.0.
Cc: @kshithijiyer-px
@trenukarya-px , can you please share a doc link where this supported N-2 versions policy is mentioned? I mean, is there any way a user can see what is supported and what is not?
@arahamad @ambiknai IKS 1.26 is also qualified.
https://docs.portworx.com/portworx-backup-on-prem/install/install-prereq captures the compatibility matrix. Currently it is not limited to N-2, but going forward it will be N-2 for all supported versions.
I tried to execute torpedo tests on different IKS and ROKS cluster versions, and it looks like they failed on all clusters.
@trenukarya-px and @ambiknai , can you please check the results? torpedo-out_for_IKS_1.26.11.txt torpedo-out_for_IKS_1.27.8.txt torpedo-out_for_IKS_1.28.4.txt torpedo-out_for_IKS_1.29.0.txt torpedo-out_for_ROKS_4.12.44.txt torpedo-out_for_ROKS_4.13.13.txt torpedo-out_for_ROKS_4.14.6.txt
@trenukarya-px I tried IKS 1.28.4, and this is what I see:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-provisioner: vpc.block.csi.ibm.io
volume.kubernetes.io/storage-provisioner: vpc.block.csi.ibm.io
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: postgres-dc55cfcc4-fqkg6
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 2m2s (x8 over 4m9s) vpc.block.csi.ibm.io_ibm-vpc-block-csi-controller-0_35978e61-91d4-46c3-ae0b-9ced3936b953 External provisioner is provisioning volume for claim "postgres-csi-pxb-0-83717-01-11-03h24m24s/postgres-data"
Warning ProvisioningFailed 2m2s (x8 over 4m9s) vpc.block.csi.ibm.io_ibm-vpc-block-csi-controller-0_35978e61-91d4-46c3-ae0b-9ced3936b953 failed to provision volume with StorageClass "postgres-sc": error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
Normal ExternalProvisioning 8s (x18 over 4m9s) persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'vpc.block.csi.ibm.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications % kubectl logs -n kube-system ibm-vpc-block-csi-controller-0 -c iks-vpc-block-driver > iks-vpc-block-driver.txt
ambikanair@Ambikas-MBP customer-notifications % vi iks-vpc-block-driver.txt
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications % kubectl describe sc postgres-sc
Name: postgres-sc
IsDefaultClass: No
Annotations: description=Provides RWO and RWX Filesystem volumes
Provisioner: vpc.block.csi.ibm.io
Parameters: clusterID=openshift-storage,csi.storage.k8s.io/controller-expand-secret-name=rook-csi-cephfs-provisioner,csi.storage.k8s.io/controller-expand-secret-namespace=openshift-storage,csi.storage.k8s.io/node-stage-secret-name=rook-csi-cephfs-node,csi.storage.k8s.io/node-stage-secret-namespace=openshift-storage,csi.storage.k8s.io/provisioner-secret-name=rook-csi-cephfs-provisioner,csi.storage.k8s.io/provisioner-secret-namespace=openshift-storage,fsName=ocs-storagecluster-cephfilesystem
AllowVolumeExpansion: True
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
ambikanair@Ambikas-MBP customer-notifications %
Has something changed on the torpedo side?
@kshithijiyer-px Can you please check this failure?
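For reference: the parameters in the kubectl describe sc output above point the vpc.block.csi.ibm.io provisioner at Ceph/ODF secrets that only exist on clusters running OpenShift Data Foundation. A StorageClass that actually matches the IBM VPC Block driver would look more like this sketch (the profile value is illustrative and depends on the profiles available to the cluster):
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgres-sc
provisioner: vpc.block.csi.ibm.io
parameters:
  profile: general-purpose   # illustrative; pick a profile supported by the VPC Block CSI driver
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
EOF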
@ambiknai From the error in the output you have shared, it looks like the rook-csi-cephfs-provisioner secret is missing in the openshift-storage namespace, which is causing the PVC provisioning to fail.
error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
This looks like a setup/deployment issue on the IKS cluster and not a PX-Backup issue. There are no changes on the torpedo side; please check if there are any platform changes on the IBM Cloud side.
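A quick way to confirm the diagnosis on the affected cluster (standard kubectl commands; the names are taken from the error above):
kubectl get ns openshift-storage
kubectl get secret rook-csi-cephfs-provisioner -n openshift-storage
kubectl get sc postgres-sc -o jsonpath='{.parameters}'
If the first two return NotFound while the StorageClass parameters still reference the rook-csi-cephfs-* secrets, the failure is expected on any cluster without ODF.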
ambikanair@Ambikas-MBP customer-notifications % kubectl config current-context
vpc-us-south-mzaznzg1/cmfkova20ie553c09bj0/admin
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications % kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.240.1.10 Ready <none> 170m v1.28.4+IKS 10.240.1.10 10.240.1.10 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.1.11 Ready <none> 169m v1.28.4+IKS 10.240.1.11 10.240.1.11 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.1.8 Ready <none> 168m v1.28.4+IKS 10.240.1.8 10.240.1.8 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.1.9 Ready <none> 168m v1.28.4+IKS 10.240.1.9 10.240.1.9 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns | grep openshift-storage
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns
NAME STATUS AGE
central Active 162m
default Active 179m
ibm-cert-store Active 169m
ibm-operators Active 179m
ibm-system Active 179m
kube-node-lease Active 179m
kube-public Active 179m
kube-system Active 179m
postgres-csi-pxb-0-58058-01-11-03h40m49s Active 89m
postgres-csi-pxb-0-58075-01-11-03h57m29s Active 72m
postgres-csi-pxb-0-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-0-84755-01-11-03h32m42s Active 97m
postgres-csi-pxb-0-84851-01-11-03h48m55s Active 81m
postgres-csi-pxb-1-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-1-84851-01-11-03h48m55s Active 81m
postgres-csi-pxb-2-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-2-84851-01-11-03h48m55s Active 81m
postgres-csi-pxb-3-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-3-84851-01-11-03h48m55s Active 81m
postgres-csi-pxb-4-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-4-84851-01-11-03h48m55s Active 80m
postgres-csi-pxb-5-83717-01-11-03h24m24s Active 105m
postgres-csi-pxb-5-84851-01-11-03h48m55s Active 80m
postgres-csi-pxb-6-84851-01-11-03h48m55s Active 80m
postgres-csi-pxb-7-84851-01-11-03h48m55s Active 80m
postgres-csi-pxb-8-84851-01-11-03h48m55s Active 80m
postgres-csi-pxb-9-84851-01-11-03h48m55s Active 80m
ambikanair@Ambikas-MBP customer-notifications %
ambikanair@Ambikas-MBP customer-notifications % kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.240.11.48 Ready <none> 56d v1.27.6+IKS 10.240.11.48 10.240.11.48 Ubuntu 20.04.6 LTS 5.4.0-165-generic containerd://1.7.7
10.240.11.62 Ready <none> 10h v1.27.8+IKS 10.240.11.62 10.240.11.62 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.11.7 Ready <none> 61d v1.27.6+IKS 10.240.11.7 10.240.11.7 Ubuntu 20.04.6 LTS 5.4.0-165-generic containerd://1.7.7
10.240.128.23 Ready <none> 60d v1.27.6+IKS 10.240.128.23 10.240.128.23 Ubuntu 20.04.6 LTS 5.4.0-165-generic containerd://1.7.7
10.240.128.25 Ready <none> 60d v1.27.6+IKS 10.240.128.25 10.240.128.25 Ubuntu 20.04.6 LTS 5.4.0-165-generic containerd://1.7.7
10.240.128.63 Ready <none> 10h v1.27.8+IKS 10.240.128.63 10.240.128.63 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.128.64 Ready <none> 10h v1.27.8+IKS 10.240.128.64 10.240.128.64 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.128.66 Ready <none> 10h v1.27.8+IKS 10.240.128.66 10.240.128.66 Ubuntu 20.04.6 LTS 5.4.0-169-generic containerd://1.7.11
10.240.128.7 Ready <none> 61d v1.27.6+IKS 10.240.128.7 10.240.128.7 Ubuntu 20.04.6 LTS 5.4.0-165-generic containerd://1.7.7
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns | grep openshift-storage
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns
NAME STATUS AGE
default Active 61d
ibm-cert-store Active 61d
ibm-operators Active 61d
ibm-services-system Active 40d
ibm-system Active 61d
karpenter Active 40d
kube-node-lease Active 61d
kube-public Active 61d
kube-system Active 61d
ambikanair@Ambikas-MBP customer-notifications %
We don't have an openshift-storage namespace in IKS. Pasted above are the namespace lists for 1.27 and 1.28.
@kshithijiyer-px Does this PR have any impact: https://github.com/portworx/torpedo/pull/1958
4.12.44_1571_openshift
You can now execute 'kubectl' commands against your cluster. For example, run 'kubectl get nodes'.
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns | grep openshift-storage
ambikanair@Ambikas-MBP customer-notifications % kubectl get ns
NAME STATUS AGE
calico-system Active 61d
default Active 61d
ibm-cert-store Active 60d
ibm-odf-validation-webhook Active 61d
ibm-services-system Active 40d
ibm-system Active 61d
kube-node-lease Active 61d
kube-public Active 61d
kube-system Active 61d
openshift Active 61d
openshift-apiserver Active 61d
openshift-apiserver-operator Active 61d
openshift-authentication Active 61d
openshift-authentication-operator Active 61d
openshift-cloud-credential-operator Active 61d
openshift-cloud-network-config-controller Active 61d
openshift-cluster-csi-drivers Active 61d
openshift-cluster-machine-approver Active 61d
openshift-cluster-node-tuning-operator Active 61d
openshift-cluster-samples-operator Active 61d
openshift-cluster-storage-operator Active 61d
openshift-cluster-version Active 61d
openshift-config Active 61d
openshift-config-managed Active 61d
openshift-config-operator Active 61d
openshift-console Active 61d
openshift-console-operator Active 61d
openshift-console-user-settings Active 61d
openshift-controller-manager Active 61d
openshift-controller-manager-operator Active 61d
openshift-dns Active 61d
openshift-dns-operator Active 61d
openshift-etcd Active 61d
openshift-etcd-operator Active 61d
openshift-image-registry Active 61d
openshift-infra Active 61d
openshift-ingress Active 61d
openshift-ingress-canary Active 61d
openshift-ingress-operator Active 61d
openshift-insights Active 61d
openshift-kube-apiserver Active 61d
openshift-kube-apiserver-operator Active 61d
openshift-kube-controller-manager Active 61d
openshift-kube-controller-manager-operator Active 61d
openshift-kube-proxy Active 61d
openshift-kube-scheduler Active 61d
openshift-kube-scheduler-operator Active 61d
openshift-kube-storage-version-migrator Active 61d
openshift-kube-storage-version-migrator-operator Active 61d
openshift-machine-api Active 61d
openshift-machine-config-operator Active 61d
openshift-marketplace Active 61d
openshift-monitoring Active 61d
openshift-multus Active 61d
openshift-network-diagnostics Active 61d
openshift-network-operator Active 61d
openshift-node Active 61d
openshift-operator-lifecycle-manager Active 61d
openshift-operators Active 61d
openshift-roks-metrics Active 61d
openshift-route-controller-manager Active 61d
openshift-service-ca Active 61d
openshift-service-ca-operator Active 61d
openshift-user-workload-monitoring Active 61d
tigera-operator Active 61d
ambikanair@Ambikas-MBP customer-notifications % kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.240.11.50 Ready master,worker 7d18h v1.25.14+a52e8df 10.240.11.50 10.240.11.50 Red Hat Enterprise Linux 8.8 (Ootpa) 4.18.0-477.27.1.el8_8.x86_64 cri-o://1.25.5-2.rhaos4.12.git0217273.el8
10.240.11.64 Ready master,worker 10h v1.25.14+a52e8df 10.240.11.64 10.240.11.64 Red Hat Enterprise Linux 8.9 (Ootpa) 4.18.0-513.9.1.el8_9.x86_64 cri-o://1.25.5-2.rhaos4.12.git0217273.el8
10.240.128.38 Ready master,worker 47d v1.25.14+bcb9a60 10.240.128.38 10.240.128.38 Red Hat Enterprise Linux 8.8 (Ootpa) 4.18.0-477.27.1.el8_8.x86_64 cri-o://1.25.4-4.1.rhaos4.12.gitb9319a2.el8
10.240.128.46 Ready master,worker 7d18h v1.25.14+a52e8df 10.240.128.46 10.240.128.46 Red Hat Enterprise Linux 8.8 (Ootpa) 4.18.0-477.27.1.el8_8.x86_64 cri-o://1.25.5-2.rhaos4.12.git0217273.el8
10.240.128.67 Ready master,worker 10h v1.25.14+a52e8df 10.240.128.67 10.240.128.67 Red Hat Enterprise Linux 8.9 (Ootpa) 4.18.0-513.9.1.el8_9.x86_64 cri-o://1.25.5-2.rhaos4.12.git0217273.el8
ambikanair@Ambikas-MBP customer-notifications %
@kshithijiyer-px Does this PR have any impact: portworx/torpedo#1958 ?
This code will only have an impact if you have changed the provisioner value to cephfs-csi. If the value passed is ibm, we shouldn't be seeing any issue. I am not hitting any issues in the runs we did after merging this PR.
One thing I suspect is that there is something wrong in the IBM provisioner which is causing this failure. You'll need to check the provisioner code for any new commits that could be causing this issue.
"You'll need to check the provisioner code for any new commits which is causing this issue."
- do you mean the ibm-vpc-block-csi-driver provisioner code?
Yes
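One way to see exactly which driver build the cluster is running before diffing commits (a sketch, using the controller pod named in the earlier kubectl logs command):
kubectl -n kube-system get pod ibm-vpc-block-csi-controller-0 \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
Comparing the iks-vpc-block-driver image tag against the last known-good run would narrow down whether a driver update is in play.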
@kshithijiyer-px I can check that. Could you please explain why the ceph scenario comes in here when the provisioner provided during torpedo execution is "ibm"?
error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
Is that some default provisioner that torpedo tests fall back to? The volume provisioning request is received by the correct CSI driver, as we can see vpc.block.csi.ibm.io in the logs.
The call fails even before reaching the vpc-block-csi driver. I checked the csi-provisioner logs:
d4", APIVersion:"v1", ResourceVersion:"16215", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "postgres-sc": error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
I0112 00:37:02.839116 1 reflector.go:281] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:845: forcing resync
I0112 00:37:50.079933 1 reflector.go:559] k8s.io/client-go/informers/factory.go:150: Watch close - *v1.CSINode total 8 items received
I0112 00:39:31.311487 1 reflector.go:559] k8s.io/client-go/informers/factory.go:150: Watch close - *v1.VolumeAttachment total 6 items received
I0112 00:39:36.220710 1 reflector.go:559] k8s.io/client-go/informers/factory.go:150: Watch close - *v1.Node total 13 items received
I0112 00:40:20.040688 1 reflector.go:559] k8s.io/client-go/informers/factory.go:150: Watch close - *v1.PersistentVolumeClaim total 6 items received
I0112 00:41:05.933709 1 controller.go:1337] provision "postgres-csi-pxb-0-58075-01-11-03h57m29s/postgres-data" class "postgres-sc": started
I0112 00:41:05.934294 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"postgres-csi-pxb-0-58075-01-11-03h57m29s", Name:"postgres-data", UID:"4de8f722-2023-4f92-83ed-8bff9de3c1d4", APIVersion:"v1", ResourceVersion:"16215", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "postgres-csi-pxb-0-58075-01-11-03h57m29s/postgres-data"
W0112 00:41:05.950508 1 controller.go:934] Retrying syncing claim "4de8f722-2023-4f92-83ed-8bff9de3c1d4", failure 339
E0112 00:41:05.950604 1 controller.go:957] error syncing claim "4de8f722-2023-4f92-83ed-8bff9de3c1d4": failed to provision volume with StorageClass "postgres-sc": error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
I0112 00:41:05.950640 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"postgres-csi-pxb-0-58075-01-11-03h57m29s", Name:"postgres-data", UID:"4de8f722-2023-4f92-83ed-8bff9de3c1d4", APIVersion:"v1", ResourceVersion:"16215", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "postgres-sc": error getting secret rook-csi-cephfs-provisioner in namespace openshift-storage: secrets "rook-csi-cephfs-provisioner" not found
I0112 00:41:15.149113 1 reflector.go:559] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:848: Watch close - *v1.StorageClass total 6 items received
I0112 00:43:06.113649 1 reflector.go:559] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:845: Watch close - *v1.PersistentVolume total 10 items received
So @ambiknai, it looks like an IBM CSI provisioner issue, right?
No @kshithijiyer-px. This is the external-csi-provisioner sidecar.
So the error clearly says that it is unable to find the secret mentioned in the StorageClass spec. We don't have an openshift-storage namespace in this cluster, so this use case does not fit the cluster I created. And as I mentioned, the provider is "ibm", so per your comment this use case should ideally not get executed for IBM, is what I understand.
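That matches how the external-provisioner sidecar behaves: it resolves the csi.storage.k8s.io/provisioner-secret-name/-namespace keys from the StorageClass parameters before it ever calls the driver's CreateVolume, so any StorageClass carrying those keys fails on a cluster that lacks the referenced secret, regardless of which driver is named. A quick way to list the offending keys (a sketch; assumes jq is available):
kubectl get sc postgres-sc -o json \
  | jq '.parameters | with_entries(select(.key | startswith("csi.storage.k8s.io/")))'
If torpedo is generating this spec for the ibm provider, stripping these secret parameters from the generated StorageClass should unblock provisioning.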
Hi @kshithijiyer-px, the workaround you suggested looks good for IKS:
1.29-IKS-torpedo.txt 1.25-iks-torpedo.txt 1.28-IKS-torpedo.txt
When I ran the suite against ROKS, the tests fail for a different reason. Could you please check once?
@arahamad @sandaymin123 ^^
@ambiknai We are trying to integrate ROKS into our pipeline and are facing issues with the integration. There might be some CSI issues, but we can't confirm yet.
There are no torpedo issues in these. We had to apply a workaround to fix the IKS execution - @kshithijiyer-px is working with the framework team to get a patch release so that there is a permanent fix. [Thontesh] There are no torpedo issues, as I mentioned multiple times before. There is a security issue in ROKS which broke our PG app spec; hence, we moved to the mysql app. This requires changes in the PG app spec public repo.
We are blocked on ROKS test execution. @kshithijiyer-px confirmed that this is not a product issue but could be something related to torpedo. We are waiting for further updates. [Thontesh] There is no torpedo issue here either. Let's wait for an update from @kshithijiyer-px.
Hello @ambiknai, we did an internal run where we aren't hitting the issue. Please find the log attached: console.txt
Here we are running all the tests, and we see that the size issue isn't hit.
Can you share some more details on the ROKS configuration you are running with? We usually run our jobs with machine instance type bx2.4x16; can you check and confirm that you are running with a similar spec instance or higher?
One other noticeable thing I see is that ours is "--driver-start-timeout", "30m0s", whereas yours is "--driver-start-timeout", "20m0s". Can you make this change and let us know if you are still seeing the issue?
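For clarity, the suggested delta is just the timeout value in the torpedo container args (a fragment of a hypothetical job spec; the surrounding arguments are omitted):
args:
  - "--driver-start-timeout"
  - "30m0s"    # was "20m0s" in the failing runs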
@kshithijiyer-px I see test cases have failed in the logs shared:
[2024-01-18T15:51:53.386Z] [36mINFO[0m[2024-01-18 15:51:53] ------------------------
[2024-01-18T15:51:53.386Z] 2024-01-18 15:51:53 +0000:[INFO] [{AddMultipleNamespaceLabels}] [tests.EndPxBackupTorpedoTest:#7261] - >>>> FAILED TEST: {AddMultipleNamespaceLabels} Add multiple labels to namespaces, perform manual backup, schedule backup using namespace label and restore
[2024-01-18T15:51:53.642Z] 2024-01-18 15:51:53 +0000:[INFO] [{AddMultipleNamespaceLabels}] [tests.DeleteAllNamespacesCreatedByTestCase:#10193] - Deleting namespace [mysql-ibm-pxb-0-85583-85583-01-18-15h49m27s]
Yeah, true that the size issue is not seen.
--flavor bx2.4x16
4.13.23_openshift
Please share the ROKS version tried at your end.
@ambiknai It's the first run which we ran; we are still looking at the failures. We just wanted to make sure whether we were also seeing the same size issue, which we aren't seeing in our runs.
@trenukarya-px @kshithijiyer-px PX-Backup 2.6.0 verification on IKS is completed:
1.29-IKS-torpedo.txt 1.25-iks-torpedo.txt 1.28-IKS-torpedo.txt
As per the discussion, we could not do the ROKS torpedo execution due to automation stability issues (the automation is not qualified for ROKS); the IBM torpedo tests for ROKS are hence failing. Also, for future releases, can we expect an automation fix to be in place for ROKS?
Could you please attach a successful ROKS test result from your side.
After reviewing our IKS test results and considering the successful test execution carried out by the PX Team on ROKS, we collectively affirm our decision to provide the sign-off for this version.
cc: @arahamad
Thanks @kshithijiyer-px @ambiknai !!
We are good with uploading PX-Backup 2.6.0 to the Catalog based on all these results.
Uploading a few more results:
roks-run-PXE.txt
We have validated on the prod catalog:
[root@ip-10-13-8-162 ~]# helm ls -A
NAME       NAMESPACE  REVISION  UPDATED                                  STATUS    CHART             APP VERSION
px-backup  pxb260     1         2024-01-25 05:50:42.331319549 +0000 UTC  deployed  px-central-2.6.0  2.6.0
[root@ip-10-13-8-162 ~]#
[root@ip-10-13-8-162 ~]# kubectl -n pxb260 get po
NAME                                       READY   STATUS      RESTARTS       AGE
px-backup-79dcff5d6b-959xc                 1/1     Running     1 (4h9m ago)   4h10m
pxc-backup-mongodb-0                       1/1     Running     0              4h10m
pxc-backup-mongodb-1                       1/1     Running     0              4h10m
pxc-backup-mongodb-2                       1/1     Running     0              4h10m
pxcentral-apiserver-85764cdf7-7nllq        1/1     Running     0              4h10m
pxcentral-backend-7469654777-vjbzb         1/1     Running     0              4h7m
pxcentral-frontend-67d8b88bbc-m2jqz        1/1     Running     0              4h7m
pxcentral-keycloak-0                       1/1     Running     0              4h10m
pxcentral-keycloak-postgresql-0            1/1     Running     0              4h10m
pxcentral-lh-middleware-79cbc49f55-fd8nd   1/1     Running     0              4h7m
pxcentral-mysql-0                          1/1     Running     0              4h10m
pxcentral-post-install-hook-r5lb9          0/1     Completed   0              4h10m
[root@ip-10-13-8-162 ~]#
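For anyone reproducing this validation, a minimal sketch of the install that would produce the release shown above. The chart, version, and namespace come from the helm ls output; the repo URL and the --set flag are assumptions based on the public Portworx Helm charts:
helm repo add portworx https://raw.githubusercontent.com/portworx/helm/master/stable   # assumed repo location
helm install px-backup portworx/px-central --version 2.6.0 \
  --namespace pxb260 --create-namespace \
  --set pxbackup.enabled=true   # assumed flag to enable the PX-Backup component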
PXB Sign-off Template: following are the tasks which have to be completed by each team.
NOTE: This issue should not be closed until the PX-Backup build/version is pushed to the IBM Cloud Catalog.
Sign-off process Step 1:
Portworx Team:
IBM Team:
Sign-off process Step 2:
NOTE: Once the above tasks are completed and marked as done, including sign-off, the PX-Backup and IBM teams need to complete the following tasks as well.
PX-Backup Team:
IBM Team: