Closed rene-oromtz closed 2 months ago
We will publish a KT for this, "Known issue with wd-discovery-ranker-rest in CPD 4.6.6" or similar. Watson Discovery has been problematic since its introduction to the dependency stack and is due to be removed from it in an upcoming release.
DT and Techdoc available at: https://www.ibm.com/support/pages/node/7149820
Thanks for the update @istrate @durera!
@istrate I think I may have overlooked something: is it just me, or is the patch cropped in https://www.ibm.com/support/pages/node/7149820? Not sure if it's my browser, but the patch looks as follows:
oc patch wd wd --type=merge --patch='{"spec": {"wire": {"rankerRest": {"image": {"tag":"20240223-012927-645-b14d8a8","digest":"sha256:89e2c3efffbb06eb1605fb6b3b550ca7e0a41bc88f9e0a0d78c5
However, in the known issue DT381121 I do see the full patch:
oc patch wd wd --type=merge --patch='{"spec": {"wire": {"rankerRest": {"image": {"tag":"20240223-012927-645-b14d8a8","digest":"sha256:89e2c3efffbb06eb1605fb6b3b550ca7e0a41bc88f9e0a0d78c5727a54ff9635"}}}}}'
@rene-oromtz Thanks! I have updated that with the correct patch command.
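Since the cropped command differs from the full one only in the digest, a quick check on the digest length can catch a truncated copy before it is applied. The following is a minimal sketch, assuming a POSIX shell; `is_full_sha256` and `build_ranker_patch` are hypothetical helper names, not part of `oc`, and the script only prints the command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the argument is "sha256:" followed by
# exactly 64 lowercase hex characters (the length of a complete SHA-256 digest).
is_full_sha256() {
  printf '%s' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Hypothetical helper: builds the merge patch body from a tag and a digest,
# using the same JSON layout as the patch quoted in this thread.
build_ranker_patch() {
  printf '{"spec": {"wire": {"rankerRest": {"image": {"tag":"%s","digest":"%s"}}}}}' "$1" "$2"
}

# Values copied from the full patch in DT381121.
TAG="20240223-012927-645-b14d8a8"
DIGEST="sha256:89e2c3efffbb06eb1605fb6b3b550ca7e0a41bc88f9e0a0d78c5727a54ff9635"

if is_full_sha256 "$DIGEST"; then
  PATCH="$(build_ranker_patch "$TAG" "$DIGEST")"
  # Dry run: print the command instead of executing it.
  printf "oc patch wd wd --type=merge --patch='%s'\n" "$PATCH"
else
  echo "Digest looks truncated; not printing the patch command." >&2
fi
```

With the cropped digest from the support page, the check fails and no command is printed; with the full digest, the printed line matches the patch from DT381121 and can be pasted into a session that is logged in to the cluster.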
Summary
wd-discovery-ranker-rest keeps entering the CrashLoopBackOff state, and WD stays in InProgress status. From inside the pod, there is a JIT COMPILER CRASH WITH VMSTATE=0x00040000
Steps to reproduce
What is the current bug behavior?
WD is in Ready state on the ROKS cluster, but not all deployments are healthy
What is the expected correct behavior?
WD should be Ready and all deployments should have the necessary pod availability
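One way to compare the current and expected behavior is to list the deployments whose ready replica count is below the desired count. This is a rough sketch, assuming the default `oc get deployments` table layout (NAME, READY, UP-TO-DATE, AVAILABLE, AGE, with READY shown as `ready/desired`); `unhealthy_deployments` is a hypothetical helper name, and the namespace placeholder must be filled in by the reader.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get deployments` table output on stdin and
# prints the name of every deployment whose READY column is not n/n.
unhealthy_deployments() {
  awk 'NR > 1 {
    split($2, ready, "/")              # READY column, e.g. "0/1"
    if (ready[1] != ready[2]) print $1 # fewer ready pods than desired
  }'
}

# Usage (requires a logged-in oc session; <namespace> is a placeholder):
#   oc get deployments -n <namespace> | unhealthy_deployments
```

An empty result would mean every deployment reports full pod availability; in the state described here, wd-discovery-ranker-rest would be expected to appear in the list.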
Relevant logs and/or screenshots
CPD Version:
WD
WD Deployed Components:
WatsonDiscoveryWireCR:
WD Ranker Rest Image:
WD Ranker Rest Logs: full log: wd-discovery-ranker-rest-7c8894444c-l67mn-wd-discovery-ranker-rest.log
WD deployments:
This issue seems to be relevant only for ROKS clusters.
This issue was originally opened against the CPD for data team; however, the CPD team found that the problem was with the ranker-rest image, and dev was able to fix it by patching this image with the latest for 4.6.6:
CPD team mentioned this workaround be documented on CPD Docs, as this issue is not reproducible in other clusters using the CPD install procedure, so this might be directly related to Maximo and the images used during installation.
As the correct image for ranker-rest has already been provided, I'm wondering if this workaround can be documented on the Maximo side and where it would best fit. The Maximo Assist Troubleshooting section might be suitable, as this was encountered during Assist configuration.