Closed wg102 closed 1 year ago
When trying to update, we encountered the argocd-prod 401 error again. To work around it, souheil restarted the application controller statefulset, which seems to have helped for now.
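For reference, a restart like the one described above could be scripted roughly as follows. This is only a sketch: the statefulset name `argocd-application-controller` and the namespace `argocd` are the upstream Argo CD defaults and are assumptions here, not values confirmed for argocd-prod.

```shell
# Sketch: restart the Argo CD application controller and wait for it to return.
# ASSUMPTION: namespace and statefulset name are the upstream Argo CD defaults;
# pass different values as arguments if the cluster uses other names.
restart_app_controller() {
  local namespace="${1:-argocd}"
  local statefulset="${2:-argocd-application-controller}"

  # Trigger a rolling restart of the controller pods.
  kubectl rollout restart "statefulset/${statefulset}" -n "${namespace}" || return 1
  # Block until the restarted pods are ready, or give up after the timeout.
  kubectl rollout status "statefulset/${statefulset}" -n "${namespace}" --timeout=300s
}
```

Calling `restart_app_controller` with no arguments uses the assumed defaults; `restart_app_controller <namespace> <statefulset>` overrides them.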
After merging the changes into https://github.com/StatCan/aaw-kubeflow-manifests, we checked whether they had synced on the Kubeflow application. They had not, because that application refers to argocd, where the jsonnet in turn refers to the kubeflow manifests. So instead we had to update each individual application. By forcing a sync on every out-of-sync app, we managed to get most of them up and running.
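The per-application sync described above was done by hand, but it could be sketched as a loop. The namespace `argocd` for the Application resources is an assumption, and so is the reading of "forcing" as the `--force` flag of `argocd app sync`; drop the flag if a plain manual sync is what is wanted.

```shell
# List Argo CD Application names whose sync status is OutOfSync.
# ASSUMPTION: Application resources live in the "argocd" namespace.
list_out_of_sync_apps() {
  local namespace="${1:-argocd}"
  kubectl get applications.argoproj.io -n "${namespace}" \
    -o jsonpath='{range .items[?(@.status.sync.status=="OutOfSync")]}{.metadata.name}{"\n"}{end}'
}

# Sync every OutOfSync application one at a time and report any failures.
# Returns 1 if at least one sync failed, so failures are not silently lost.
force_sync_out_of_sync() {
  local namespace="${1:-argocd}"
  local app failed=0
  for app in $(list_out_of_sync_apps "${namespace}"); do
    echo "syncing ${app}"
    if ! argocd app sync "${app}" --force; then
      echo "sync failed: ${app}" >&2
      failed=1
    fi
  done
  return "${failed}"
}
```

Continuing past a failed app instead of stopping matches what happened here: most apps came up, and the few failures (Knative, Kserve) were handled separately.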
Knative sync failed: deleted the resource, then forced a sync to recreate it.
Kserve sync failed: manually synced the CRD first, then the manifest, so it stopped complaining about the CRD not existing.
Knative sync was resolved by recreating the validating/mutating webhook resources. Kserve was resolved by updating the CRDs first, then syncing the other resources.
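The two fixes above can be sketched as small helpers. The application names, the resource kind and name, and the CRD name used in the usage note are placeholders: the issue does not say which webhook resources or CRDs were involved. The delete helper also assumes a cluster-scoped resource (as webhook configurations are), so it passes no namespace.

```shell
# Delete a stuck resource and let Argo CD recreate it on the next sync.
# ASSUMPTION: the resource is cluster-scoped; app, kind and name are
# placeholders supplied by the caller.
recreate_via_sync() {
  local app="$1" kind="$2" name="$3"
  kubectl delete "${kind}" "${name}" --ignore-not-found || return 1
  argocd app sync "${app}" --force
}

# Sync the named CRDs first, then the rest of the application.
sync_crds_first() {
  local app="$1"
  shift
  local crd
  for crd in "$@"; do
    # --resource takes GROUP:KIND:NAME and limits the sync to that resource.
    argocd app sync "${app}" \
      --resource "apiextensions.k8s.io:CustomResourceDefinition:${crd}" || return 1
  done
  # With the CRDs in place, sync everything else.
  argocd app sync "${app}"
}
```

A hypothetical call would look like `sync_crds_first kserve inferenceservices.serving.kserve.io`, with the real application and CRD names substituted.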
Edit:
The central dashboard image was not updated correctly in aaw-kubeflow-manifests, neither in dev nor in prod. For prod, a good strategy would be to update the image to the last PR image successfully built in the last sprint. This prevents work that has not been tested or used from trickling down to prod.
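One way to catch an image that was not updated is to compare what the cluster is running against the tag expected from the manifests. The sketch below assumes the central dashboard runs as deployment `centraldashboard` in namespace `kubeflow` (the upstream Kubeflow defaults) and that the image of interest is the first container in the pod template; both are assumptions, not confirmed for this deployment.

```shell
# Print the image of the first container in a deployment's pod template.
# ASSUMPTION: deployment "centraldashboard" in namespace "kubeflow".
deployed_image() {
  local deployment="${1:-centraldashboard}"
  local namespace="${2:-kubeflow}"
  kubectl get deployment "${deployment}" -n "${namespace}" \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Compare the running image with the expected one.
# Returns 0 on match, 1 on mismatch, 2 if the lookup itself failed.
check_image() {
  local expected="$1"
  local actual
  actual="$(deployed_image "${2:-centraldashboard}" "${3:-kubeflow}")" || return 2
  if [ "${actual}" = "${expected}" ]; then
    echo "ok: ${actual}"
  else
    echo "mismatch: expected ${expected}, running ${actual}" >&2
    return 1
  fi
}
```

Running `check_image <expected-image>` against both dev and prod after a sync would show whether the manifest change actually reached the cluster.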
Based on the ticket for dev https://github.com/StatCan/aaw/issues/1729 and the subsequent ticket of issues https://github.com/StatCan/aaw/issues/1752
Common
v1.7.0
v1.7.0
v1.7.0
v1.7.0
I think anything that does not have a direct folder equivalent is in the knative folder.
Apps
v1.7.0
v1.7.0
v1.7.0
v1.7.0
v1.7.0
v1.7.0
v1.7.0
Contrib
v1.7.0
v1.7.0
v1.7.0
Other issues that had to be fixed
Run the script to clear the poddefaults (below)
Update the manifest and charts https://github.com/StatCan/aaw/issues/1789

NAMESPACES=$(kubectl get namespaces --no-headers | awk '{print $1}')
for ns in $NAMESPACES; do
  # Remove the protected-b PodDefault from each namespace.
  kubectl delete poddefaults protected-b -n "$ns"
done
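The loop above attempts the delete in every namespace, including ones that never had the PodDefault. A more targeted variant, offered only as a sketch, first lists the namespaces that actually hold a PodDefault with that name and deletes only there. The function names are made up for illustration.

```shell
# List the namespaces that currently hold a PodDefault with the given name.
find_poddefault_namespaces() {
  local name="${1:-protected-b}"
  kubectl get poddefaults --all-namespaces --no-headers \
    --field-selector "metadata.name=${name}" \
    -o custom-columns=NAMESPACE:.metadata.namespace
}

# Delete the PodDefault only in the namespaces where it exists.
clear_poddefault() {
  local name="${1:-protected-b}"
  local ns
  for ns in $(find_poddefault_namespaces "${name}"); do
    # --ignore-not-found keeps the loop quiet if the object vanished meanwhile.
    kubectl delete poddefaults "${name}" -n "${ns}" --ignore-not-found
  done
}
```

Running `find_poddefault_namespaces` on its own first gives a preview of what `clear_poddefault` would touch.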