Closed phoevos closed 1 year ago
As part of the charm rewrite we changed the value of the --namespace
option provided when starting the persistence agent service from ""
to match the model namespace value. The config is set up here:
https://github.com/canonical/kfp-operators/blob/5806a6be8b0ca4111e33b9077ee1c245acbbdc01/charms/kfp-persistence/src/charm.py#L68
And used here:
https://github.com/canonical/kfp-operators/blob/5806a6be8b0ca4111e33b9077ee1c245acbbdc01/charms/kfp-persistence/src/components/pebble_components.py#L54
However, this won't work for multi-user Kubeflow installations. Looking at the upstream manifests, it's clear that this value is intended to be empty.
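To illustrate the difference, here is a minimal sketch of how the service command could be assembled with an empty namespace by default. This is not the charm's actual code: the function name and the `persistence_agent` binary name are assumptions made for the example, and only the `--namespace` option comes from the description above.

```python
def build_persistence_agent_command(namespace: str = "") -> str:
    """Return an illustrative command line for the persistence agent.

    The default is an empty namespace, matching the value used before the
    rewrite. Passing the model namespace instead reproduces the behaviour
    described in this issue.
    """
    args = [
        "persistence_agent",  # hypothetical binary name, for illustration only
        f"--namespace={namespace}",  # empty value renders as "--namespace="
    ]
    return " ".join(args)


# Value before the rewrite (and the one the upstream manifests appear to use)
print(build_persistence_agent_command())

# Value after the rewrite, which breaks multi-user installations
print(build_persistence_agent_command("kubeflow"))
```

Keeping the empty string as the default means the caller has to opt in explicitly to restricting the agent to one namespace.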
Bug Description
The status of submitted KFP runs is never updated and is therefore stuck at None in the latest/edge version of the KFP charms. The Argo Workflow is submitted properly and executes successfully, but that isn't reflected in the run itself, as seen using either the KFP client (which returns a finished run with a None status) or the KFP UI (which hangs loading).
This is likely a bug introduced in our recent sidecar rewrites. At the moment it's just a guess, but I'm thinking that this has something to do with the KFP Persistence Agent not working properly, given that the content of the KFP MySQL DB is never updated with the completed workflow.
To Reproduce
kfp-db
and verify that the workflow saved as part of the created run entry does not have an updated status
Environment
latest/edge
1.24/stable
2.9/stable
Relevant log output
Additional context
I noticed that before the rewrite we were applying this ServiceAccount to the PersistenceAgent container, which allowed it to access the Argo Workflow K8s resources: https://github.com/canonical/kfp-operators/blob/67235bfa402fb4f67c30521fad431c467a1b0d44/charms/kfp-persistence/src/charm.py#L62-L85
It doesn't look like we're currently applying these elsewhere. This shouldn't be an issue here, since we're deploying the charm with trust, but we should also make a note of deploying the required upstream ClusterRole and Binding.
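For reference, a rough sketch of what such a ClusterRole and ClusterRoleBinding could look like, written as Python dicts. The helper name, the resource names passed to it, and the `get`/`list`/`watch` verbs are assumptions for illustration only. The exact rules should be taken from the upstream manifests rather than from this example.

```python
from typing import Any, Dict, List


def persistence_agent_rbac(
    name: str, service_account: str, namespace: str
) -> List[Dict[str, Any]]:
    """Build illustrative ClusterRole and ClusterRoleBinding manifests.

    The single rule below only covers read access to Argo Workflows. It is
    an assumption, not a copy of the upstream ClusterRole.
    """
    cluster_role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                "apiGroups": ["argoproj.io"],
                "resources": ["workflows"],
                "verbs": ["get", "list", "watch"],  # assumed verbs
            }
        ],
    }
    cluster_role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": name,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
    }
    return [cluster_role, cluster_role_binding]


# Hypothetical names, for illustration only
manifests = persistence_agent_rbac("kfp-persistence", "kfp-persistence", "kubeflow")
```

A ClusterRole (rather than a namespaced Role) fits the multi-user case, where workflows run in namespaces other than the one the charm is deployed to.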