Open DnPlas opened 4 months ago
@DnPlas can you give more context? afaict 178 is failing for legitimate reasons. That charm is hitting error during a relation-changed event, not blocked.
In principle I think I'm with you about how a unit test is sufficient to test if we go Blocked
when missing a relation. But now I'm wondering if by removing this wait_for_idle
we end up hiding another error entirely.
@DnPlas can you give more context? afaict 178 is failing for legitimate reasons. That charm is hitting error during a relation-changed event, not blocked.
Yes, the reason why #178 is failing (and potentially any other CI) is because the kubeflow-profile-controller
is in maintenance
mode, in fact Waiting for pod startup to complete
, which can cause the kubeflow-dashboard
to fail with hook failed: "kubeflow-profiles-relation-changed"
because the dashboard charm depends on the profiles info (see here). By deploying both kubeflow-dashboard
and kubeflow-profiles
at the same time and wait for idle
for BOTH charms, we avoid this.
In principle I think I'm with you about how a unit test is sufficient to test if we go
Blocked
when missing a relation. But now I'm wondering if by removing thiswait_for_idle
we end up hiding another error entirely.
It's still there, we are actually waiting for BOTH charms to be active and idle with a timeout of 600s.
The CI is terminating because kubeflow-dashboard is in error:
kubeflow-dashboard/0* error idle 10.1.34.139 hook failed: "kubeflow-profiles-relation-changed"
juju.errors.JujuUnitError: Unit in error: kubeflow-dashboard/0
It should not be possible that this sort of error be caused by kubeflow-profiles in any way (whether it is in maintenance mode, etc). So what I'm saying is that the the CI is catching a legitimate error here where kubeflow-dashboard fails to handle the relation-changed event, and changing the CI to suppress it is not what we should do
Thanks @ca-scribner for the review, agreed on the model config setting. Will update the PR shortly.
Remove the assertion that tests the kubeflow-dashboard-operator goes to Blocked when the relation with kubeflow-profiles is not there. This assertion is already covered in unit tests, and it is only adding noise at deployment time.
This commit ensures the dependencies like charms and relations are deployed on time which will give time for everything to be set up before moving ahead with other test cases.