grobbie closed this issue 1 year ago
A workaround: copy the `spark-k8s` service account from the `kubeflow` namespace into the user's namespace:
```shell
kubectl get sa spark-k8s -n kubeflow -o yaml | sed 's/namespace: kubeflow/namespace: <NEW>/' | kubectl create -f -
```
Replace `<NEW>` with the target namespace.
This workaround can only be applied by someone with access to the `kubeflow` namespace, so an admin or similar would have to run it on behalf of general users.
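For reference, a slightly more defensive variant of the same copy; this is only a sketch, assuming `jq` is available (server-populated metadata such as `resourceVersion` can make `kubectl create` reject the object on some cluster versions):

```shell
# Copy the spark-k8s ServiceAccount into a new namespace, dropping
# server-populated metadata that can trip up `kubectl create`.
kubectl get sa spark-k8s -n kubeflow -o json \
  | jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp)
        | .metadata.namespace = "<NEW>"' \
  | kubectl create -f -
```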
Hi @Barteus, this is something we'd like to fix by providing the right RBAC to users, which is something the charm code has to figure out. If we copy the ServiceAccount into each user namespace, we are providing access to resources and actions that are not necessarily related to Spark, so a better workaround would be to create a ServiceAccount with access to just the Spark objects (those defined in the CRDs). We can leave this bug open until we provide a good fix.
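To illustrate the shape of that narrower workaround (a sketch only; `spark-user` and `<USER-NS>` are placeholder names, not anything the charm creates today), a namespaced ServiceAccount with a Role limited to the Spark CRDs could look like:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-user
  namespace: <USER-NS>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-user
  namespace: <USER-NS>
rules:
  # Access limited to the Spark operator's CRDs, nothing else.
  - apiGroups: ["sparkoperator.k8s.io"]
    resources: ["sparkapplications", "scheduledsparkapplications"]
    verbs: ["create", "get", "list", "watch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-user
  namespace: <USER-NS>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: spark-user
subjects:
  - kind: ServiceAccount
    name: spark-user
    namespace: <USER-NS>
```

Note that a ServiceAccount by itself carries no permissions; it is the Role/RoleBinding pair that actually grants access.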
Do we know which of the following is the issue?

1. a ServiceAccount named `spark-k8s` is required in the user namespace, but we currently don't have one there, or
2. users lack the RBAC to work with the Spark objects (`SparkApplication` or `ScheduledSparkApplication`)?

I ask because, unless the problem is (1), I'm not sure what copying the `ServiceAccount` alone fixes.
If this is actually an RBAC issue (2), I agree with @dnplas that we can be a bit more specific with our fix (and I'm confused how copying the SA alone would fix anything, but I might be misunderstanding something). If we need RBAC assigned to users, I think the appropriate way to give users this RBAC would be either to:
a. (if deploying spark on its own) make role bindings and a service account for these permissions and put them in spark's namespace
b. (if deploying alongside kubeflow, enabling all kubeflow users to use spark) create `ClusterRole`s with the desired user RBAC and use Kubeflow's role aggregation procedure to get them attached to all users.
To do (b), we create `ClusterRole`s with the `rbac.authorization.kubeflow.org/aggregate-to-kubeflow-*` label as we do here. Ideally, these ClusterRoles would be created and managed by the charm operator itself, but I think that the barrier for pod spec charms (including this Spark Operator) is that pod spec does not let us create arbitrary ClusterRoles. As a workaround, we created the kubeflow-roles-operator to (if I recall correctly...) manage roles for any legacy pod spec charms, and as we decommissioned those charms we'd pull those roles back out.
So assuming this is an RBAC issue, I see two possible actions:

1. add the `ClusterRole`s to the kubeflow-roles-operator, or
2. put `ClusterRole` creation into the charm code

Spark integration is not supported. A new design is being introduced and integration with Spark will be different. This will need to be revisited and spec'ed out from the beginning. Closing.
Original issue description:

When deploying the operator alongside Charmed Kubeflow, the service account is correctly created in the `kubeflow` namespace. However, it is not created in the user's own namespace. This means `SparkApplication`s do not run when created in the user's namespace.

As I see it, we need to either:

(a) have an alternative service account already created with the correct permissions in the user's namespace, ready to go;
(b) automatically create the spark service account in users' namespaces when the spark-k8s operator is deployed (which looks like a problematic can of worms to me); or
(c) provide instructions to the user about creating a suitable service account (least desirable option).
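For context on why the missing ServiceAccount blocks runs: a SparkApplication names the driver's ServiceAccount explicitly, and the referenced account must exist in the namespace the application is created in. A trimmed, illustrative spec (the image, versions, and the `spark-k8s` account name are assumptions based on this thread):

```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: <USER-NS>
spec:
  type: Scala
  mode: cluster
  image: <spark-image>
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar
  sparkVersion: "3.1.1"
  driver:
    cores: 1
    # Must already exist in <USER-NS>; this is the account the
    # workaround above copies over from the kubeflow namespace.
    serviceAccount: spark-k8s
  executor:
    instances: 1
    cores: 1
```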