stackabletech / spark-k8s-operator

Operator for Apache Spark-on-Kubernetes for Stackable Data Platform
https://stackable.tech

Support syncing PySpark applications from Git #372

Open sbernauer opened 6 months ago

sbernauer commented 6 months ago

If you have multiple Python files that form an application, it is inconvenient to put everything into a ConfigMap. It would be nice to be able to use the same git-sync mechanism as implemented in Airflow.

Related to https://github.com/stackabletech/opa-operator/issues/504, which implements the same thing for OPA Rego rules.

### Tasks
- [ ] Try to move common code into operator-rs
iscsrwm commented 4 months ago

As a workaround, you can use an initContainer in your `podOverrides.spec` to sync your code. Here is an example:

```yaml
driver:
  podOverrides:
    spec:
      initContainers:
```

In this example, your code would be available in the driver container at `/common/my_app`.
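A fuller sketch of such a `podOverrides` workaround is below. It assumes the upstream `git-sync` v4 image and flag names; the repository URL, the `my_app` link name, and the main container name `spark` are placeholders that you would need to adapt to your SparkApplication:

```yaml
# Hypothetical sketch: git-sync initContainer cloning the application
# code into a shared emptyDir before the driver starts.
driver:
  podOverrides:
    spec:
      initContainers:
        - name: git-sync
          # Assumed upstream git-sync image; pin a tag that fits your cluster
          image: registry.k8s.io/git-sync/git-sync:v4.2.3
          args:
            - --repo=https://github.com/example/my_app.git  # placeholder repo
            - --ref=main
            - --root=/common
            - --link=my_app   # code ends up under /common/my_app
            - --one-time      # clone once, then exit so the pod can start
          volumeMounts:
            - name: app-code
              mountPath: /common
      containers:
        - name: spark  # assumed main container name; adjust for your pod
          volumeMounts:
            - name: app-code
              mountPath: /common
      volumes:
        - name: app-code
          emptyDir: {}
```

With `--one-time`, the code is synced once at pod startup; a Deployment-style continuous sync (sidecar with `--period`) does not fit a Spark driver, which should see a fixed snapshot of the code for the lifetime of the job.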