-
Operator deployed using:
```
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: google-spark-operator
  namespace: kube-system
spec:
  chart:
    repository: https://googlecl…
```
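For reference, a complete HelmRelease under the Helm Operator's `helm.fluxcd.io/v1` API looks roughly like the sketch below; the repository URL, chart name, and version are placeholders for illustration, not the real values from the truncated snippet above.
```
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: google-spark-operator
  namespace: kube-system
spec:
  releaseName: spark-operator
  chart:
    repository: https://example.github.io/spark-on-k8s-operator  # assumed repository URL
    name: sparkoperator                                           # assumed chart name
    version: 1.0.0                                                # assumed chart version
  values: {}    # chart values go here
```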
-
k8s: GKE 1.17 with an auto-scaling node pool that has a taint
spark-operator: v1beta2-1.2.0-3.0.0
SparkApp:
  dynamicAllocation: enabled
  tolerations and nodeSelector: specified

Sometimes executors get…
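For context, the relevant parts of such a SparkApplication look roughly like the sketch below; the image, node-pool label, and taint key are placeholders rather than the real values from this report.
```
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: example-app                  # illustrative name
  namespace: default
spec:
  type: Python
  mode: cluster
  image: example.io/spark-py:3.0.0   # assumed image
  mainApplicationFile: local:///opt/spark/examples/src/main/python/pi.py
  sparkVersion: 3.0.0
  dynamicAllocation:
    enabled: true
    initialExecutors: 2
    minExecutors: 2
    maxExecutors: 10
  driver:
    cores: 1
    memory: 1g
    serviceAccount: spark
    nodeSelector:
      pool: spark                    # assumed node-pool label
    tolerations:
      - key: dedicated               # assumed taint key
        operator: Equal
        value: spark
        effect: NoSchedule
  executor:
    cores: 1
    memory: 2g
    nodeSelector:
      pool: spark
    tolerations:
      - key: dedicated
        operator: Equal
        value: spark
        effect: NoSchedule
```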
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### Willi…
-
### Description
We are replacing Presto with Velox in our system, but we have done a lot of in-house research and performance optimization based on Presto. We found that there is a performance gap b…
-
Hi,
I'm adding the operator and SparkApplication Helm charts to my auto-deploy script, among other components (e.g. Tomcat, ZooKeeper, etc.). Since the SparkApplication CRDs are defined by the operator,…
-
This is probably mostly the same as https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/594, which has since been marked as closed; however, the solution described there doesn't seem to…
-
ETA: 2024-06-30
We want to use IPv4 addresses of SPARK nodes as the scarce resource that makes it expensive for a single party to run many nodes. ATM, we rely on the trusted spark-api service to re…
-
cc @Fokko.
This is a super simple implementation of an Iceberg client for Dask. It works for the couple of datasets I have available, including:
- version metadata choice
- snapshot time …
-
Trying to run the Spark operator, I am using the `pi.py` and `spark-py-pi.yaml` files:
```
import sys
from random import random
from operator import add
from pyspark.sql import SparkSession…
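# (The file is truncated above. For reference, the standard Spark pi.py example
# from the upstream Spark distribution continues roughly as follows; minor
# details may differ between Spark versions.)

if __name__ == "__main__":
    spark = SparkSession.builder.appName("PythonPi").getOrCreate()

    # Number of partitions comes from the first CLI argument, defaulting to 2.
    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions

    def f(_):
        # Sample a point in the square [-1, 1] x [-1, 1] and count it
        # if it falls inside the unit circle.
        x = random() * 2 - 1
        y = random() * 2 - 1
        return 1 if x ** 2 + y ** 2 <= 1 else 0

    count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))

    spark.stop()
```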
-
If we use the `prometheus operator`, we can easily configure which target pods we want to collect metrics from using the `PodMonitor` CRD with a label selector, and then deploy the `prometheus server`.
The approxima…
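As a hedged sketch of that setup, a PodMonitor like the one below tells the Prometheus Operator to scrape pods carrying a given label; the names, labels, port name, and namespaces here are assumptions for illustration.
```
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: spark-app-monitor          # illustrative name
  namespace: monitoring            # assumed namespace
  labels:
    release: prometheus            # assumed label the Prometheus CR selects on
spec:
  namespaceSelector:
    matchNames:
      - default                    # assumed namespace of the target pods
  selector:
    matchLabels:
      spark-role: driver           # assumed label on the pods to scrape
  podMetricsEndpoints:
    - port: metrics                # assumed name of the metrics container port
      path: /metrics
      interval: 30s
```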