kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

How does cache or persist work with Spark Operator? #2026

Open masalinas opened 6 months ago

masalinas commented 6 months ago


I'm using the Spark Operator on minikube with MinIO to run distributed SQL queries over 2.4 GB CSV files, each with 8883 rows and 20000 columns, and retrieving 8883 samples with only two columns.

I would like to cache or persist these queries. My question is: if the runners and driver are deleted after they finish, how can I cache or persist these queries from the operator?
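For context, Spark's `cache()` and `persist()` keep data only for the lifetime of the running application, so anything cached is lost once the driver and executor pods are gone. One possible approach, sketched below under assumptions, is to write the small two-column result to external storage (here MinIO through the `s3a://` connector) so a later run can read it back. The bucket name, the `query-results/` layout, and the helper names `result_uri` and `run_and_persist` are hypothetical, and the sketch assumes the S3A endpoint and credentials for MinIO are already configured for the application.

```python
from typing import List
from urllib.parse import quote


def result_uri(bucket: str, query_name: str) -> str:
    """Build the s3a:// URI for one query's stored result (hypothetical layout)."""
    return f"s3a://{bucket}/query-results/{quote(query_name, safe='')}"


def run_and_persist(input_uri: str, output_uri: str, columns: List[str]) -> None:
    """Read a CSV, keep only the requested columns, and store them as Parquet."""
    # Imported lazily so the path helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("persist-query-result").getOrCreate()

    # Assumption: the CSV has a header row naming its columns.
    df = spark.read.csv(input_uri, header=True)
    result = df.select(*columns)

    # cache()/persist() would only help while this application is running;
    # writing to object storage is what survives pod deletion.
    result.write.mode("overwrite").parquet(output_uri)

    spark.stop()
```

A later application could then load the stored result with `spark.read.parquet(result_uri(...))` instead of scanning the wide CSV again.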

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days. Thank you for your contributions.