kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Access to the StreamingQuery status through the ingress #1772

Closed nickpodobiedov closed 1 week ago

nickpodobiedov commented 1 year ago

We use the Spark operator and have a problem accessing the detailed statistics for individual streaming queries.

  1. Operator configuration: '-ingress-url-format=http:///spark/{{$appName}}'
  2. SparkApplication configuration:
    spark.ui.proxyBase: /spark/pyspark-realtime
    spark.ui.proxyRedirectUri: /spark/pyspark-realtime
    spark.ui.reverseProxyUrl: /spark/pyspark-realtime

I can access all of the application data from the UI, for example http:///spark/pyspark-realtime/jobs/. I can also open the Structured Streaming page; its URL is https:///spark/pyspark-realtime/StreamingQuery/.

But when I try to view the details of an individual stream, I am redirected to https:///StreamingQuery/statistics/?id= without the required prefix. I think the spark.ui.proxyRedirectUri parameter is responsible for this, but I cannot override it: even when I set it explicitly in my configuration, the driver ConfigMap contains spark.ui.proxyRedirectUri=/. Thank you for your help!
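To make the symptom concrete, here is a small sketch of how a root-relative link loses the ingress prefix while a page-relative one keeps it. This is only an illustration of URL resolution, not a confirmed diagnosis of how the Spark UI builds this redirect. The host `spark.example.com` and the run id `example-run-id` are placeholders, since the real values are not shown in this issue.

```python
from urllib.parse import urljoin

# Placeholder host: the real host is elided in the issue text.
PAGE = "https://spark.example.com/spark/pyspark-realtime/StreamingQuery/"


def resolve(link: str, page: str = PAGE) -> str:
    """Resolve a link against the current page, as a browser would."""
    return urljoin(page, link)


# A link starting with "/" is resolved against the host root,
# so the /spark/pyspark-realtime prefix is dropped.
root_relative = resolve("/StreamingQuery/statistics/?id=example-run-id")

# A link without a leading "/" is resolved against the current page,
# so the prefix is preserved.
page_relative = resolve("statistics/?id=example-run-id")

print(root_relative)
print(page_relative)
```

The first URL has the same shape as the broken redirect described above; the second is what the ingress would need in order to route the request back to the driver.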

puneetloya commented 1 year ago

I was just looking into it. This is the piece of code that does this: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/5f2efd4ff97e7c0bfdb726a066118d3401576730/pkg/controller/sparkapplication/controller.go#L699-L705

That is the reason your settings are being overwritten. If you want, you can disable this by not setting --ingress-url-format, but then you may have to create the Ingress yourself.
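The effect described in this thread can be sketched roughly as follows. This is a simplified Python illustration of the observed behavior (user-supplied values replaced, `spark.ui.proxyRedirectUri` ending up as `/`), not a transcription of the Go code at the link above, which may differ in detail. The function name `apply_ingress_overrides` and the host `spark.example.com` are hypothetical.

```python
from urllib.parse import urlparse


def apply_ingress_overrides(ingress_url_format: str, app_name: str,
                            spark_conf: dict) -> dict:
    """Return a copy of spark_conf with the UI settings an operator
    might force when an ingress URL format with a sub-path is set.

    Hypothetical helper: it only mirrors the behavior reported in
    this issue, not the operator's real implementation.
    """
    conf = dict(spark_conf)
    # Substitute the app name into the URL template.
    url = ingress_url_format.replace("{{$appName}}", app_name)
    path = urlparse(url).path
    if path:
        # User-provided values for these keys are overwritten.
        conf["spark.ui.proxyBase"] = path
        conf["spark.ui.proxyRedirectUri"] = "/"
    return conf


user_conf = {
    "spark.ui.proxyBase": "/spark/pyspark-realtime",
    "spark.ui.proxyRedirectUri": "/spark/pyspark-realtime",
    "spark.ui.reverseProxyUrl": "/spark/pyspark-realtime",
}

# Placeholder host: the real one is elided in the issue.
with_flag = apply_ingress_overrides(
    "http://spark.example.com/spark/{{$appName}}", "pyspark-realtime", user_conf)
without_flag = apply_ingress_overrides("", "pyspark-realtime", user_conf)

print(with_flag["spark.ui.proxyRedirectUri"])
print(without_flag["spark.ui.proxyRedirectUri"])
```

With the flag set, the user's `spark.ui.proxyRedirectUri` is replaced by `/`; with an empty format, the configuration passes through unchanged, which matches the workaround of leaving the flag unset.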

nickpodobiedov commented 1 year ago

Yes, I saw this too. But maybe it is possible to fix this in the code. And yes, it is possible to create the Ingress with the same Helm chart that is used to create the SparkApplication.
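For anyone taking the "create the Ingress yourself" route, a minimal sketch of such a manifest, generated from Python, might look like this. Several things here are assumptions: the UI service name `<app-name>-ui-svc`, the namespace `default`, and the host `spark.example.com` are placeholders to check against your cluster, and any path-rewrite annotations (which are ingress-controller specific) are left out.

```python
import json


def build_ui_ingress(app_name: str, namespace: str, host: str,
                     path_prefix: str, service_port: int = 4040) -> dict:
    """Build a networking.k8s.io/v1 Ingress for a Spark driver UI.

    Assumption: the UI service is named "<app_name>-ui-svc".
    Verify the actual service name with `kubectl get svc`.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{app_name}-ui-ingress", "namespace": namespace},
        "spec": {
            "rules": [{
                "host": host,
                "http": {
                    "paths": [{
                        "path": path_prefix,
                        "pathType": "Prefix",
                        "backend": {
                            "service": {
                                "name": f"{app_name}-ui-svc",  # assumed name
                                "port": {"number": service_port},
                            }
                        },
                    }]
                },
            }]
        },
    }


manifest = build_ui_ingress(
    "pyspark-realtime", "default", "spark.example.com",
    "/spark/pyspark-realtime")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, or the same structure can be expressed as a template in the Helm chart that creates the SparkApplication.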

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 week ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.