Open Jimmy-Newtron opened 4 years ago
Please sign the Google CLA for your PR. I will take a look after you sign it. Thanks!
CLA signed
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
As a DevOps engineer installing this operator, I have set up Prometheus metrics collection.
While looking into the metrics, I discovered that a counter becomes available only once a SparkApplication has entered the corresponding status (RUNNING, FAILED, SUBMITTED, ...).
I wanted to compute the failure ratio as FAILURES / (SUCCESS + FAILURES). Unfortunately, this ratio evaluates to NaN because the FAILURES metric is missing (I expected it to be initialized to 0).
As per the Prometheus client documentation (https://github.com/prometheus/client_golang/blob/master/prometheus/counter.go#L194), you can initialize counters to 0 by invoking GetMetricWithLabelValues(labels) for each label combination you expect to export.
I have created a PR that you can look at for inspiration: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/903/files
Another thing I noticed is that the metric sparkAppFailedSubmissionCount is not registered and is therefore not available in the exposed metrics.
I hope this ticket is clear enough. Have a nice day.
Cheers, Jimmy