LucaCanali / sparkMeasure

This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
Apache License 2.0

Added connection & read timeout configuration for pushgateway sink #61

Closed: Arnovsky closed this 1 month ago

Arnovsky commented 2 months ago

Hey, this PR adds the ability to configure the connection and read timeouts for the PushGateway HTTP connection. We have observed cases where large Spark jobs (hundreds to thousands of stages) lag because of the generous 5000 ms default timeout. Our aim is to make this behaviour configurable.
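As a rough illustration of the idea (not the code in this PR), the sketch below reads two timeout values from a configuration map and applies them to a JDK `HttpURLConnection`. The class name `PushGatewayTimeouts`, the `example.pushgateway.*` keys, and the URL are hypothetical and chosen only for this example. The 5000 ms fallback mirrors the default mentioned above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Map;

// Hypothetical helper: holds the two timeouts and applies them to a connection.
class PushGatewayTimeouts {
    // Fallback matching the 5000 ms default mentioned in the PR description.
    public static final int DEFAULT_TIMEOUT_MS = 5000;

    // Hypothetical configuration keys, used for this sketch only.
    public static final String CONNECT_KEY = "example.pushgateway.connectTimeoutMs";
    public static final String READ_KEY = "example.pushgateway.readTimeoutMs";

    public final int connectTimeoutMs;
    public final int readTimeoutMs;

    public PushGatewayTimeouts(Map<String, String> conf) {
        this.connectTimeoutMs = parse(conf.get(CONNECT_KEY));
        this.readTimeoutMs = parse(conf.get(READ_KEY));
    }

    // Use the default when the key is absent; otherwise parse milliseconds.
    private static int parse(String value) {
        return value == null ? DEFAULT_TIMEOUT_MS : Integer.parseInt(value.trim());
    }

    // Creates the connection object and sets both timeouts on it.
    // No network traffic occurs until connect() or a stream is requested.
    public HttpURLConnection open(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // max wait to establish the TCP connection
        conn.setReadTimeout(readTimeoutMs);       // max wait for data once connected
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Shorten only the connect timeout; the read timeout keeps the default.
        PushGatewayTimeouts timeouts =
            new PushGatewayTimeouts(Map.of(CONNECT_KEY, "500"));
        HttpURLConnection conn =
            timeouts.open("http://localhost:9091/metrics/job/demo");
        System.out.println("connect=" + conn.getConnectTimeout()
            + " ms, read=" + conn.getReadTimeout() + " ms");
    }
}
```

Keeping the two values separate matters because they guard different phases: a short connect timeout makes an unreachable gateway fail fast, while the read timeout bounds how long a slow response can hold up the caller.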

I also fixed some formatting inconsistencies. Please let me know if this is an issue and you'd like me to revert.

Arnovsky commented 2 months ago

There are other issues related to the pushgateway implementation: no keep-alive on the connection, no connection pooling, and no batching. We can fix these as well, but not in the context of this PR.

Arnovsky commented 1 month ago

@LucaCanali Is it possible to get a review? Thanks!

LucaCanali commented 1 month ago

Thank you @Arnovsky for your contribution.