swift-server / swift-prometheus

Prometheus client library for Swift
https://swiftpackageindex.com/swift-server/swift-prometheus
Apache License 2.0

move labels equality check out of critical section in summary/histogram #61

Closed avolokhov closed 3 years ago

avolokhov commented 3 years ago

This PR moves the Labels equality check outside of the lock. The rationale is as follows.

An app's set of metrics is usually bounded and does not grow much once the app has been running for a while. The code already relies on this constraint implicitly: none of Prometheus.metrics, Histogram.subHistograms, Summary.subSummaries or Counter.metrics is bounded, so any of them will eventually cause an OOM if it grows indefinitely. From the domain standpoint, unique label names and values are discouraged and provide little insight into the app's behaviour.

Given this constraint, after the app has been operational for some time, all or most label key-value pairs are already stored in the corresponding Histogram.subHistograms / Summary.subSummaries. This makes fetching a subSummary/subHistogram from the map a target for optimisation: the getOrCreate flow becomes highly skewed towards the get part. The suggested change improves the performance of the get part of getOrCreate and increases the cost of the less frequently used create part. On my local machine this results in a more than 2x throughput boost for the following test setup:


    // Assumes the test file imports XCTest, NIO, Dispatch and Prometheus.
    func testConcurrent() throws {
        let prom = PrometheusClient()
        let histogram = prom.createHistogram(forType: Double.self,
                                             named: "my_histogram",
                                             helpText: "Histogram for testing",
                                             buckets: Buckets.exponential(start: 1, factor: 2, count: 63),
                                             labels: DimensionHistogramLabels.self)
        let elg = MultiThreadedEventLoopGroup(numberOfThreads: 8)
        // Start at 0 so that each wait() below blocks until one worker signals.
        let semaphore = DispatchSemaphore(value: 0)
        let time = DispatchTime.now()
        // Worker 1: observe under label value "1" for 10 seconds.
        elg.next().submit {
            while DispatchTime.now().uptimeNanoseconds - time.uptimeNanoseconds < 10_000_000_000 {
                let labels = DimensionHistogramLabels([("myValue", "1")])
                histogram.observe(1.0, labels)
            }
            semaphore.signal()
        }
        // Worker 2: observe under label value "2" for 10 seconds.
        elg.next().submit {
            while DispatchTime.now().uptimeNanoseconds - time.uptimeNanoseconds < 10_000_000_000 {
                let labels = DimensionHistogramLabels([("myValue", "2")])
                histogram.observe(1.0, labels)
            }
            semaphore.signal()
        }
        // Wait for both workers to finish before shutting down.
        semaphore.wait()
        semaphore.wait()
        try elg.syncShutdownGracefully()
        // The observation counts in the collected output give the throughput.
        print(histogram.collect())
        // Total elapsed time in nanoseconds.
        print(DispatchTime.now().uptimeNanoseconds - time.uptimeNanoseconds)
    }

Motivation and Context

This PR optimises metric emission, which results in higher throughput and a lower impact on the running system.

Description

The scope of locking is now reduced to obtaining or modifying the current value of PromHistogram.subHistograms / PromSummary.subSummaries. If the subHistogram/subSummary for the given labels has not yet been added to the corresponding PromHistogram / PromSummary, the lock is taken twice: once to get the current value, and once to re-check and store the new value.
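
The two-phase flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SubMetric` and `SubMetricRegistry` are hypothetical stand-ins for the sub-histogram/sub-summary types and their owner, and labels are modelled as a plain `[String: String]` dictionary.

```swift
import Foundation

// Hypothetical stand-in for a per-label child metric (not the library's type).
final class SubMetric {
    let labels: [String: String]
    init(labels: [String: String]) { self.labels = labels }
}

// Hypothetical owner of the children map, showing the get-or-create flow.
final class SubMetricRegistry {
    private let lock = NSLock()
    private var children: [[String: String]: SubMetric] = [:]

    func getOrCreate(labels: [String: String]) -> SubMetric {
        // Phase 1: hold the lock only long enough to copy the dictionary value.
        lock.lock()
        let snapshot = children
        lock.unlock()

        // Hashing and equality checks on the labels run without the lock held.
        if let existing = snapshot[labels] {
            return existing
        }

        // Phase 2 (rare): take the lock again and re-check, because another
        // thread may have inserted the same labels in the meantime.
        lock.lock()
        defer { lock.unlock() }
        if let existing = children[labels] {
            return existing
        }
        let created = SubMetric(labels: labels)
        children[labels] = created
        return created
    }
}

// Usage: repeated lookups with equal labels return the same child instance.
let registry = SubMetricRegistry()
let first = registry.getOrCreate(labels: ["myValue": "1"])
let second = registry.getOrCreate(labels: ["myValue": "1"])
print(first === second)
```

In this sketch the common path takes the lock once and only to read the map, while a miss takes it a second time to re-check and insert. The insert is also the more expensive path, since the snapshot taken in phase 1 still shares the dictionary's storage at that point.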