lchenn / py-grpc-prometheus

Python gRPC Prometheus
Apache License 2.0
48 stars 25 forks source link

"ValueError: Duplicated timeseries in CollectorRegistry" in latest version (v0.6.0) #25

Open Tiernan-Stapleton opened 3 years ago

Tiernan-Stapleton commented 3 years ago

The latest version v0.6.0 is raising the following error by prometheus_client/registry.py, when using the PromClientInterceptor()

"ValueError: Duplicated timeseries in CollectorRegistry: {'grpc_client_started_created', 'grpc_client_started', 'grpc_client_started_total'}"

I think I've narrowed down the issue to being a side effect of changes from https://github.com/lchenn/py-grpc-prometheus/pull/21, specifically https://github.com/lchenn/py-grpc-prometheus/pull/21/files#diff-04b7eb91d0278b63925e45233c6beffcf6d3f4acb166c847d08982268c8cf111L4 where the metrics are changed from being defined at the file level to being defined per interceptor instance via init_metrics(). This is causing duplication of metrics registration when the PromClientInterceptor is defined multiple times to intercept different GRPC calls.

I've included a simple test which reproduces the error:

import grpc
from py_grpc_prometheus.prometheus_client_interceptor import PromClientInterceptor

class TestMetric:
    def test__broken_metrics(self):
        a = grpc.intercept_channel(PromClientInterceptor())
        b = grpc.intercept_channel(PromClientInterceptor())

when run

$ pytest test_file.py 
================================================ test session starts ================================================
platform linux -- Python 3.8.6, pytest-6.2.2, py-1.9.0, pluggy-0.13.1
rootdir: /code
plugins: requests-mock-1.8.0, mock-3.5.1, grpc-0.8.0, cov-2.11.1, optional-tests-0.1.1, flake8-1.0.7
collected 1 item                                                                                                    

test_file.py F                                                                                                [100%]

===================================================== FAILURES ======================================================
__________________________________________ TestMetric.test__broken_metrics __________________________________________

self = <test_file.TestMetric object at 0x7f0cdfdf5040>

    def test__broken_metrics(self):
        a = grpc.intercept_channel(PromClientInterceptor())
>       b = grpc.intercept_channel(PromClientInterceptor())

test_file.py:7: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/code/venv/lib/python3.8/site-packages/py_grpc_prometheus/prometheus_client_interceptor.py:31: in __init__
    self._metrics = init_metrics(registry)
/code/venv/lib/python3.8/site-packages/py_grpc_prometheus/client_metrics.py:6: in init_metrics
    "grpc_client_started_counter": Counter(
/code/venv/lib/python3.8/site-packages/prometheus_client/metrics.py:121: in __init__
    registry.register(self)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <prometheus_client.registry.CollectorRegistry object at 0x7f0cdfe2ea30>
collector = prometheus_client.metrics.Counter(grpc_client_started)

    def register(self, collector):
        """Add a collector to the registry."""
        with self._lock:
            names = self._get_names(collector)
            duplicates = set(self._names_to_collectors).intersection(names)
            if duplicates:
>               raise ValueError(
                    'Duplicated timeseries in CollectorRegistry: {0}'.format(
                        duplicates))
E               ValueError: Duplicated timeseries in CollectorRegistry: {'grpc_client_started_total', 'grpc_client_started', 'grpc_client_started_created'}

/code/venv/lib/python3.8/site-packages/prometheus_client/registry.py:29: ValueError
============================================== short test summary info ==============================================
FAILED test_file.py::TestMetric::test__broken_metrics - ValueError: Duplicated timeseries in CollectorRegistry: {'...
================================================= 1 failed in 0.12s =================================================
lchenn commented 3 years ago

Thanks for reporting. What if you just create one PromClientInterceptor() and use that instance to intercept the channel?

Tiernan-Stapleton commented 3 years ago

Unfortunately that isn't possible in my use case. I'm using multiple gRPC connections to different services, each which require their own instance of the PromClientInterceptor(). Since v0.6, whenever I run my code, the second gRPC service connection will fail to initialise with the error above.

lchenn commented 3 years ago

What I am suggesting is make a single instance of "PromClientInterceptor", then you can used it in multiple clients.

.e.g.

PROM_CLIENT_INTERCEPTOR = PromClientInterceptor()) a = grpc.intercept_channel(PROM_CLIENT_INTERCEPTOR) b = grpc.intercept_channel(PROM_CLIENT_INTERCEPTOR)

Would that work?

Tiernan-Stapleton commented 3 years ago

That will work, however my issue with it as a solution is that that is a non-standard way to implement an interceptor with prometheus. Also, there is nothing stopping someone else coming across the same issue as I have. Moving the metrics definition back to the file level solves the problem, and allows both of these ways to define an interceptor to be usable.

I can open a PR making the required changes to fix this issue when I get the time over the next few days, if changing the metric definitions back to file level is OK with you?

elliottc commented 2 years ago

+1, I ran into this error as well.