Open markeijsermans opened 6 years ago
Thanks for opening this issue to start the dicussion! :slightly_smiling_face:
My feeling is that we would probably do more harm if we add it, as most people don't understand that summaries are meant for cases where you can't control latencies at all, and would just enable it without understanding what they're doing.
I understand your case and I think what you're doing makes sense in your situation, and you understand that the histogram is actually the appropriate use, but I think I'd prefer people do it as explicitly as you and really understand the implications that has.
I'd still like to hear @Bplotka's thoughts as well.
I have mixed feelings, I agree with both @brancz arguments and @markeijsermans motivation. What about doing.. both?
Instead, of adding explicit Method for EnableHandlingTimeSummary
because users may shoot themselves in the foot if they are not careful, maybe we should just accept custom serverReporters
and clientReporters
so advanced users like @markeijsermans can write down summary
metric if needed? That would allow for more flexibility maybe.
On the other hand, Prometheus has summaries
for some reason, so maybe we should actually trust people to understand the differences between this and histograms? Or do you @brancz think that users will use it extensively and blame Prometheus for not being efficient?
What do you think about adding just extension like custom reporters? That would unblock you @markeijsermans?
I think extensibility would be a good compromise.
My understanding of the hesitance to directly support summary metrics is that it is less flexible and performant. The quantiles are fixed and need to be computed for each observation. I agree with your reasoning. An extension point works for us.
@Bplotka if I understand you correctly, it would look something like this
ServerReporter
interface
type ServerReporter interface {
ReceivedMessage()
SentMessage()
Handled(code codes.Code)
}
WithCustomServerReporter(s ServerReporter)
customServerReporter
in the same fashion as the defaultYup, something like this.
pt., 27 lip 2018, 20:24 użytkownik Mark Eijsermans notifications@github.com napisał:
My understanding of the hesitance to directly support summary metrics is that it is less flexible and performant. The quantiles are fixed and need to be computed for each observation. I understand and agree with your reasoning. An extension point works for us.
@Bplotka https://github.com/Bplotka if I understand you correctly, it would look something like this
- create ServerReporter interface
type ServerReporter interface { ReceivedMessage() SentMessage() Handled(code codes.Code) }
- add an extension function WithCustomServerReporter(s ServerReporter)
- update interceptors to handle the customServerReporter in the same fashion as the default
- same for the client side
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/grpc-ecosystem/go-grpc-prometheus/issues/57#issuecomment-408501004, or mute the thread https://github.com/notifications/unsubscribe-auth/AGoNu5WuYOFDzvxPO9TEdY0GhnlgGA7Dks5uK1rrgaJpZM4VibNN .
I've POC'd adding summary metrics in a similar way as histograms. It's enabled via a similar method
EnableHandlingTimeSummary()
. Is there interest in me submitting a PR?Motivation: Our monitoring situation is in transition as we move to prometheus/grafana. We currently export metrics to a hosted solution via a prometheus remote adapter. In that hosted solution calculating a p99 from histogram buckets is not possible (normally you would use
histogram_quantile()
)