Open jrhee17 opened 1 year ago
Thanks @jrhee17 for creating this issue!
I agree that option 1 is the most reasonable option. Here are some of my suggestions on option (and please FFTI):
existingConnectionAcquisitionDurationNanos
is a good name. I'd also suggest renaming connectionAcquisitionDurationNanos
to newConnectionAcquisitionDurationNanos
. This way, the name is the most clear.ClientConnectionTimings
is probably considered as a public API. So, maybe consider this renaming as part of the next major release. But before that happens, it'd be very nice to update all Javadocs/comments about connectionAcquisitionDurationNanos
to clearly point out that this duration is only about acquiring/creating a NEW connection.I'd also suggest renaming connectionAcquisitionDurationNanos to newConnectionAcquisitionDurationNanos. This way, the name is the most clear.
Rather, I was thinking that existingConnectionAcquisitionDurationNanos
would be included in connectionAcquisitionDurationNanos
.
existingConnectionAcquisitionDurationNanos > 0
pendingAcquisitionDurationNanos > 0
socketConnectDurationNanos > 0
Essentially, connectionAcquisitionDurationNanos = existingConnectionAcquisitionDurationNanos + pendingAcquisitionDurationNanos + socketConnectDurationNanos
so semantically I was thinking the naming is fine as-is.
Note: @ikhoon also suggested that we change the naming of the metric collected in RequestMetricSupport
:
https://github.com/line/armeria/blob/f2116739018c15f230dd8f0e5a4dd5db1fefc888/core/src/main/java/com/linecorp/armeria/internal/common/metric/RequestMetricSupport.java#L289-L296
He suggested that we use tags to differentiate connection acquisition
e.g.
connection.acquisition.duration
acquisition_type
new
piggyback
(pending
might be more appropriate)reuse
or existing
To avoid breaking changes, we may just add another metric at the cost of extra metric exported. (and deprecate the old metric. Deprecation may be propagated via javadocs and documentation to be added)
We should also update documentation since the relationship between values in ConnectionTimings
and the metric names may not be clear.
If the default metrics are more accurately expressed/exposed by RequestMetricsSupport
, I think it will be possible to reduce the hassle of creating custom metrics by accessing ClientConnectionTimings
.
The new metric will provide more accurate information about connection acquisition.
Currently, client-side provides some timing information when a new connection is created. https://armeria.dev/docs/advanced-structured-logging#client-connection-timing-properties
This information is exposed with the naming
connectionAcquisitionDurationNanos
. However, these metrics are recorded only for cases where a new connection is created.It has been pointed out that the naming is confusing since this metric isn't recorded when an existing connection is used. We may do one of the following:
Add a new timing to
ClientConnectionTimings
existingConnectionAcquisitionDurationNanos
Reuse
pendingAcquisitionDurationNanos
Don't define a separate timing and just end the timings
I propose we go with option 1 might be simpest/least prone to error. I think we also want to expose this metric to
RequestMetricSupport
by default.Note: As pointed out by @ikhoon, doing this would also fix a bug where dns timing information wasn't collected if there is a healthy connection already available.
ref: https://line-armeria.slack.com/archives/C1NGPBUH2/p1684469502484689