rocktavious closed 4 months ago
It is seconds.
@mperham are you sure? We have Faktory emitting this data to Datadog and are seeing values of 5000 - 6000, which if they were milliseconds would be about 5 seconds. Additionally, I just added some logging to our golang worker and it seems to agree with the 5 seconds calculation. Maybe I don't understand the metric correctly, but my understanding of perform is that it's "job duration", because that's the time between FETCH and ACK.
Sorry, just looking for 100% clarification because we intend to set up an SLO with this, since we are performing jobs for customers of ours and have SLAs to uphold.
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Starting job '1463' runner=faktory
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Finished job '1462' took '5.496016395s' and had outcome 'success' outcome=
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Starting job '1464' runner=faktory
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Finished job '1463' took '4.81213557s' and had outcome 'success' outcome=
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Starting job '1465' runner=faktory
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Finished job '1464' took '5.543517213s' and had outcome 'success' outcome=
runner-faktory-785b46877f-pj26n runner-faktory 4:09PM INF Finished job '1465' took '5.508035371s' and had outcome 'success' outcome=
runner-faktory-785b46877f-pj26n runner-faktory 4:10PM INF Starting job '1466' runner=faktory
runner-faktory-785b46877f-pj26n runner-faktory 4:10PM INF Starting job '1467' runner=faktory
runner-faktory-785b46877f-pj26n runner-faktory 4:10PM INF Finished job '1466' took '6.81928801s' and had outcome 'success' outcome=
runner-faktory-785b46877f-pj26n runner-faktory 4:10PM INF Finished job '1467' took '15.665768976s' and had outcome 'success' outcome=
latency = float64(time.Since(tm)) / float64(time.Second)
c.Gauge(fmt.Sprintf("latency.%s", name), latency, nil, 1)
You asked about latency but are talking about perform. Latency is in seconds because it uses c.Gauge directly. perform uses c.Timing, which is in milliseconds.
And yeah, that's.... less than ideal.
Ahh sorry, yeah, I was talking about both metrics and the wires got crossed.
FWIW, I've updated the wiki to call this out. Up to you if you want to try and fix them so both are in seconds.
In reading through https://github.com/contribsys/faktory/wiki/Ent-Metrics#latency and also the job execution metrics, it's unclear what the unit of "time" is for the metrics. Is it milliseconds? Microseconds? Nanoseconds?
I tried finding this in the codebase but came up empty-handed. It's hard to figure out where the metrics are being produced, and searching for perform or latency didn't yield anything.