I think the metricEmissionHooks option is unnecessary:
import http from 'k6/http';

export let options = {
  metricEmissionHooks: {
    http: {
      successHook: 'myHttpSuccessHook',
    },
  },
};

export function myHttpSuccessHook(response) {
  return response.status >= 200 && response.status <= 204;
}

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
}
Instead, we can define a naming convention for the hook method, for example didSuccessRequest:
import http from 'k6/http';

export function didSuccessRequest(response) {
  return response.status >= 200 && response.status <= 204;
}

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
}
I'm not that fond of metricEmissionHooks either, but I'm equally skeptical of using naming conventions to define behavior.
I'd much rather see something along the lines of:
export let options = {
  hooks: {
    http: {
      after: 'myHttpSuccessHook', // "after" might be a bad name, but you get the gist.
    },
  },
};
(without having to quote the name, if that's even possible, but I have some vague memory of the interop layer requiring it to be a string?)
The issue with these "named exports" is that we really can't guard against errors in a good way. If the user were to write didSuccesRequest, it wouldn't work and the mistake would be quite hard to spot. If we put it in the options, at least we'd be able to set up types for it, gently pushing the user towards the right syntax. I've had issues myself more than once due to the current types being too permissive, not warning when you write threshold instead of thresholds, for instance.
I am definitely for it not being 1 more magically named one. I was against handleSummary just being magically named instead of configurable, too.
Also arguably it will be nice if you can configure different ones for different scenarios for example.
(without having to quote the name, if that's even possible, but I have some vague memory of the interop layer requiring it to be a string?)
The "options" ultimately need to end up as a JSON and be transported around as such (and to not be recalculated) which means that w/e you have in it needs to be able to get to JSON and back in a way that we can figure out what you meant. We can try to do "magical" stuff so when you wrie a name of a function we automatically figure it out(no idea how, and probably will be finicky), but then writing just the function definition there won't work, which is strange ... so, yeah - we need it to be a string.
Also arguably it will be nice if you can configure different ones for different scenarios for example
Do we really want to provide this?
If you allow the user to set a callback, they will likely have enough flexibility to change the logic based on the currently executing scenario. In that case, serializing a string name isn't compulsory at all.
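For instance (a sketch combining the hypothetical JS-callback form of setResponseCallback discussed later in this thread with the k6/execution API, which exposes the running scenario's name):

import http from 'k6/http';
import exec from 'k6/execution';

// One callback, different logic per scenario - no per-scenario serialization needed.
http.setResponseCallback(function (response) {
  if (exec.scenario.name === 'checkout') {
    return response.status >= 200 && response.status < 300;
  }
  return response.status < 400;
});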
1 more magically named one
What you call a "magically named", is just a documented API for me.
@simskij provided another alternative: something like http.default.setSuccessHook(callback). IMO, the dynamic execution of string-named functions looks unintuitive and error-prone. cc @legander
I'm quite fond of the tagging approach since it avoids adding additional metrics for each request, which are costly to process as it is. After https://github.com/loadimpact/k6/issues/1321 is implemented this might not be as important, but we should still try to minimize adding new metrics if we can avoid it.
Are we sure that statsd still doesn't support tags? It seems tag support was added last year in https://github.com/statsd/statsd/pull/697 and released in v0.9.0. If this was the only output that didn't support them, and if we're moving towards supporting OpenMetrics as a generic output (which does support tags, from what I've seen), then it might be worth it to consider the tag approach after all.
The performance penalty of executing the JS hook on each request is certainly a concern. Are we sure we need these checks to be dynamic? If status codes are the only metric for determining this, maybe we can avoid the hooks and allow configuring them with something like:
export let options = {
  http: {
    success: {
      status: ['>=200', '<400'],
    },
  },
};
Though I'm not sure if this would perform much better.
OTOH if we need the flexibility for custom failure states determined by headers or something else in the response, then I guess we need the hook based approach. I'm ambivalent about the syntax, and we can certainly refine it later, though I'd also like to avoid using naming conventions for it.
A few short comments to start:
- Rate will be strictly superior to Counter once we have https://github.com/loadimpact/k6/issues/1312, so my vote is for that.
- http_req_failures seems better than http_reqs_failure.
- The hook (e.g. http.default.setSuccessHook(callback)) seems to be the best proposal to me, though we need to evaluate if we can do it on a per-VU level, so we can have different hooks for different VUs/scenarios.

So, to step back from the details and speak in more general terms... This is a big feature that, even if we agree on everything (and I have plenty of objections :sweat_smile: ), is unlikely to land in k6 v0.31.0. But given that this feature completely depends on the hook/callback (https://github.com/loadimpact/k6/issues/1716), and that the hook is actually the more complicated part, I think we should focus solely on that as the MVP for v0.31.0. I am not sure we'll manage to do even that for v0.31.0, but if we reach a consensus, I think for sure it's the place we should focus our energies first.

And I also mean the "viable" part of "MVP" :sweat_smile: If the hook is a bit more generic, everything else here might be completely unnecessary... If we have a generic post-request hook, it can be used for the following three things:
- to not emit any metrics for some requests (i.e. to not pollute http_req_duration with failures)
- to add a failure: true|false tag to the metrics (one of the suggested approaches above)
- to emit a new http_req_failures metric for failed requests (the other suggested approach above)

So, it gives us the most flexibility. And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.
Example code:
import http from 'k6/http';
import { Rate } from 'k6/metrics';

let requestFailures = new Rate('http_req_failures');

export let options = {
  thresholds: {
    http_req_failures: ['rate < 0.1'],
  },
};

http.setResponseCallback(function (response, metricsObject) {
  let requestFailed = response.status >= 399 || response.status === 0;
  requestFailures.add(requestFailed);
  if (requestFailed) {
    metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
  }
});

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/403");
}
@na-- I share your views! The only thing I'd like to expand on is that I think it would make sense to be able to define both default and per-call callbacks. As in:
import http, { skipCallback } from 'k6/http';
import { Rate } from 'k6/metrics';

let requestFailures = new Rate('http_req_failures');

http.setDefaultResponseCallback(function (response, metricsObject) {
  let requestFailed = response.status >= 399 || response.status === 0;
  requestFailures.add(requestFailed);
  if (requestFailed) {
    metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
  }
});

export default function () {
  http.get("https://httpbin.test.k6.io/status/200"); // Uses the one defined above
  http.get("https://httpbin.test.k6.io/status/503"); // Uses the one defined above
  http.get("https://httpbin.test.k6.io/status/503", { responseCallback: skipCallback }); // Overrides the one above with a noop
  http.get("https://httpbin.test.k6.io/status/403", { responseCallback: (res) => { /* ... */ } }); // Overrides the one above with something else
}
While switch cases and ifs in the default callback sure are an option here, I'd argue that being able to do this per invocation makes sense, given that the URLs and other characteristics of the requests might be dynamic, while the invocations themselves will likely follow a pattern you can reason about.
I like the solution @na-- posted; it seems to provide the most bang for the buck (although it initially requires some setup on the user's end), and having a nice default (which was also brought up) would be nice. Perhaps the naming should indicate more clearly that the responseCallback is processed before metrics are emitted? Like preResponseCallback or responseInterceptor, I don't know. Could also be solved by docs, I guess 🤷‍♂️
:+1: for per-request callbacks. Not sure if having a special skipCallback thing in the http module is better than just null or function() {}, but those are implementation details we can discuss in https://github.com/loadimpact/k6/issues/1716
:+1: for a better name as well. My opinion here is :+1: for preResponseCallback, :-1: for responseInterceptor :smile: Another possible suggestion: postRequestCallback or postRequestHook, with a slight preference for the hook variant.
There are many good things with a generic callback that can manipulate all metrics, but there are bad things here as well:
- the implementation of a generic hook that can do anything makes it easy for the user to shoot themselves in the foot;
- the cloud can't rely on a well-known http_req_failures metric, because the user can define whatever name they want, and the cloud won't know the meaning of it.

I think we should have a generic hook as well, but I don't think it's required for this feature. We can have many hooks for many different purposes.
@sniku Isn't that specifically what is being mentioned here, or did I misinterpret it?

And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.
Agree with @sniku. IMO, one of the main advantages of this feature is being able to easily define a threshold on the error rate.
import http from 'k6/http';

export let options = {
  thresholds: {
    http_req_failures: ['rate < 0.1'],
    // or
    'http_reqs{failed:true}': ['rate < 0.1'],
  },
};

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/403");
}
Yes, what @legander said :smile: And, I think we've discussed this before, but as an API design principle, if you can choose between:
1. a configurable mechanism with a sensible default (e.g. "by default, emit http_req_failures and do not emit request duration for failed requests") that is good enough for the majority of people, but can be changed;
2. a hardcoded behavior (e.g. "we always emit http_req_failures, but we'll allow users to tweak which requests are considered failures").

In my mind, the first is always better, since it delivers the same benefits but covers corner cases and alternative use cases much, much better. A huge percentage of the open k6 issues and existing technical debt exists because we've chosen the second option far too often. The whole k6/http module is one huge example of it, in fact...

For example, the majority of people probably want http_req_failures, but some might prefer tagged metrics with failed: true when they are emitting them to an external output that is not our cloud. With my suggestion, both use cases would be covered, whereas with what you propose, only a single one would be, and we'd have a bunch of open issues we'll never solve and a bunch of PRs we'll never merge :wink:
@ppcano, please explain how my suggestion precludes easily defining a threshold on the error rate?
import http from 'k6/http';
import { Rate } from 'k6/metrics';

let requestFailures = new Rate('http_req_failures');

export let options = {
  thresholds: {
    http_req_failures: ['rate < 0.1'],
  },
};

http.setResponseCallback(function (response, metricsObject) {
  let requestFailed = response.status >= 399 || response.status === 0;
  requestFailures.add(requestFailed);
  if (requestFailed) {
    metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
  }
});

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/403");
}
@na-- I might have misunderstood your example, but it looks like the user needs to create a Rate metric, implement the hook, and track the values themselves.
It is a big difference in comparison with only:
export let options = {
  thresholds: {
    http_req_failures: ['rate < 0.1'],
  },
};
Also, it would be more difficult to set a threshold on the error rate with the Test Builder.
@ppcano, you didn't misunderstand my example, you probably just didn't read the text right above it... nor when @legander copy-pasted it:

And for k6 v0.32.0, when we have had a bit of time to carefully consider things, we can simply set the default such callback in Go to something that we think is reasonable. For example, "do not emit http_req_* metrics for status >= 400 and emit a http_req_failures Rate instead". Or anything else. And then encode that in the cloud, if we want to have smarter results.
So, when we set such a default (be it in k6 v0.32.0, or v0.31.0 if we manage it), unless the user has some specific requirements that would force them to define a custom post-request hook, setting a threshold on http_req_failures will look like this with my suggestion:
import http from 'k6/http';

export let options = {
  thresholds: {
    http_req_failures: ['rate < 0.1'],
  },
};

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/503");
  http.get("https://httpbin.test.k6.io/status/403");
}
This is the whole script - the default post-request hook takes care of updating http_req_failures...
Btw, FWIW, having a default post-request hook also makes the breaking change much smaller. We all agree that a breaking change is necessary - the vast majority of people probably wouldn't want to count failed requests in http_req_duration and would benefit from http_req_failures. But for anyone who doesn't, it would be easy to opt out and keep the current k6 behavior by just doing something like this:
import http from 'k6/http';
http.setDefaultPostRequestHook(null);
And I think another good API design principle is that, when you're making a breaking change, you should strive to at least provide an easy solution for people who don't like it and want to go back to how things were... In this specific case, I imagine some people who have adopted their own solution for tracking HTTP errors might make use of it while they transition to our new hook-based approach.
Btw, I'm starting to dislike postRequestHook as the name - it might be confused with a hook for POST requests... :sweat_smile: So, changing my vote, I am now in favor of preResponseCallback or preResponseHook as the name.
I was re-reading the issue and I see I didn't address two parts of @sniku's argument:

The implementation of a generic hook that can do anything makes it easy for the user to shoot themselves in the foot.
I don't accept the premise here; this can literally be an argument against almost any change - especially everything k6 does above and beyond tools like wrk or ab... :stuck_out_tongue_winking_eye: We can discuss whether a pre-response hook is something very unusual (I don't think it is), or whether it violates POLA, but as long as the API and defaults are reasonably good, I can't see how giving our users more power and flexibility is bad.
I think we should have a generic hook as well, but I don't think it's required for this feature. We can have many hooks for many different purposes.
I strongly disagree on both counts. Besides thinking that it's a better solution for the specific problem we are discussing here, as I've expressed above, generic callbacks would solve plenty of other issues real k6 users have had. For example, the forum post linked in https://github.com/loadimpact/k6/issues/1716. Or https://github.com/loadimpact/k6/issues/927 / https://github.com/loadimpact/k6/pull/1171, https://github.com/loadimpact/k6/issues/800 (custom callback on the first HTTP digest request that has an expected 400 response), https://github.com/loadimpact/k6/issues/884 (if we have http.setDefaultPostRequestHook(), we can easily also have http.getCurrentDefaultPreResponseHook() and a way to temporarily wrap the hook in another hook and then restore it), and https://github.com/loadimpact/k6/issues/1298; this and this issue from the forum can also be solved by a per-request callback. And I'm probably missing some...
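The wrap-and-restore idea could look something like this (a sketch built entirely on the hypothetical setDefaultPostRequestHook/getCurrentDefaultPreResponseHook names from the previous paragraph):

import http from 'k6/http';

// Temporarily wrap the current default hook, then restore it later.
const previousHook = http.getCurrentDefaultPreResponseHook();
http.setDefaultPostRequestHook(function (response, metricsObject) {
  if (response.status === 401) {
    return; // e.g. tolerate the expected 401 of a digest/NTLM handshake
  }
  previousHook(response, metricsObject); // fall through to the old behavior
});

// ... make the special requests here ...

http.setDefaultPostRequestHook(previousHook); // restore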
As to the "many hooks for many different purposes" - this will be quite difficult to do technically (please read through some of the objections in https://github.com/loadimpact/k6/issues/1716, specifically how it's already tricky when you have http.batch()
in a single-threaded JS runtime). But I also think it's bad UX...
When writing this issue I explored the generic hook idea as well, but I couldn't come up with an API that was intuitive and didn't require the user to define/re-define the built-in http_req_failures metric.
I would really like to see a concrete proposal for the generic hook to solve this specific problem before making up my mind.
Here are some comments on the proposal from @na--:
let requestFailures = new Rate('http_req_failures'); // redefinition of the built-in metric should not be allowed like this (we have an issue for this)

http.setResponseCallback(function (response, metricsObject) {
  let requestFailed = response.status >= 399 || response.status === 0;
  requestFailures.add(requestFailed); // user responsible for manipulating a built-in metric feels wrong, but acceptable. Tags for this metric are also missing (status, url, etc.)
  if (requestFailed) {
    metricsObject.doNotEmitMetrics(); // not convinced this is a good API :-) We do want to emit _some_ metrics.
  }
  // tagging of http_reqs is missing. I could not come up with a good suggestion for this.
});
Even if we do come up with a nice implementation for the default hook, the user will have to copy-paste our entire default implementation just to change the status ranges (let's say 403 is acceptable to the user). It's not possible to extend the default implementation; it has to be fully replaced by the user.
I like the http.setResponseCallback() API. That makes sense to me.

At the moment, I still think that a boolean httpResponseSuccess() hook is easier for the user to understand and change. I think having more hooks is better UX than having one hook to rule them all, but I'm willing to be convinced by a nice API proposal.
I couldn't come up with an API that was intuitive and didn't require the user to define/re-define the built-in http_req_failures metric.
I'll create an issue for a metric refactoring proposal, hopefully later today, for a prerequisite we need to implement https://github.com/loadimpact/k6/issues/1435, https://github.com/loadimpact/k6/issues/572, and https://github.com/loadimpact/k6/issues/1443 / https://github.com/loadimpact/k6/issues/961#issuecomment-557482834. In essence, it will make it so that it doesn't matter whether the user has let requestFailures = new Rate('http_req_failures') in their script, or whether we've defined it ourselves in the default preResponseCallback written in pure Go.
I would really like to see a concrete proposal for the generic hook to solve this specific problem before making up my mind.
Fair enough. I guess this is the next step for this issue, or maybe we should put this on hold and "go back" to https://github.com/loadimpact/k6/issues/1716 to discuss it.
Even if we do come up with a nice implementation for the default hook, the user will have to copy-paste our entire default implementation just to change the status ranges (let's say 403 is acceptable by the user). It's not possible to extend the default implementation, it needs to be fully replaced by the user.
I am not sure this is the case. It depends on how we design the API, of course, but I think it might be possible to make it reasonably composable, we'll see. And even if it's not possible, the default implementation I suggested above would be something like these whopping... 6 lines of code :smile:
let requestFailures = new Rate('http_req_failures');

http.setResponseCallback(function (response, metricsObject) {
  let requestFailed = response.status !== 403 && (response.status >= 399 || response.status === 0);
  requestFailures.add(requestFailed);
  if (requestFailed) {
    metricsObject.doNotEmitMetrics(); // not a final API, just illustrative :D
  }
});
Having to copy-paste them is not perfect, but it's reasonably compact and understandable and, most importantly, it lacks any magic at a distance... And we can easily write a jslib wrapper that just accepts the requestFailed condition as a lambda and deals with everything else, for people who want one-liners...
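Such a wrapper could look something like this (a sketch; setFailureCondition is a made-up helper name, and doNotEmitMetrics() is the same illustrative, non-final API as above):

import http from 'k6/http';
import { Rate } from 'k6/metrics';

const requestFailures = new Rate('http_req_failures');

// Hypothetical jslib helper: hides the boilerplate, the user only supplies a predicate.
function setFailureCondition(isFailed) {
  http.setResponseCallback(function (response, metricsObject) {
    const requestFailed = isFailed(response);
    requestFailures.add(requestFailed);
    if (requestFailed) {
      metricsObject.doNotEmitMetrics();
    }
  });
}

// The one-liner for users who consider 403 acceptable:
setFailureCondition((res) => res.status !== 403 && (res.status >= 399 || res.status === 0));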
Sorry, @sniku, I completely missed the code comments you added. To respond:
redefinition of the built-in-metric should not be allowed like this (we have an issue for this)
I already discussed some of the current problems with metrics in that internal discussion we have, but I will elaborate more in the new k6 issue I plan to write today or on Monday about the metric refactoring that's a prerequisite for https://github.com/loadimpact/k6/issues/1435, https://github.com/loadimpact/k6/issues/572, and https://github.com/loadimpact/k6/issues/1443 / https://github.com/loadimpact/k6/issues/961#issuecomment-557482834. In summary, "creating" a new metric with the same name as a built-in system metric (e.g. new Rate('http_req_failures')) will just return the same metric (i.e., internally, a Go pointer to it) if the type matches, and throw an exception if it doesn't. https://github.com/loadimpact/k6/issues/1435 is the issue you probably meant.
user responsible for manipulating a built-in metric feels wrong, but acceptable. Tags to this metric are also missing (status, url, etc)
Also related to the above: the distinction between built-in and custom metrics is quite artificial. And the API can be like this, if Rate.add() accepts tags:
requestFailures.add(requestFailed, metricsObject.getTags());
not convinced this is a good API :-) We do want to emit some metrics. Tagging of http_reqs is missing. I could not come up with a good suggestion for this.
I said it was illustrative, and I agree, it's not a good API. But what metrics exactly do you think we should emit? http_reqs?
In summary, "creating" a new metric with the same name as a built-in system metric (e.g. new Rate('http_req_failures')) will just return the same metric (i.e. internally, a Go pointer to it) if the type matches, and throw an exception if it doesn't. #1435 is the issue you probably meant.
If we must allow users to manipulate built-in metrics, wouldn't it be better to let them import the metrics rather than redefine them?
import { http_req_duration } from 'k6/metrics/built-in';

http_req_duration.add(1);
Also related to above, but the distinction between built-in and custom metric is quite artificial.
I have a different opinion. Built-in metrics have some guarantees:
- their values are always added with the required tags (status, url, etc.);
- their type, and therefore their contains parameter, is fixed.

k6 cloud depends on the first guarantee to work correctly.
I said it was illustrative, and I agree, it's not a good API. But what metrics exactly do you think we should emit? http_reqs?
I understand that it was a quick/initial mockup, but I really think we should get a reasonably final API proposal before deciding how to go forward. I sincerely doubt that the final implementation will be a "whopping... 6 lines of code". For example, it's not clear how metricsObject is structured:
1. Does it hold all built-in metrics for the request (http_reqs, http_req_duration, http_req_blocked, ...)?
2. Does k6 cloud get http_req_li_all inside of metricsObject, while k6 run gets unaggregated metrics?
3. Why can't http_req_failures live inside of metricsObject, but must be outside?
4. Can the user tag the http_reqs metric with a standard failed tag?
5. How does the user prevent http_req_duration and friends from being emitted, but allow http_reqs to be emitted?

The fact that we put the burden of all this on the user doesn't feel like good UX to me, but again, I would like to see a good API proposal before I agree/disagree :-)
I like the import of metrics, though that's not going to solve the issue of add()-ing metrics without all of the required tags. In that respect, import { http_req_duration } from 'k6/metrics/built-in' is exactly the same as new Trend('http_req_duration') returning the built-in metric, and the contains parameter won't be over-written in either case. To solve this issue, we probably need to expand https://github.com/loadimpact/k6/issues/1435 and validate that the required tags for the metrics are present.
I understand that it was a quick/initial mockup, but I really think we should get a reasonably final API proposal before deciding how to go forward.
Me too :smile: That was the whole reason for me pushing to define this task before we prioritize it... I guess now it's my responsibility to come up with this API, given that I dislike the original proposal and have suggested an alternative. I'll try to think about it some more and propose something over the next week, but here are some very preliminary answers to the questions you posed:
1. We likely won't have metricsObject as a second parameter; instead, a single parameter that's an object containing the response (which has the actual timings), something that controls what metrics will be emitted, as well as the system tags for the metrics.
2. For k6 cloud and k6 run, the callback will get called with exactly the same data; however, because of our hack with http_req_li_all for k6 run -o cloud, we have to figure out how to accommodate httpext.Trail and aggregation in a non-breaking way (:heart: the tech debt :smile:).
3. How could the http_req_failures value already exist when we're in the very callback that determines it? :smile:

Anyway, I'll let this percolate somewhat and hopefully get back to you sometime next week.
So I'm just thinking a bit about how to maintain a JavaScript feel to it, and my suggestion might be way off. But here goes.

As for the interface for manipulating the response event (or whatever you'd call it), I would go for something like the following:
const redirect = new Counter("redirect");

http.on("response", (ev) => {
  console.log("default metric: ", ev.metric.name); // whatever k6 thinks it should be

  if (ev.request.url === "https://test.k6.io") {
    // You go ahead and handle this for me!
    return;
  }

  const status = ev.response.status;
  if (status >= 400) {
    // Change ev.metric to http_reqs_failure
    ev.fail();
  } else if (status >= 300) {
    redirect.add(1);
    ev.preventDefault(); // tell k6 to ignore this ev
  } else {
    // Change ev.metric to http_reqs_success
    ev.succeed();
  }
});
The above just makes sense to me as a JavaScript developer. It matches how Node.js and browsers handle these kinds of events. For instance, in the browser, anchor.addEventListener("click", ev => ev.preventDefault()) tells the browser to ignore handling the click (the on/off naming is Node-like and much nicer than addEventListener).
We could also have specific events for success/failure, in case you know that you only want to modify failure into success or vice versa. Could improve performance a bit, by not being a catch-all.
// An evil API that has reversed error with success status
http.on("failure", (ev) => {
  ev.succeed();
});

http.on("success", (ev) => {
  ev.fail();
});
The list of what would be possible goes on:
http.on("request", ev => {
// I'm lazy and don't want to add this everywhere
ev.request.headers.add("X-Request-Id", uuid())
// Add tags for every request
ev.addTags({
})
})
I could go on and on. 😬
look at that, @allansson coming into the thread killing it with good suggestions! ;)
I agree with my proposal :). Thanks @na-- for bringing it up instead of me having to do it.
As a compromise, so that we don't have to have any JS callbacks for the first version (which likely will still not be in v0.31.0), I propose we have an additional built-in callback that is an extremely "special" object which can't be used for anything else (at least for now). But you can register it as the "callback" of http.setResponseCallback (or whatever we end up calling it), as well as per request. This callback will still be usable even after we start supporting JS ones, and will be more performant. I actually meant to propose we have some built-in ones like this so that we don't impact performance as much for common cases, but now it seems like it will be good for other things as well ;).
My proposal for the one that should be included is:

http.setResponseCallback(http.expectedStatuses([200, 299], [301, 302], 308, 418));
The arguments are either an array of 2 ints or a single int. If an int, it's just a status code we expect; if they are an array, they are the inclusive start and end of a range of codes we expect. This one callback is easy to implement, is likely going to cover 95% of all use cases ever (famous last words), does not stop us from having JS callbacks later, and means the performance for this simple case will likely be quite close to what it would be if we just hardcoded it.

I would say that this way we at least won't need to figure out right now what the arguments to a JS callback need to be, while still not leaving us with more technical debt - except having to support the built-in one, which I would've argued for in the first place anyway.
When we have JS callbacks, the code will just check whether it gets a built-in callback or not. Also, possibly the built-in ones will be callable with the same arguments as the JS ones and do what they usually do, so they could be used inside a JS callback - not really certain that will be all that useful, as, at least for performance, calling JS -> Go -> JS is probably going to be more work than the ifs that would be saved :shrug:.
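Combining the global registration with the per-request override mentioned above (a sketch of the proposal as discussed so far; the per-request responseCallback parameter is borrowed from @simskij's earlier example):

import http from 'k6/http';

// Global built-in callback: 2xx and the usual redirect codes are expected.
http.setResponseCallback(http.expectedStatuses([200, 299], [301, 302]));

export default function () {
  // Per-request override: this endpoint legitimately returns 418.
  http.get("https://httpbin.test.k6.io/status/418", {
    responseCallback: http.expectedStatuses(418),
  });
}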
:+1: for the different API, @allansson - we actually discussed internally that this will likely be useful before doing requests as well, especially in redirect chains. So it's likely that we will either need multiple callback registers or an 'on'-style one.
I am somewhat of the opinion that it will be better to do this with a "failed" tag than with a new metric plus omitting the failed requests' metrics, as with the latter you won't know the performance of your failing requests (or you won't be able to mark them as failed). I think it is fairly important to know that your failed requests also don't take forever; when figuring out why they are failing, it's likely helpful to know that they fail in 10 seconds instead of 1, or that the TLS handshake took 20 seconds.
I would also again like to bring everybody's attention to the fact that k6 supports not only HTTP, so when we start moving this to other protocols, and add new protocols, adding additional metrics or having HTTP-specific names is probably not going to be... great.
To actually address @sniku's concerns with this:

some outputs don't ingest tags. It would not be possible to use this functionality with statsd; some "special case" handling would be required for displaying.

This is also true for having more than one URL being hit by the script, as the URL is a tag again. So I am of the opinion that if tags aren't supported by an output, that output should probably not be recommended/supported.
http_reqs would be backwards incompatible unless we combine rate and counter into a new metric type.
http_reqs is still a Counter; it just now has a "failed" tag, and we have to have a way to print the number/percentage of failed/not-failed requests in the terminal, but that doesn't make it a Rate. k6 will just need to construct a Rate-like metric out of it when displaying.
if they are an array they are the inclusive start and end of a range of codes we expect.
I'm strongly against this. Conditional logic in argument usage is... weird and reaaaally error-prone. Especially if you don't label it. If you really want to be able to mix entries with min-max style ranges, I'd definitely vote for having a separate object you could pass for that, with min and max as properties, like:
http.setResponseCallback(
  http.expectedStatuses(
    { min: 200, max: 299 },
    308,
    /* ... */
  ));
If it takes an array as well, that array should be considered a list of single status codes.
Other than that, I think having an optimized, predefined status comparator makes perfect sense.
I like almost all the ideas @MStoykov brought up here :+1: :+1: :+1:
:heart: for being the voice of reason and cutting down the scope while not compromising on future extendability.
I'm also for the slight API modification proposed by @simskij
http.setResponseCallback(http.expectedStatuses([{ min: 200, max: 299 }, { min: 301, max: 302 }, 308, 418]));
It makes it less error-prone.
As far as I can tell, this proposal solves several of the raised issues.
I also very much like the callback API proposed by @allansson :heart: but I think that proposal should be copy-pasted into https://github.com/loadimpact/k6/issues/1716 rather than be a part of this issue.
@allansson, you've given a lot of good suggestions, though they are things we probably won't be able to incorporate until we've started working on the new version of the k6 HTTP API, whenever that is. And we'd probably need event loops for most of them (https://github.com/loadimpact/k6/issues/882)... :disappointed: And we should consider the performance impact of multiple different callback types, especially if they stack, since that might significantly affect something like http.batch(). In any case, it's a bit too early for such a discussion.
Also a heavy :+1: from me for @MStoykov's suggestion with @simskij's modifications :tada: So, it seems like we have a consensus on this part :tada: :confetti_ball: Now only this bit remains:
I am somewhat of the opinion that it will be better to do this with a "failed" tag than with a new metric plus omitting the failed requests' metrics, as with the latter you won't know the performance of your failing requests (or you won't be able to mark them as failed). I think it is fairly important to know that your failed requests also don't take forever; when figuring out why they are failing, it's likely helpful to know that they fail in 10 seconds instead of 1, or that the TLS handshake took 20 seconds.
@MStoykov, I partially agree with you here. 60s timeouts and 10ms 503 errors would both be considered failed, but they are completely different beasts. Hiding both behind a http_req_failures Rate metric would be terrible - actually worse than the current situation.
That said, if we use a failed tag on the normal HTTP metrics, we won't be able to specify thresholds like "failed requests should be less than 1%". And unless we implement https://github.com/loadimpact/k6/issues/1321 first (or define bogus thresholds to expose the sub-metrics, like these workarounds: https://github.com/loadimpact/k6-docs/issues/205), we won't be able to see this breakdown in the end-of-test summary... So, even if we have both failed:true/false as a metric tag on the old http_req_* metrics AND a new http_req_failures Rate metric, we'd still have a shitty and insufficient UX... :disappointed:
An internal discussion later, the consensus is that there will be both a tag and a metric:
- the tag will be passed rather than failed, since http_req_duration{failed:false} is a double negative when all you want is the duration of successful requests;
- there will also be a separate metric for the failure rate.

In the (far) future, when k6 has support for disabling metrics and for thresholds supporting something like http_reqs{passed:true}.count/http_reqs.count>0.99, we will probably disable http_req_failed in order to reduce the noisiness of the metrics k6 emits, while still having a way to set a threshold on the rate of failing requests.
As per a discussion with @sniku, we agreed that the default output will be changed to look like this:
data_received..........................: 68 MB 6.4 MB/s
data_sent..............................: 173 kB 16 kB/s
http_req_blocked{passed:true}..........: avg=19.64ms min=281ns med=1.19µs max=274.96ms p(90)=2.17µs p(95)=267.61ms
http_req_connecting{passed:true}.......: avg=4.46ms min=0s med=0s max=76.85ms p(90)=0s p(95)=43.37ms
http_req_duration{passed:true}.........: avg=416.69ms min=77.59ms med=357.13ms max=890.05ms p(90)=706.43ms p(95)=753.02ms
http_req_receiving{passed:true}........: avg=232.02ms min=8.41ms med=172.71ms max=693.12ms p(90)=546.14ms p(95)=571.89ms
http_req_sending{passed:true}..........: avg=98.18µs min=21.65µs med=78.07µs max=1.68ms p(90)=124.79µs p(95)=143.14µs
http_req_tls_handshaking{passed:true}..: avg=10.57ms min=0s med=0s max=162.9ms p(90)=0s p(95)=134.05ms
http_req_waiting{passed:true}..........: avg=184.56ms min=43.51ms med=170.59ms max=446.44ms p(90)=262.49ms p(95)=289.31ms
http_reqs..............................: 275 26.133352/s
{ passed:true }.......................: 175 16.133352/s
{ passed:false }......................: 100 10.000000/s
iteration_duration.....................: avg=2.27s min=1.54s med=2.31s max=2.94s p(90)=2.57s p(95)=2.68s
iterations.............................: 83 7.887521/s
vus....................................: 20 min=20 max=20
vus_max................................: 20 min=20 max=20
which is basically the expanded version of "Tag http_req_duration with failed:true|false tag and display filtered values" from the original comment.
:-1:, or rather, :-1: for k6 v0.31.0, :+1: for first implementing https://github.com/loadimpact/k6/issues/1832 and https://github.com/loadimpact/k6/issues/1321, on which this change depends, and showing a version of that as the default summary in future k6 versions (e.g. v0.32.0).
I think that:

I agree that this is a better experience. I am not certain that we shouldn't also have an easy way to show the time for the failing requests as well... but arguably this is just one more summaryHandle helper in jslib.
If we agree on 3. (and because summaryHandle can't tell that it needs http_req_blocked{passed:true}, for example), then the summary definitely needs a way to get all the http_req_*{passed:true} sub-metrics, and:
for _, name := range []string{
	"http_reqs{passed:true}",
	"http_req_duration{passed:true}",
	"http_req_blocked{passed:true}",
	"http_req_connecting{passed:true}",
	"http_req_tls_handshaking{passed:true}",
	"http_req_sending{passed:true}",
	"http_req_waiting{passed:true}",
	"http_req_receiving{passed:true}",
} {
	if _, ok := e.thresholds[name]; ok {
		continue
	}
	parent, sm := stats.NewSubmetric(name)
	e.submetrics[parent] = append(e.submetrics[parent], sm)
}
after https://github.com/loadimpact/k6/blob/11811171a3c416daf17f7f0ad182ca30aa9990bb/core/engine.go#L100-L108 seems to do the trick, and arguably it will be something similar with #1321.

Arguably it doesn't look that terrible, but I would argue it makes #1319 more relevant ;), so I am not for just adding the above code and leaving it at that.

To me, the biggest problem (if we agree on 3.) seems to be #1806, which either needs to be done now, or we will need to redo these changes when we get to it - although arguably the changes will not be all that complicated, so :man_shrugging:
Let's backtrack a little bit. I haven't looked at the code in https://github.com/loadimpact/k6/pull/1856 yet, but it seems like it fulfills everything we all previously agreed on (https://github.com/loadimpact/k6/issues/1828#issuecomment-772581756), right? If so, given that there is a very easy workaround to surface specific sub-metrics in the end-of-test summary (https://github.com/k6io/docs/issues/205), I think the PR covers way more than 80% of the goals for this issue for less than 20% of the effort - the main remaining gap being showing the http_req_duration{passed:true} values in the summary by default.

Again, this seems plenty good enough for such a contentious set of features, for now, i.e. for k6 v0.31.0... I am for continuing with things in this priority order: https://github.com/loadimpact/k6/issues/1832, followed by https://github.com/loadimpact/k6/issues/1806, followed by https://github.com/loadimpact/k6/issues/1321, followed by changing the summary. And this changing of the summary should include showing http_req_duration{passed:true}, but also hiding most of the other http_req_* metrics, except a new composite one that's the sum of http_req_connecting + http_req_tls_handshaking, as we've previously discussed.
If we decide that we're not going to release any of these things in k6 v0.31.0, but we start working on them now in preparation for k6 v0.32.0, I think there's plenty of time to get them done by the end of April...
I am mostly against hardcoding a bunch of cruft in the Engine just so we can get the remaining 5% of this feature that users can currently get with a small workaround (and which we plan to fix as part of https://github.com/loadimpact/k6/issues/1321). And even if I am outvoted on that, I am definitely and very firmly against adding a {passed:true} submetric by default for every http_req_* metric that exists... I don't see a point in showing submetrics for metrics we plan to soon hide by default... So, anything beyond the single http_req_duration{passed:true} would take a lot of convincing for someone to "sell" it to me.
Remember, besides the needless clutter in the end-of-test summary, until we have https://github.com/loadimpact/k6/issues/763, every Trend sub-metric costs us quite a lot of memory and CPU for long-running or metric-heavy tests. We are currently storing in memory all of the values of every Trend metric for the whole duration of the test, and sorting them every few seconds, just to be able to calculate the percentiles for the summary and threshold calculations... So, I am very firmly against basically doubling this built-in memory leak and CPU burden...
As @na-- says, the #1856 PR goes a long way towards fulfilling this feature request. Still, even if #1856 is merged, the second basic requirement of this feature is not fulfilled.
- show response time for successful requests. (without adding boilerplate code)
Not getting http_req_duration for successful requests without adding boilerplate code is difficult to accept. I understand that it's possible to define a dummy threshold to get the duration in the output, but that's not in line with providing a good user experience by default.
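For reference, the dummy-threshold workaround being referenced looks roughly like this (a sketch; the always-true max>=0 condition is arbitrary and exists only to surface the sub-metric):

export let options = {
  thresholds: {
    // Always passes - defined only so the sub-metric shows up in the summary.
    'http_req_duration{passed:true}': ['max>=0'],
  },
};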
Again, this seems plenty good enough for such a contentious set of features, for now, i.e. for k6 v0.31.0... I am for continuing with things in this priority order: #1832, followed by #1806, followed by #1321, followed by changing the summary.
If I'm reading this sentence correctly, you are proposing to skip showing http_req_duration for successful requests until all 3 of these issues are implemented?
So, I am very firmly against basically doubling this built-in memory leak and CPU burden...
I think we all firmly agree on this. We don't want to double CPU and memory burden.
While I'm personally not against the "clutter" in the output, I see your point. I agree that adding {passed:true} to all the minor http_req_* metrics is unnecessary, especially if it considerably increases the CPU/memory consumption.

I'm suggesting we display tagged values only for http_req_duration and possibly http_reqs. The latter is less important to me.
The proposed output would look similar to this:
data_received.............: 68 MB 6.4 MB/s
data_sent.................: 173 kB 16 kB/s
http_req_blocked..........: avg=19.64ms min=281ns med=1.19µs max=274.96ms p(90)=2.17µs p(95)=267.61ms
http_req_connecting.......: avg=4.46ms min=0s med=0s max=76.85ms p(90)=0s p(95)=43.37ms
http_req_duration.........: avg=416.69ms min=77.59ms med=357.13ms max=890.05ms p(90)=706.43ms p(95)=753.02ms
{passed:true}...........: avg=216.69ms min=77.59ms med=357.13ms max=890.05ms p(90)=706.43ms p(95)=753.02ms
http_req_receiving........: avg=232.02ms min=8.41ms med=172.71ms max=693.12ms p(90)=546.14ms p(95)=571.89ms
http_req_sending..........: avg=98.18µs min=21.65µs med=78.07µs max=1.68ms p(90)=124.79µs p(95)=143.14µs
http_req_tls_handshaking..: avg=10.57ms min=0s med=0s max=162.9ms p(90)=0s p(95)=134.05ms
http_req_waiting..........: avg=184.56ms min=43.51ms med=170.59ms max=446.44ms p(90)=262.49ms p(95)=289.31ms
http_reqs.................: 275 26.133352/s
{ passed:true }..........: 175 16.133352/s
{ passed:false }.........: 100 10.000000/s
http_reqs_failed..........: 33.33% ✓ 1 ✗ 2
iteration_duration........: avg=2.27s min=1.54s med=2.31s max=2.94s p(90)=2.57s p(95)=2.68s
iterations................: 83 7.887521/s
vus.......................: 20 min=20 max=20
vus_max...................: 20 min=20 max=20
If I'm reading this sentence correctly, you are proposing to skip showing http_req_duration for successful requests until all 3 of these issues are implemented?
Yes, that was my proposal. And I think it's realistic to target k6 v0.32.0 for all of that, if we put these 4 issues on the priority track and start working on them now.
The rationale is that if people have a threshold on http_req_duration{passed:true}, which is a good idea in general, they will see its stats in the summary in k6 v0.31.0. And if they don't have such a threshold, they can easily add a dummy one and see it, until we add it by default. Another reason for waiting until we have implemented the metric whitelisting is that, if we don't, users would see (and k6 would calculate) the stats for both http_req_duration and http_req_duration{passed:true}, and the first one is a bit useless.
All of that said, I am not very opposed to a temporary hack to show http_req_duration{passed:true}, if we don't touch any of the other metrics...
I'm suggesting we display tagged values only for http_req_duration and possibly http_reqs. The latter is less important to me.

Why http_reqs, though? You literally have that information right below in your example, via http_reqs_failed:
http_reqs_failed..........: 33.33% ✓ 1 ✗ 2
I think the end state of this effort, after #1832, #1806, #1321, and the summary changes have been implemented, should look like this:
data_received..........................: 68 MB 6.4 MB/s
data_sent..............................: 173 kB 16 kB/s
http_req_conn_duration{passed:true}....: avg=4.46ms min=0s med=0s max=76.85ms p(90)=0s p(95)=43.37ms count=900
http_req_conn_duration{passed:false}...: avg=9.12s min=0s med=0s max=60.00s p(90)=50s p(95)=55.55s count=100
http_req_duration{passed:true}.........: avg=416.69ms min=77.59ms med=357.13ms max=890.05ms p(90)=706.43ms p(95)=753.02ms count=900
http_reqs_failed.......................: 10.00% ✓ 900 ✗ 100
iteration_duration.....................: avg=2.27s min=1.54s med=2.31s max=2.94s p(90)=2.57s p(95)=2.68s count=2000
Where http_req_conn_duration is a new composite metric: the sum of http_req_connecting and http_req_tls_handshaking (and probably also DNS lookup, since it should be relatively easy to add that back in now). Also notice the count column, now shown by default for Trend metrics.
We had a meeting, and the consensus is that we'll go with a small hack for showing http_req_duration{passed:true} by default in k6 v0.31.0, but not mess with the other metrics.

We are still evaluating the actual names for the passed:true/false metric tag and the http_req_failed metric, so if anyone has better alternative suggestions, feel free to share.
Also, we should remember that HTTP digest and NTLM requests expect 401 responses, so those shouldn't be treated as failures.
Also, we should remember that HTTP digest and NTLM requests expect 401 responses, so those shouldn't be treated as failures.

Do you mean that the first 401 should always be treated as a success for digest/NTLM requests, effectively bypassing the callback?
I haven't looked at the PR code yet, so I'm not sure what you mean by "effectively bypassing the callback", but yes, the first HTTP request in a digest/NTLM auth scenario should expect a 401 response and not treat that as a failure.
This is out of scope for this feature request, but we will likely need to add a grpc_req_duration{ passed: true } sub-metric and a grpc_req_failure metric to be consistent. I don't think we should risk delaying this feature by extending the scope, but it's worth reflecting on how the failure metric will be implemented for other protocols.
Or we could stop adding extra metrics for information we already have encoded as tags of existing metrics, and fix the fundamental issues that prevent us from using them as intended? :roll_eyes: Adding an extra tag to metrics is mostly fine (whether that's passed or expected, and whether we can do it as easily with gRPC streaming, remains to be seen), but adding new metrics for every possible protocol to paper over some of the deficiencies of the current thresholds is unsustainable... So yes, I foresee us having another of these "fun" discussions in the future... :sob:
Some implementation notes and changes from https://github.com/loadimpact/k6/issues/1828#issuecomment-772581756:
- passed got renamed to expected_response
- http_req_duration{ expected_response:true } is the sub-metric shown by default
- http.setResponseCallback can take null, which stops the emitting of the metric or the tag, effectively returning to the old behavior pre v0.31.0
- digest requests always expect to get a 401 as their first response, before they authenticate
- ntlm requests are okay with the first request they make (if it redirects) being a 401, on top of what their responseCallback expects

Shouldn't we close this issue now? 90% of the original issue is implemented, and the rest is hidden behind so much noise that a fresh issue would be a better place for it...
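Putting those notes together, a script against the shipped API would look something like this (the particular expectedStatuses arguments are just an example):

import http from 'k6/http';

// 2xx/3xx plus 418 are expected; everything else is tagged
// expected_response:false and counted in the http_req_failed rate.
http.setResponseCallback(http.expectedStatuses({ min: 200, max: 399 }, 418));

export let options = {
  thresholds: {
    http_req_failed: ['rate<0.1'],
    'http_req_duration{expected_response:true}': ['p(95)<500'],
  },
};

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
}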
The NTLM handshake actually consists of two HTTP 401s before an HTTP 200 is received (see https://nav7neeet.medium.com/maintaining-session-in-ntlm-authentication-3e8c604d87e9).
I agree with @na-- and will take the liberty of closing this issue. If anyone wants to continue any part of the conversation, please create a new issue that is (preferably) narrowly focused on that specific part.
(Also locking it to prevent additional input from getting lost in the void and to encourage the opening of new issues.)
I'm opening a new issue instead of building on top of https://github.com/loadimpact/k6/issues/1311 to start the discussion with a clean slate. I believe this proposal addresses most, if not all, issues brought up in the original proposal.

Before discussing the implementation details and effort required to build this feature, let's discuss why we want this feature at all.

Basic requirements

This basic script must:
- make it possible to easily set a threshold on the error rate;
- show the response time for successful requests (without adding boilerplate code).
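A minimal sketch of what such a basic script could look like, assuming the http_reqs_failure metric proposed below exists:

import http from 'k6/http';

export let options = {
  thresholds: {
    // threshold on the rate of failed requests
    http_reqs_failure: ['rate<0.1'],
  },
};

export default function () {
  http.get("https://httpbin.test.k6.io/status/200");
  http.get("https://httpbin.test.k6.io/status/503");
}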
Discussion about the metric type for http_reqs_failure

There are two possible metric types for the new http_reqs_failure: Rate or Counter. Both types have their advantages, and it's not entirely clear which one is better for this use case.

Advantages of Rate:
- easy thresholds on the failure rate, e.g. http_reqs_failure: rate<0.1

Advantages of Counter:
- consistency with the existing http_reqs metric.

The end-of-test summary for this test should look similar to this:

Output when using Counter metric

Output when using Rate metric
Neither Rate nor Counter covers all possible use cases. I think Rate is preferable over Counter.

If we added count to the Rate metric, the output could possibly look similar to this:

Note, I'm not really advocating for this, just pointing out that neither Rate nor Counter covers all use cases.

Why do we have failures and successes as separate metrics?

The obvious critique of the above suggestion is to say that http_reqs_success is unnecessary because it's the opposite of http_reqs_failure. This is true, but some outputs don't allow defining logic, and therefore it's not possible to show http_reqs_success unless k6 itself produces it. Once the metric filtering feature is developed, I would suggest we exclude http_reqs_success by default.

http_req_duration and other http_req_* metrics

The core requirement of this feature is to be able to see http_req_duration for successful requests only. There are two possibilities here:
1. Don't emit http_req_duration (and the other http Trend metrics) for failures.
2. Tag http_req_duration with a failed:true|false tag and display filtered values.

Let's discuss both approaches.

Don't emit http Trend metrics for failed requests

In this approach, http_req_duration and the other http metrics won't include failed requests towards the metric's internal state. Users who want to track error timings can define custom metrics like this:

Tag http_req_duration with failed:true|false tag and display filtered values

With this approach, we would emit http_req_duration and friends as we used to, but we would tag values with failed:true|false. The default end-of-test summary would display only successful requests, like this:

The most problematic issue with this approach is that some outputs don't ingest tags and won't be able to display http_req_duration for successful requests only.

Examples of the http_req_duration and http_reqs_failure metrics k6 should produce with this approach:

Cloud support

There are additional considerations for k6 cloud support.

Performance insights

The "performance insights" feature and web app currently assume that successful requests have status 200-399. Both approaches listed above solve these problems, although in different ways.

In approach 1, we would only get timings for successful requests, and therefore we won't show timings for failed requests. We will still get tagged http_reqs_failure metrics and therefore will be able to show errors without timings. We would probably redesign this UI to separate failures from successes in a better way.

In approach 2, we would get a new standard tag called failed on all http_req_* metrics, including http_req_li_all. Timings would still be shown for errors (although probably not useful), but the background of the row would be determined by the failed tag.

(alternative) Why don't we skip the new metrics and purely rely on the failed tag?

It's possible to extend the existing http_reqs counter metric by tagging requests with failed and changing the metric type to Rate. If that's done, the following script would be possible:

Possible end-of-test summary:

Possible problems with this approach are:

I'm (currently) against this alternative.

Defining failure

To determine if a request has failed or succeeded, a JavaScript hook function is invoked after the request, but before the metrics emission. This proposal builds on https://github.com/loadimpact/k6/issues/1716.

per-request handling

Sometimes users need to handle special cases.

Alternative 1 - handle inside the hook

Alternative 2 - override the hook per request

What about the redirect chains?

This is up for discussion. Should the hook fire on every request in the chain, or only on the last one?

Concerns

- http_req_li_all and other http_req_* metrics.