Bessonov opened 3 years ago
You could already do most of this with a `Counter`, either a built-in one (like `http_reqs`) or a custom one. For example:
```javascript
import http from 'k6/http'

export let options = {
    scenarios: {
        test_counters: {
            executor: 'constant-arrival-rate',
            rate: 5,
            duration: '30s',
            preAllocatedVUs: 150,
        },
    },
    thresholds: {
        // Hack to surface these sub-metrics (https://github.com/k6io/docs/issues/205)
        'http_req_duration{my_tag:test1}': ['max>=0'],
        'http_req_duration{my_tag:test2}': ['max>=0'],
        'http_req_duration{my_tag:test3}': ['max>=0'],
        'http_reqs{my_tag:test1}': ['count>=0'],
        'http_reqs{my_tag:test2}': ['count>=0'],
        'http_reqs{my_tag:test3}': ['count>=0'],
    },
}

export default function () {
    http.get('https://httpbin.test.k6.io/delay/1', { tags: { my_tag: 'test1' } });
    http.get('https://httpbin.test.k6.io/delay/1', { tags: { my_tag: 'test1' } });
    http.get('https://httpbin.test.k6.io/delay/2', { tags: { my_tag: 'test2' } });
    http.get('https://httpbin.test.k6.io/delay/1', { tags: { my_tag: 'test3' } });
}
```
will produce something like this:
```
http_req_duration..............: avg=1.38s min=1.12s med=1.13s max=2.14s p(90)=2.12s p(95)=2.13s
  { expected_response:true }...: avg=1.38s min=1.12s med=1.13s max=2.14s p(90)=2.12s p(95)=2.13s
✓ { my_tag:test1 }.............: avg=1.13s min=1.12s med=1.12s max=1.54s p(90)=1.13s p(95)=1.13s
✓ { my_tag:test2 }.............: avg=2.12s min=2.12s med=2.12s max=2.14s p(90)=2.13s p(95)=2.13s
✓ { my_tag:test3 }.............: avg=1.13s min=1.12s med=1.12s max=1.37s p(90)=1.13s p(95)=1.15s
...
http_reqs......................: 600 16.768966/s
✓ { my_tag:test1 }.............: 300 8.384483/s
✓ { my_tag:test2 }.............: 150 4.192241/s
✓ { my_tag:test3 }.............: 150 4.192241/s
iteration_duration.............: avg=5.92s min=5.84s med=5.9s max=7.11s p(90)=5.97s p(95)=6.04s
iterations.....................: 150 4.192241/s
```
But notice that `test3` has the exact same rate as `test2`, despite it taking just a single second instead of two. This is because of the synchronous nature of the HTTP requests in a single iteration: `test2` is blocking `test3`. And that is when we're using an arrival-rate executor; with some looping executor things will be even worse...
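To sanity-check the numbers above (plain arithmetic, not k6 code): with `constant-arrival-rate` at `rate: 5` for `30s`, k6 starts 150 iterations, and each per-tag request count is just the number of requests per iteration carrying that tag, times the iteration count.

```javascript
// Expected per-tag counts for the script above: 150 iterations total,
// with test1 tagged twice per iteration and test2/test3 once each.
const iterations = 5 * 30; // rate * duration = 150
const reqs = {
    test1: 2 * iterations, // 300
    test2: 1 * iterations, // 150
    test3: 1 * iterations, // 150
};
console.log(iterations, reqs.test1, reqs.test2, reqs.test3); // 150 300 150 150
```

This is why `test3` and `test2` end up with identical counts and rates no matter how fast their individual requests are.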
So I'm not sure if this will give you a lot of valuable information :man_shrugging: And I have no idea how you'd even calculate something like `p(99.9)=3/s`, as you suggest - I'm not sure how we can calculate the percentile of a counter?
@na-- thank you very much for your quick response and explanation.
> But notice that `test3` has the exact same rate as `test2`, despite it taking just a single second instead of two.
Yeah, that is what I meant with "not meaningful data".
> This is because of the synchronous nature of the HTTP requests in a single iteration: `test2` is blocking `test3`.
That means that `http_reqs` is tied to the whole iteration and not to a sum of single requests. A little bit surprising...
> And that is when we're using an arrival-rate executor; with some looping executor things will be even worse.
... I'm not sure I got it. In fact, I think I don't understand how k6 works at all :open_mouth: But well, I'll try. Maybe I'm a little bit naive, but with "decoupling the start of new VU iterations from the iteration duration" it shouldn't work this way. I mean, even if `test2` blocks `test3`, the processing rate can differ between them. For example, if "1. open register form" can handle 20 rps on average and "2. register user" only 7 rps, then that should be visible if there are enough VUs. But yeah, I see that in a user flow it makes less sense. Probably it's only accurate if every `http.*` call is handled like a barrier, and only when all VUs pass the barrier do they all go on to the next one. That is the same as if `ab` were used 5 times in my example or 4 times in your example.
> I'm not sure how we can calculate the percentile of a counter?
In my view, rps isn't exactly a counter, but a 1-second window. Maybe again very naive, but I can think of two models. For every `http.*`:

In both models I see a bunch of numbers, which represent rps at different points in time. For example, if I got 10 rps numbers like 1, 2, 3, 4, 4, 4, 4, 5, 6, 7 from the first model, then `p(90)=2/s`, which can mean that 90% of the time the system was able to handle at least 2 rps. And `avg=4/s`, `min=1/s`, `max=7/s`.
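A sketch of that idea (this is only the model described here, not an existing k6 metric): given one request count per 1-second window, the rate sustained in at least q% of the windows can be read off the sorted counts.

```javascript
// Hypothetical "percentile of rps": given per-second request counts,
// p(q) is the rate the system sustained in at least q% of the seconds.
function sustainedRate(countsPerSecond, q) {
    const sorted = [...countsPerSecond].sort((a, b) => a - b);
    // at least q% of the samples are >= the value at this index
    const idx = Math.floor(((100 - q) / 100) * sorted.length);
    return sorted[Math.min(idx, sorted.length - 1)];
}

const windows = [1, 2, 3, 4, 4, 4, 4, 5, 6, 7]; // the example above
console.log(sustainedRate(windows, 90)); // 2, i.e. p(90)=2/s
```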
What do you think about it? Does it make sense at all, or am I on the wrong path? Maybe I'm damaged by `ab` :D
I'll start backwards :sweat_smile:
> In my view, rps isn't exactly a counter, but a 1-second window.
This is the crux. k6 itself doesn't operate in 1-second time windows. The "time window" used in the end-of-test summary when k6 calculates the requests per second is... the whole test run duration :sweat_smile: So, when you see `<count>/s` in the summary, those are calculated as `<total (e.g. request) count for the whole test run> / <duration of the whole test run>`.
When users need something else (e.g. bucketing request counts in 1-second time windows or charting things over time), k6 allows you to export the raw metrics it measured to an external output like InfluxDB or a JSON/CSV file. You'd get the timestamped measurements for every HTTP request, iteration, custom metric, etc. in your test and you can process them however you like.
I've briefly touched on time windows for metrics internally in k6 when it comes to setting thresholds in https://github.com/k6io/k6/issues/1136, but I can see how it also makes some sense to support them for the end-of-test summary. Mind you, there would still be some gotchas if you want to measure requests per second in 1s intervals, but I can see why it might be useful in some cases.
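For illustration, here is a hypothetical post-processing sketch over exported samples (the field names are assumptions for the example, not the exact schema of any k6 output): bucketing timestamped requests into 1-second windows, versus the single whole-run rate the summary prints.

```javascript
// Each sample represents one HTTP request with a timestamp in milliseconds.
function rpsBuckets(samples) {
    const buckets = new Map();
    for (const s of samples) {
        const sec = Math.floor(s.timeMs / 1000); // 1-second window key
        buckets.set(sec, (buckets.get(sec) || 0) + 1);
    }
    return buckets;
}

// What the end-of-test summary does instead: one number for the whole run.
function wholeRunRate(samples, testDurationSec) {
    return samples.length / testDurationSec;
}

const samples = [{ timeMs: 100 }, { timeMs: 700 }, { timeMs: 1200 }, { timeMs: 2900 }];
console.log(rpsBuckets(samples).get(0)); // 2 requests in the first second
console.log(wholeRunRate(samples, 4));   // 1 request/s averaged over the whole run
```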
> That means that `http_reqs` is tied to the whole iteration and not to a sum of single requests. A little bit surprising... Probably it's only accurate if every `http.*` is handled like a barrier and only if all VUs pass the barrier, all VUs go to the next barrier. That is the same as if `ab` were used 5 times in my example or 4 times in your example.
Neither of these statements is true :slightly_smiling_face: Let me backtrack a little bit... The 3 important concepts in k6 you need to know are VUs (virtual users), iterations and executors: `constant-vus` will initialize a certain number of `vus`, and then every VU will loop over the `exec` function over and over until the specified `duration` has passed - as soon as one iteration in a VU is done, the executor will start a new iteration in the same VU. On the other hand, the `constant-arrival-rate` executor will try to start new iterations at the specified `rate` per second (or, rather, per `timeUnit`), and if the iterations finish quickly, some of the initialized VUs might sit there idle, basically workers without work... Additionally, because k6 doesn't have per-VU event loops yet (https://github.com/k6io/k6/issues/882), most of the JS code in an iteration is executed synchronously. For example, if you have:
```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
    http.get("https://test-api.k6.io/");
    http.get("https://test.k6.io/");
    sleep(0.1);
}
```
Every VU executing this iteration will first make a request to `https://test-api.k6.io/`, fully wait for it to finish and read its response, and make a request to `https://test.k6.io/` after that, wait for that to finish, sleep for 100ms, and only then finish the iteration. All of the VUs will run this code in parallel to each other, according to the executor that is configured, but within a single iteration running in a single VU, things happen mostly synchronously.
The only exceptions to that currently are the `http.batch()` function (which allows a single VU to make multiple concurrent requests) and websockets support in `k6/ws` (which has its own local event loop).
So, if you have a long HTTP request followed by a short HTTP request, the RPS of the second can't really be higher than the rate per second of the first. It is basically coordinated omission, and the only way to avoid it is to have the requests be executed completely independently from each other, with an arrival-rate executor and in different scenarios. Even `http.batch()` wouldn't be enough - while the two requests will be executed independently inside of the `http.batch()` call, the call itself won't finish (and k6 won't continue with the execution) until all of the requests in the `http.batch()` call are finished...
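A toy model of that capping effect (simulated durations, not real k6 measurements): a VU looping over a 2s request followed by a 0.1s request can never push the fast endpoint above the slow endpoint's rate, because both get exactly one request per iteration.

```javascript
// One request to each endpoint per iteration, executed back to back,
// so both endpoints see exactly iterations/duration requests per second.
function achievedRates(slowSec, fastSec, testDurationSec) {
    const iterationSec = slowSec + fastSec;
    const iterations = Math.floor(testDurationSec / iterationSec);
    const rps = iterations / testDurationSec;
    return { slowRps: rps, fastRps: rps }; // identical, even though fastSec << slowSec
}

const r = achievedRates(2, 0.1, 60);
console.log(r.fastRps === r.slowRps); // true: the fast request is capped by the slow one
```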
> Every VU executing this iteration will first make a request to https://test-api.k6.io/, fully wait for it to finish and read its response, and make a request to https://test.k6.io/ after that, wait for that to finish, sleep for 100ms, and only then finish the iteration.
That matches exactly my expectations!
> k6 itself doesn't operate in 1-second time windows. The 3 important concepts in k6 you need to know are VUs (virtual users), iterations and executors.
Thanks, great, then I understood most of the things right.
> So, if you have a long HTTP request followed by a short HTTP request, the RPS of the second can't really be higher than the rate per second of the first.
I see the point. The second request is "bottlenecked" by the first request. Let us ignore k6 internals for a moment and make some assumptions:
The use case I see is to have a user flow (= a bunch of synchronous requests). While the requests fired, I want to measure not only latencies, but also rps of every involved endpoint. In fact, I can do that with k6 already now! For example, to do that I:
1. `login.js`. Every VU reads this file, submits the login form, gets a token/cookie and stores it in a session file. At this point I get rps for the login form.
2. `update-profile.js`. Every VU reads the session file and sets the user's name. Now I get rps for the update-profile endpoint.

If we assume that we can synchronize the above steps among VUs (= "barrier"), then we can get it in one test file. Of course there are some other bits, like "does an iteration belong to the steps or to the whole script", and gotchas.
Now all assumptions off. It seems like this is not possible in k6 yet, because all VUs/Workers are independent. Do you think it makes sense for k6 to support this flow? If not, well, feel free to close this FR.
Thank you, @na-- !
As you've seen, k6 doesn't have file-writing capabilities outside of `handleSummary()`, so you'd have to use that xk6 extension to save the file between scripts, but otherwise your plan seems fine and should get you the results you want. With the caveat that the req/s will be calculated based on the whole script duration, so if you have some stragglers, they might skew the results heavily (which might or might not be what you want :man_shrugging:)
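To illustrate that straggler caveat with made-up numbers:

```javascript
// 100 logins finish within 10s, but one straggler keeps the script
// alive until t=30s; the summary divides by the whole script duration.
const totalReqs = 101;
const reportedRate = totalReqs / 30; // ~3.37/s in the summary
const intuitiveRate = 100 / 10;      // 10/s, what you'd expect without the straggler
console.log(reportedRate < intuitiveRate); // true: one slow VU drags the rate down
```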
Regarding having everything in a single file, that seems somewhat possible to do currently with multiple scenarios. Scenarios don't need to be concurrent - they can be sequential with the `startTime` property, and VUs are reused across scenarios when they don't overlap, so you can do something like this:
```javascript
import http from 'k6/http'

// TODO: actually use something like this:
// const users = JSON.parse(open('users.json'));
// and maybe a SharedArray, if the file is too big
const users = [
    { username: 'test1', password: 'pass1' },
    { username: 'test2', password: 'pass2' },
    { username: 'test3', password: 'pass3' },
    { username: 'test4', password: 'pass4' },
];

export let options = {
    scenarios: {
        login: {
            executor: 'per-vu-iterations',
            vus: users.length, // have exactly as many VUs as users
            iterations: 1,
            maxDuration: '15s',
            gracefulStop: '0s',
            exec: 'login',
        },
        update_profile: {
            executor: 'per-vu-iterations',
            vus: users.length, // have exactly as many VUs as users
            iterations: 1,
            startTime: '15s', // at least the (maxDuration+gracefulStop) of login
            maxDuration: '15s',
            gracefulStop: '0s',
            exec: 'updateProfile',
        },
    },
    thresholds: {
        // Hack to surface these sub-metrics (https://github.com/k6io/docs/issues/205)
        'http_req_duration{scenario:login}': ['max>=0'],
        'http_req_duration{scenario:update_profile}': ['max>=0'],
        'http_reqs{scenario:login}': ['count>=0'],
        'http_reqs{scenario:update_profile}': ['count>=0'],
    },
}

export function login() {
    let credentials = users[__VU - 1];
    console.log(`User ${__VU} logging in with ${JSON.stringify(credentials)}...`);
    // TODO: actually implement login logic, this is only for demonstration
    let randomDelay = Math.random().toFixed(2);
    http.get(`https://httpbin.test.k6.io/delay/${randomDelay}?username=${credentials.username}`);
    let bogusToken = `vu${__VU}_${randomDelay}`;
    credentials.token = bogusToken;
}

export function updateProfile() {
    let credentials = users[__VU - 1];
    console.log(`User ${__VU} updating their profile ${JSON.stringify(credentials)}...`);
    // TODO: actually implement update profile logic, this is only for demonstration
    let randomDelay = (3 * Math.random()).toFixed(2); // longer requests
    http.get(`https://httpbin.test.k6.io/delay/${randomDelay}?token=${credentials.token}`);
}
```
This has some other rough edges though:

- Scenarios are sequenced with a fixed `startTime`, and if the first scenario finishes too quickly, k6 will sit idle for a while until the time for the second scenario comes; this should be solved when something like `startAfter` from https://github.com/k6io/k6/issues/1342 is implemented.
- A `per-vu-iterations` executor with `vus` equal to the length of the data set might result in more simultaneous requests than you'd like... `shared-iterations` would be a better fit here, but then you can't really get unique data rows from `users` for every iteration with just the current `__VU` and `__ITER` execution context variables. We plan to fix that with https://github.com/k6io/k6/issues/1320, parts of which are already being implemented in https://github.com/k6io/k6/pull/1863/ and should be released in k6 v0.33.0 in about a month. Until then, if you want to reduce the `vus` but still have a lot of test data, there is a workaround where `per-vu-iterations` can work like `shared-iterations`, which I demonstrated in this forum topic.
- `http_reqs/s` is calculated based on the whole test run duration, not the 1-second intervals you want, so you'd still need an external output.
Feature Description
I like the output from `ab` and I have a hard time understanding the metrics from k6. Especially I miss the rps metric when I want to test a user flow. For example, my user flow consists of 5 steps. I managed to get more submetrics, which show latencies for every step. Great! If I look at "iterations [...] 29.437014/s", then my assumption is that the system can handle 29.4 registrations per second, meaning all five steps. Probably it would be the rps metric if there were only one step, right? If I use the same technique for `iterations` or `http_reqs`, then it doesn't print them or prints not-meaningful data. Latencies are great and give very useful insights. But from my point of view they are not enough to understand the capabilities of a system. For example, a latency of 200ms is great at 5000 rps, but bad at 5 rps.
Suggested Solution (optional)
Probably it would be nice to have the same behavior for `http_reqs` (or `iterations`?) as for `http_req_duration`. I looked at custom metrics, but didn't find a way to achieve it with them.