Distributed Vegeta

ivanilves commented 5 years ago

Hi guys,

First, great piece of software, thank you <3

Second, could you please tell me, is there any recommended way to run distributed Vegeta on a multi node cluster (EC2 ASG, K8s, Mesos or whatever to stress test at a scale) and “queue” attack targets somehow? Is it DIY only for now? ;)

Thank you!

tsenart commented 5 years ago

It's DIY for now indeed, although there are others who've solved the same problem before.

tsenart commented 5 years ago

I have ideas on how to make this easier. But no plans to implement anything for now (not enough time currently).

- Vegeta K8S Operator
- Vegeta Serverless (schedule concurrent attacks on AWS Lambdas, Google Cloud Functions, etc.)

ivanilves commented 5 years ago

Great. Thank you for these tips! 👍🏻👏

nitishm commented 5 years ago

How do you envision using operators? What would the CRD specify?

It would be nice to start with a simple master/worker model, deploying workers as a DaemonSet across nodes. Distributed attacks could use pdsh, as you mention in your README.md, or a synchronization mechanism such as 0mq, Redis pub/sub, etc.

tsenart commented 5 years ago

> What would the CRD specify?

I don't know. There are many possibilities, but it'd be nice to have a history of load tests available as well as the results of each.

MalloZup commented 5 years ago

Hi all, I'm starting the vegeta-operator project for k8s here: https://github.com/MalloZup/vegeta-operator.

Feel free to join the effort and help there. :santa: :mrs_claus: :sunflower:

Any contribution/help is welcome. :sunflower: I'm open to any ideas, interaction, and suggestions from the community.

At the moment I plan to learn the k8s operator API as a first step :smile: :robot: :rocket: so let's see :)

Roadmap proposal: feel free to write here and help Vegeta become a Super Saiyan :) (https://github.com/MalloZup/vegeta-operator/issues/2)

nitishm commented 5 years ago

My thought is to use a Kubernetes operator (acting as the master, driven by CRDs) together with a worker ReplicaSet for horizontal scalability (via multiple replica pods).

The operator will be responsible for executing tests by sending requests (gRPC/HTTP) to the workers and retrieving the results. It will also be responsible for storing the aggregated results, command history, etc., in a key-value DB (pick any one solution).

@tsenart WDYT? Is that what you envisioned as well?

ivanilves commented 5 years ago

@nitishm (just my 2 cents, nothing more):

Maybe using the Kubernetes Job resource would lead us to an easier design? https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#running-an-example-job

We can scale jobs as well, with parallelism: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#controlling-parallelism

To me, Vegeta tests are naturally more of a "job" kind of workload than a "worker/replicaset" one.

But, like you, I would like to know what @tsenart thinks about it :wink:
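As a rough illustration of the Job-based idea, the sketch below builds a `batch/v1` Job manifest whose `parallelism` controls how many attack pods run at once. The function name, the image name, and the target URL are hypothetical placeholders, and the image is assumed to contain both `sh` and the `vegeta` binary. Note that each pod attacks at the full `-rate`, so the aggregate rate is roughly `rate * parallelism`.

```python
import json


def vegeta_job(name, target_url, rate, duration, parallelism, image):
    """Build a batch/v1 Job manifest that runs `parallelism` identical attack pods.

    `image` is a placeholder: any image that ships `sh` and `vegeta` would do.
    """
    # Each pod pipes a single target into `vegeta attack` and prints a text report.
    attack = (
        f"echo 'GET {target_url}' | "
        f"vegeta attack -rate={rate} -duration={duration} | "
        "vegeta report"
    )
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "parallelism": parallelism,   # pods running concurrently
            "completions": parallelism,   # one successful run per pod
            "backoffLimit": 0,            # do not retry a failed attack pod
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "vegeta",
                            "image": image,
                            "command": ["sh", "-c", attack],
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    job = vegeta_job(
        name="vegeta-attack",
        target_url="http://example.com/",      # hypothetical target
        rate="50/1s",
        duration="30s",
        parallelism=3,
        image="example.com/vegeta:latest",     # hypothetical image
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. Nothing here synchronizes when the pods start.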

tsenart commented 5 years ago

> To me, Vegeta tests are naturally more of a "job" kind of workload than a "worker/replicaset" one.

Agreed.

nitishm commented 5 years ago

I agree with jobs as well. My thought process was driven by this article on using Locust on GKE: https://cloud.google.com/solutions/distributed-load-testing-using-kubernetes.

MalloZup commented 5 years ago

Hi all, since I did some experiments with the vegeta-controller/operator, here are my 2 cents:

I agree with @ivanilves about using k8s Jobs. Indeed, I planned to use them for scheduling.

I think this could be the first thing for a 0.1 or minimal version.

This is the API I was thinking of: https://github.com/MalloZup/vegeta-operator/blob/master/config/samples/vegeta_v1beta1_vegeta.yaml. (Feel free to make suggestions :octocat: )

Concerning logs, for the 0.1 version I think we can be experimental and rely on the k8s logs. They are not permanent, but it is a 0.1, right? I think we should focus more on the API/CRD than on the tooling around it.

As a workaround, a user could also add the needed logging via k8s with other operators, etc.

IMHO, later on we could think about storing logs, etc.

About communication, I think we don't need to invent much: we can rely on the Kubernetes Go client and schedule jobs with it. We can then also rely on the k8s load balancer, etc.

I think it is more of a vegeta-controller than an operator, meaning there is an open question: which resource should we watch and react to with callbacks? With this design we don't have any resource on which we would call reconcile, or am I missing something? :thinking: :smile:

tsenart commented 5 years ago

> I think it is more of a vegeta-controller than an operator, meaning there is an open question: which resource should we watch and react to with callbacks? With this design we don't have any resource on which we would call reconcile, or am I missing something? 🤔 😄

Wouldn't we create an attack object with certain parameters that the operator would have to satisfy?

MalloZup commented 5 years ago

@tsenart from your POV and experience, which parameters would those be? TIA

tsenart commented 5 years ago

> @tsenart from your POV and experience, which parameters would those be? TIA

All the flags that you can define in vegeta attack, essentially.
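To make "all the flags" concrete, here is a minimal sketch of how an attack object's parameters might be translated into a `vegeta attack` command line. The spec layout and the function name are assumptions made for illustration, not the schema of any of the operators in this thread, and only a handful of the real flags (`-rate`, `-duration`, `-timeout`, `-workers`) are mapped.

```python
def attack_args(spec):
    """Translate an attack spec (a plain dict) into a `vegeta attack` argument list.

    The spec keys are hypothetical; each maps to an existing `vegeta attack` flag.
    """
    flags = {
        "rate": "-rate",
        "duration": "-duration",
        "timeout": "-timeout",
        "workers": "-workers",
    }
    args = ["vegeta", "attack"]
    for key, flag in flags.items():
        # Unset parameters are skipped so Vegeta falls back to its own defaults.
        if key in spec:
            args.append(f"{flag}={spec[key]}")
    return args


if __name__ == "__main__":
    print(attack_args({"rate": "50/1s", "duration": "30s", "workers": 10}))
```

An operator following this idea would watch attack objects and reconcile each one into a pod or Job running the resulting command.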

domleb commented 5 years ago

FWIW I use k8s jobs to run distributed Vegeta tests (we achieved 1M tps using NLBs). I would say the responsibility of creating/deleting jobs etc. shouldn't be with Vegeta. Whatever tooling is used to run any other job can be used. To make this really useful, it would be nice to use a Prometheus client to expose metrics. This is a great way to aggregate metrics across all instances in a standard and scalable way. Plus, alerts can be used to automate testing for CI. Maybe there's already a way to hook in a Prometheus client, but after a quick check I can't see an obvious way.

ghost commented 4 years ago

At my company, we use a Kubernetes custom controller that simply executes Vegeta as a Kubernetes Job. https://github.com/kaidotdev/vegeta-controller

I've developed a project similar to https://github.com/MalloZup/vegeta-operator without any plan. I'm looking forward to your feedback.

nitishm commented 4 years ago

@kaidotdev How do you scale the attacks? Or is a single attack carried out by a single job pod?

ghost commented 4 years ago

Yes, each attack runs as a single job pod. It can be scaled via spec.parallelism, which is passed through to the Job's spec.parallelism as is. cf. https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#jobspec-v1-batch

nitishm commented 4 years ago

Parallelism means scaling out the task. But how do you synchronize the attacks? Having 3 jobs start up at different times would lead to skew and inaccurate results if the attack was launched to run for a fixed interval. How do you address that?

ghost commented 4 years ago

Unfortunately, that is not guaranteed so far. With the approach using Kubernetes Jobs, it will be difficult to guarantee because of the Kubernetes reconcile loop.

nitishm commented 4 years ago

Yea that’s where I got stuck and gave up on the task too. Maybe there is a possibility to sync jobs using specialized workloads deployed as part of the jobs but it’s up for investigation.
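One possible way to reduce start-time skew, sketched here only as an idea and not something any project in this thread is confirmed to do, is to hand every attack pod the same future start timestamp and have each pod sleep until that instant before launching Vegeta. This assumes node clocks are reasonably well synchronized (for example via NTP) and that all pods are scheduled before the deadline; the function names are hypothetical.

```python
import time


def seconds_until(start_epoch, now=None):
    """Return how long to wait until `start_epoch` (Unix seconds), never negative."""
    now = time.time() if now is None else now
    return max(0.0, start_epoch - now)


def wait_for_start(start_epoch, sleep=time.sleep, clock=time.time):
    """Block until the shared start time, then return the delay that was slept.

    `sleep` and `clock` are injectable so the logic can be tested without waiting.
    """
    delay = seconds_until(start_epoch, clock())
    if delay > 0:
        sleep(delay)
    return delay


if __name__ == "__main__":
    # In a pod, the start time would come from the controller, e.g. an env var.
    start = time.time() + 0.2
    wait_for_start(start)
    print("attack would start now")
```

A pod that comes up after the shared start time begins immediately, so late pods still skew the results; a stricter barrier would need explicit coordination between pods.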

ghost commented 4 years ago

Hmm... Certainly, if we try to synchronize the attacks, it seems that we need to fundamentally change the approach. And then we would probably need to combine the results as well.
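Combining results from several pods could look roughly like the sketch below. It assumes each pod's results have been converted to JSON lines (for example with `vegeta encode`) and that every line carries a `code` and a `latency` field; those field names, and the latency unit, should be checked against the Vegeta version in use. Where the raw result files can be gathered in one place, Vegeta's own `report` command is likely the more direct route.

```python
import json


def combine(result_streams):
    """Merge JSON-lines results from several attack pods into one summary.

    `result_streams` is an iterable of iterables of JSON strings, one per pod.
    """
    total = 0
    ok = 0
    latency_sum = 0
    for lines in result_streams:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            result = json.loads(line)
            total += 1
            latency_sum += result["latency"]
            # Treat 2xx and 3xx responses as successes.
            if 200 <= result["code"] < 400:
                ok += 1
    return {
        "requests": total,
        "success_ratio": ok / total if total else 0.0,
        "mean_latency": latency_sum / total if total else 0.0,
    }


if __name__ == "__main__":
    pod_a = ['{"code": 200, "latency": 100}', '{"code": 500, "latency": 300}']
    pod_b = ['{"code": 200, "latency": 200}']
    print(combine([pod_a, pod_b]))
```

A mean is safe to compute this way, but percentiles are not: they need all latencies from all pods, not per-pod percentiles averaged together.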

nitishm commented 4 years ago

@kaidotdev I created https://github.com/nitishm/vegeta-server as an attempt to address this problem. I'm just waiting for some time to free up to either use it directly as a pod on k8s or rework some of the code to make it cloud-native. If you are interested in pursuing this further, I would love to collaborate!

dastergon commented 4 years ago

My take on the vegeta-operator https://github.com/dastergon/vegeta-operator. It supports most of Vegeta's features and it has the ability to store the reports in AWS S3 (for now) via rclone. I would love to hear your feedback. Also, pull requests are always welcome! :)

fgiloux commented 3 years ago

I have created another operator for Vegeta: https://github.com/fgiloux/vegeta-operator. It also leverages the operator-sdk, similar to what @dastergon did, just on a newer version. I did look at what @dastergon wrote but wanted to do a few things differently:

- It directly uses pods (no jobs) as sub-resources, as I did not see the value of a mechanism for dealing with pod failure. IMHO the complete test needs to be relaunched if one of the Vegeta attacks fails. This may need upper-level coordination (CI).
- It supports object storage and persistent volumes for storing results/reports.
- It is possible to launch distributed attacks (multiple pods). The report then gets generated in a separate step when all attack pods have completed.
- There is an image bundle for simple installation with OLM.

A few things are not implemented:

- Integration with Prometheus. There is a separate issue for that, but it seems to be stuck.
- Different output formats for the reports (this could, however, easily be added if there is a need).

Feedback welcome!

tsenart / vegeta

Distributed Vegeta #336