openzipkin-attic / sleuth-webmvc-example

See how much time Spring Boot services spend on an http request.
Apache License 2.0

docker webmvc-example #27

Closed codefromthecrypt closed 5 years ago

codefromthecrypt commented 5 years ago

Currently, when I do performance troubleshooting, I start frontend and backend locally and use wrk -t4 -c128 -d100s http://localhost:8081 --latency against our docker-compose setup which includes grafana. I then look at the results.

Some of our stuff uses cloud, and docker internal networks are a bit weird. It seems like we could create an openzipkin/example-frontend and an openzipkin/example-backend image, which could then be deployed and tweaked as necessary. These could literally be added to our default docker-compose file for people who don't already have apps, or at least an additional compose file could add them.

I'm strongly in favor of literally using this repo vs something less used, relevant or tested. Sleuth users report more concerns than many people do, and we have a lot of branches here, which means we can switch to working and relevant config somewhat quickly. I want to re-use energy we already spend here (referencing this repo several times a week) vs creating something similar that people don't use today.

cc @openzipkin/core

codefromthecrypt commented 5 years ago

note it has been our convention to keep docker images in separate repos for most of our things. However, I don't think we have to do the same here as this is not a real product anyway

anuraaga commented 5 years ago

Another idea may be to not use docker-compose, instead writing a Java main that uses testcontainers to start up dependencies, similar to the integration tests. It would be really great to hit a button in IntelliJ to benchmark the current code without packaging anything, and it makes it easier to run the debugger or a profiler.

Also just want to mention that when I was generating a ton of load, brave-webmvc-example fell over but armeria-example could handle it, probably the difference between HTTP/1 and HTTP/2. This helped me when doing real stress testing by mocking out Elasticsearch. Something to keep in mind if we think such stress testing could be important here.

codefromthecrypt commented 5 years ago

well tbh I was using brave-webmvc-example in tests I mentioned, particularly the webmvc4 one which is servlet based. I didn't realize there was a falling over problem here, but that heh sounds like a problem of its own. I'll try it.

I agree that if we have containers we can also drive them with testcontainers instead.

codefromthecrypt commented 5 years ago

in the brave-webmvc-example, btw, it wasn't a perfect run, but we don't need the apps to have perfect runs to identify the failures we've found so far. Ex. the last one I posted has some socket errors along the way, but still managed to put a few hundred thousand requests in :P

$ wrk -t4 -c128 -d100s http://localhost:8081 --latency
Running 2m test @ http://localhost:8081
  4 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    49.31ms   44.86ms 606.22ms   79.41%
    Req/Sec   766.32    118.24     1.13k    71.60%
  Latency Distribution
     50%   38.65ms
     75%   68.61ms
     90%  107.02ms
     99%  206.80ms
  305309 requests in 1.67m, 52.99MB read
  Socket errors: connect 0, read 50, write 17, timeout 0
Requests/sec:   3050.20
Transfer/sec:    542.13KB
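
To put the socket errors in the run above in proportion, here is a minimal sketch that pulls the headline numbers out of wrk's summary lines. The function name `parse_wrk_summary` and the regular expressions are illustrative assumptions, not part of wrk or of this repo, and socket errors do not necessarily map one-to-one onto failed requests, so the percentage is only a rough gauge.

```python
import re

# Summary lines copied from the wrk run above.
WRK_OUTPUT = """\
  305309 requests in 1.67m, 52.99MB read
  Socket errors: connect 0, read 50, write 17, timeout 0
Requests/sec:   3050.20
Transfer/sec:    542.13KB
"""


def parse_wrk_summary(text):
    """Extract total requests, throughput and socket errors from wrk's summary."""
    requests = int(re.search(r"(\d+) requests in", text).group(1))
    rps = float(re.search(r"Requests/sec:\s*([\d.]+)", text).group(1))
    errors = {}
    match = re.search(r"Socket errors: (.*)", text)
    if match:  # the "Socket errors" line may be absent from a clean run
        for part in match.group(1).split(","):
            kind, count = part.split()
            errors[kind] = int(count)
    return {"requests": requests, "requests_per_sec": rps, "socket_errors": errors}


summary = parse_wrk_summary(WRK_OUTPUT)
error_total = sum(summary["socket_errors"].values())
error_pct = 100.0 * error_total / summary["requests"]
print(f"{error_total} socket errors over {summary['requests']} requests ({error_pct:.3f}%)")
```

On these numbers the errors are a small fraction of a percent of the requests, which fits the point that a run need not be perfect to be useful.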

codefromthecrypt commented 5 years ago

probably some of the slowness in sleuth will be how much work it is doing, ex logging to console etc. we can address this sort of stuff if it interferes with the goal. Meanwhile obviously don't limit the tools you use to this issue @anuraaga. I'm looking more for the camry than the ferrari because it solves multiple purposes vs high load tests.

codefromthecrypt commented 5 years ago

and definitely you are right. with out-of-box sleuth with the config we have here including log correlation etc, vs the stripped down brave example and certainly vs the armeria one.. this wouldn't be optimizing for the highest amount of requests per thread!

only getting about 60k in the run I just did. Though I would say the socket errors.. I've seen worse :P

Running 2m test @ http://localhost:8081
  4 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   208.32ms   55.08ms 640.47ms   71.84%
    Req/Sec   154.84     40.49   320.00     68.07%
  Latency Distribution
     50%  207.10ms
     75%  241.57ms
     90%  274.17ms
     99%  346.22ms
  61482 requests in 1.67m, 8.34MB read
  Socket errors: connect 0, read 53, write 0, timeout 0
Requests/sec:    614.34
Transfer/sec:     85.29KB
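
For a side-by-side view of the two runs posted in this thread, the arithmetic can be sketched as below. The figures are copied from the two wrk summaries (the brave-webmvc-example run and the out-of-box sleuth run); the dictionary keys and variable names are only illustrative.

```python
# Figures copied from the two wrk summaries in this thread.
brave = {"requests": 305309, "rps": 3050.20, "p50_ms": 38.65, "p99_ms": 206.80}
sleuth = {"requests": 61482, "rps": 614.34, "p50_ms": 207.10, "p99_ms": 346.22}

# How many times more requests per second the brave example served.
throughput_ratio = brave["rps"] / sleuth["rps"]

# How many times slower sleuth was at the median and at the 99th percentile.
p50_ratio = sleuth["p50_ms"] / brave["p50_ms"]
p99_ratio = sleuth["p99_ms"] / brave["p99_ms"]

print(f"throughput: brave is {throughput_ratio:.1f}x sleuth")
print(f"p50 latency: sleuth is {p50_ratio:.1f}x brave")
print(f"p99 latency: sleuth is {p99_ratio:.1f}x brave")
```

The throughput and median latency gaps are both roughly fivefold, while the 99th percentile gap is under twofold. Both runs used the same wrk settings (4 threads, 128 connections), but they were single local runs, so treat the ratios as indicative only.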

anuraaga commented 5 years ago

Yeah I tried with 128 threads where bravemvc's boot version looked like an Elasticsearch ;) But it would be pretty simple to swap the examples since the API is the same.

Wondering if we need docker for the examples. I imagine a project in the zipkin repo that depends on the Java of the examples, either using Jitpack or possibly publishing the examples to maven or something. It could then start up containers and directly start zipkin server (a project dependency in the zipkin repo so uses in-development code), and frontend / backend via its java dependency. wrk could also be a container or written with a simple "wrk" script. This project itself could be built into a docker image to run in the cloud.

codefromthecrypt commented 5 years ago

well, the main thing is if we aimed to run in ECS or whatnot and want people to not have to know an IDE or java to play around. it seems like tying a hand behind our back to not produce docker containers, especially considering we already do all our examples in docker anyway

codefromthecrypt commented 5 years ago

what I mean is that I often push people getting started to do like this.. ok use docker-compose to learn this thing now also run one of the examples :)

so yeah when I mention "camry" it is sort of like it would be nice to have a fluid thing where there's no break in order to get a story together. and that could be one simple way to get a working thing together in order to test anything from amazon deployment to whatever, and without having to rely on local build processes or IDEs... This was in the back of my brain when thinking that we can also use this for load, since I also do exactly the same "2 step" thing so far, and have been able to find quite a few things despite it not being the highest performance.

that and remember we do have non-java folks around.. this can give working app without making container build work several ways. they can ignore that the test apps are written in java... don't have to use the polarizing build or IDE setup.

meanwhile it might not be exactly the tool you are looking for, so I don't think it would in any way be helpful to say you need to work on this.. just I don't think it is needed to design too much. While not great performance, these examples are highly configurable.. properties exposed for everything. Most of our examples are not, so anyway this is the rationale.

Positive we could have written the dockerfiles by now :)

anuraaga commented 5 years ago

Yeah that makes sense. I was thinking we would end up with a docker image packaging up all of zipkin-server, frontend, and backend, but since we already have a docker image with zipkin-server guess that's a weird dupe so separate seems good

astik commented 5 years ago

I don't know if it could be handy: https://github.com/astik/zipkin-demo It is a compilation of some work I'm doing to introduce Zipkin to my coworkers. Each root folder is a single simple app (producer and consumer are in separate apps). The folder _docker-compose contains all the files needed to run each demo. I'm no docker expert, still the demos are working correctly for now.

I hope it may help you =)

anuraaga commented 5 years ago

Happened to see that there already is a docker folder here. Is this issue solved if we split it up to have a frontend and backend image instead of just one?

codefromthecrypt commented 5 years ago

maybe too fancy but we could also sneak a look at container name and act accordingly. this would allow different versions to test easier

On Mon, Aug 26, 2019, 11:23 AM Anuraag Agrawal notifications@github.com wrote:

Happened to see that there already is a docker folder here. Is this issue solved if we split it up to have a frontend and backend tag?

codefromthecrypt commented 5 years ago

by version I mean variant like locally tagging a variant of the build that uses kafka
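
A rough sketch of the "look at the container name and act accordingly" idea follows. It assumes a purely hypothetical naming convention of `<role>` or `<role>-<variant>` (for example `frontend-kafka`); the function name, the role list and the `default` fallback are illustrative, and how the name reaches the app (hostname, environment variable or otherwise) is left open.

```python
# Hypothetical convention: "<role>" or "<role>-<variant>", e.g. "frontend-kafka".
KNOWN_ROLES = ("frontend", "backend")


def parse_container_name(name):
    """Split a container name into (role, variant) under the assumed convention."""
    role, _, variant = name.partition("-")
    if role not in KNOWN_ROLES:
        raise ValueError(f"unrecognized role in container name: {name!r}")
    # No suffix means the plain build, labeled "default" here.
    return role, variant or "default"


for example in ("frontend", "backend-kafka"):
    role, variant = parse_container_name(example)
    print(f"{example}: start the {role} app with the {variant} variant")
```

The same effect could be had by temporarily overwriting tags, so this is only one possible shape for it.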

codefromthecrypt commented 5 years ago

ps not tied to the idea anyway. ex I can always temporarily overwrite the tags