isaiah / jubilee

A rack server built upon vert.x
http://isaiah.github.io/jubilee
MIT License

Support for specifying number of instances? #2

Closed robertjpayne closed 10 years ago

robertjpayne commented 10 years ago

Vert.x command line allows one to specify the number of instances.

I saw a recent commit about removing a worker option since "Vertx can scale itself", but I see no increase in throughput or thread count when hammering the server with thousands of requests/s.

Is this something that can be exposed via jubilee? Or does it even make sense? I'm new to Vertx, so maybe I'm doing it wrong.

isaiah commented 10 years ago

The jubilee server runs on an embedded vert.x core, which will "not be able to benefit from the -instances option of the vertx command or in the PlatformManager API."

I made some performance improvements recently and the results are promising: on my desktop (OS X) it reaches about 40K rps on the FrameworkBenchmarks rack suite (the current version, 1.0.2, manages about 9K rps). I got similar performance on a Red Hat Linux server.

This feature is likely to be implemented in the future.

ryanstout commented 10 years ago

@isaiah Thanks for the great work on this. I noticed that on the benchmarks page you have "concurrency" on the x axis. I'm curious how you changed that parameter.

ryanstout commented 10 years ago

I'm trying to figure out why I'm not able to max my cpu when using something like wrk. Thanks.

isaiah commented 10 years ago

@ryanstout The "concurrency" means the "-c Connections to keep open" option of wrk; maybe the label is a bit misleading. Because Vertx assigns only one event loop to an address, that event loop becomes the bottleneck. You probably noticed that only one of your cores is at 100% load.

ryanstout commented 10 years ago

Ok, so I guess the question then is: are you planning to allow it to bind more than one event loop to an address? Thanks.

isaiah commented 10 years ago

Yes, it's on the roadmap already.

ryanstout commented 10 years ago

Cool. This is a great project btw. I'm working on a new ruby web framework and considering using jruby and vertx as the backbone of sorts.

isaiah commented 10 years ago

Thanks, glad it helps.

ryanstout commented 10 years ago

Thanks for all of the hard work. Question: should we be able to max our cpu now, or is there something else that needs to happen? (Say, by setting -n to the number of cores.)

robertjpayne commented 10 years ago

@ryanstout It really depends on whether you're doing anything "threaded" inside your processes. Assuming you're just running a hello world test, you should have 1 instance per available cpu core. A cpu core here means what the OS sees, not necessarily what the hardware actually has, since most chips these days present 2 cores to the OS per physical core anyway.

Even so, you may not max your CPU. In any good system you shouldn't max your CPU anyway: if an error arises while the CPU is pinned at 100% and thrashing resources, the system becomes impossible to fix.

ryanstout commented 10 years ago

@robertjpayne So I have an i7 with 4 "cores" (or 8 with hyper-threading, I think). But no matter what I set -n to, I still have a little less than 50% idle. I'm just doing a basic hello world test with wrk. I feel like it should max out no problem in that situation, but maybe I'm missing something obvious.
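
For reference, a hello world test of this kind can be as small as the sketch below. It is an illustration only, not the exact app used in this thread: the method name `hello_world` is made up, and in a `config.ru` it could be handed to the server with `run method(:hello_world)`.

```ruby
# Minimal Rack-style hello world (hypothetical name, not from the thread).
# A Rack app is any object that responds to #call(env) and returns
# [status, headers, body], where body responds to #each.
def hello_world(env)
  body = "Hello World"
  headers = {
    "content-type"   => "text/plain",
    "content-length" => body.bytesize.to_s
  }
  [200, headers, [body]]
end

# Calling it directly, without any server, shows the response triple.
status, headers, body = hello_world({})
puts status
puts headers["content-type"]
puts body.join
```

An app this small does almost no work per request, so a benchmark against it mostly measures the server and the load generator rather than Ruby code.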

robertjpayne commented 10 years ago

@ryanstout Vertx is an evented webserver and as such the CPU load per request is extremely minimal. The biggest thing that will prevent you from hitting a CPU bottleneck is Java's JIT optimiser, which speeds things up the more the code gets hit.

What kind of req/s throughput are you getting? Really, if you're over 20k req/s on a 'Hello World' sample then you know the webserver is never going to be the issue...

ryanstout commented 10 years ago

@robertjpayne Thanks for the help. Yea, I guess I forgot that the main loop is evented, so it's probably maxing out pulling stuff off of the socket and putting it into the verticles. I'm getting about 40k/sec on my hello world. I'm less concerned with actual real-world performance than with just showing off jruby in a benchmark, hehe.

isaiah commented 10 years ago

@ryanstout Did you get any performance improvement by setting the number of instances? In my test it's almost 3X faster than 1.1.3, and the overall load is higher. There is an event loop for each instance, but the maximum number of event loops is 2*cores, so in your case you should set -n to 16 to get the best performance.
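
The arithmetic behind that suggestion can be sketched in a few lines of Ruby. This is only an illustration: `max_event_loops` is a hypothetical helper, not part of jubilee or Vert.x, and the 2*cores cap is taken from the comment above rather than checked against the Vert.x source. `Etc.nprocessors` reports the processors the OS exposes, so hyper-threaded cores are counted individually.

```ruby
require 'etc'

# Hypothetical helper (not a jubilee or Vert.x API): applies the
# "maximum number of event loops is 2*cores" rule quoted in this thread.
def max_event_loops(logical_cores)
  2 * logical_cores
end

# Etc.nprocessors returns the number of processors the OS exposes,
# so a 4-core CPU with hyper-threading typically reports 8.
logical_cores = Etc.nprocessors
puts "logical cores: #{logical_cores}"
puts "suggested -n:  #{max_event_loops(logical_cores)}"
```

With 8 logical cores the helper returns 16, matching the value suggested for the i7 discussed here.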

ryanstout commented 10 years ago

@isaiah I didn't see any change, which I thought was weird. I tried -n 16 as well. It seemed like -n wasn't working at all. Setting it to 1 didn't have any effect either; either way it maxed out at about 50% of my total available cpu. I can run more tests if you want.

isaiah commented 10 years ago

@ryanstout Thanks for the feedback, I'll take a look.