Open waghanza opened 4 years ago
Do you wish to force the use of json libraries and/or json framework handling then? Otherwise can just return strings, which will be just as fast as before?
Yes. With such a system we have to add a dependency (ideally the same lib / built-in for each framework per language).
The purpose here is more to have a standard implementation, I mean something like in the real world.
I'm not convinced that `plain/text` APIs are really used in production nowadays, and `json` (even if not the fastest format to parse/emit) is a kind of standard.
We could also use any standard format, such as `jsonapi` or something else.
I'm open to any suggestions :heart:
PS : The infamous hello world seems to have the same problem. Which production API has this functionality?
In addition to what I've said, I think @ohler55 is working on a `graphql` implementation of this kind of benchmark.
The idea is to have this repo, as the benchmarking tool, and one repo per implementation (simple api, graphql, ...)
Indeed there is a GraphQL benchmark started. I'd love to have a few more examples if anyone is willing to contribute. https://github.com/ohler55/graphql-benchmarks
Sure, I can contribute with some `python`, `ruby` and `crystal`.
You can move it here @ohler55 if you want to combine results 😀
I was hoping to hand it over at some point. I'm not sure if it makes sense to combine everything into one large table or to keep GraphQL separate. Kind of feels like separate is better. I would be glad to transfer the repo to the-benchmarker though. Maybe that's what you had in mind. :-)
Yep @ohler55, but I do not want to mix results. What I have in mind is to have one repo per scenario (as it is called in tfb)
Not sure what 'tfb' is but glad you want to keep them separate.
So, any time, you are ready for me to transfer ownership?
Tfb is https://github.com/TechEmpower/FrameworkBenchmarks
Yes, I'm ready if you want to transfer ownership
Looks like I need permissions to create repos on the-benchmarker to complete the transfer.
I've granted you more privileges
Transfer complete!
I think I will implement this after next release (end of year). If anyone has some ideas, feel free to share :heart:
Also, I think an authentication system should be implemented.
Any hard-coded string sent as a token could be enough. The difficulty here is to avoid having complex implementations.
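A minimal sketch of the hard-coded token idea, in Python; the header name and token value here are illustrative assumptions, not part of the proposal:

```python
# Hypothetical hard-coded token check; EXPECTED and the "Authorization"
# header convention are illustrative choices, not a spec from this thread.
EXPECTED = "Bearer benchmark-token"

def is_authorized(headers: dict) -> bool:
    """Return True when the Authorization header matches the fixed token."""
    return headers.get("Authorization") == EXPECTED
```

This keeps the per-request work to a single string comparison, so the auth step adds almost nothing on top of the routing being measured.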
Getting into authentication is non-trivial. An arbitrary string is not really representative of the real world. Take OAuth2 as an example. A token server would be needed and the server would be expected to interact with that server. You start testing something much more than the web server at that point.
It might be interesting to compare different auth approaches and servers but that's probably better done as a separate repo. Another benchmarked repo of course.
Passing this information from the simulate database delay issue:
Essentially, in order to know the true performance of a web server, it is important to know how it handles asynchronous tasks (waiting on the database, calling an API, etc.). Most web servers do not have compute-heavy workloads. They have delay-heavy workloads with a little bit of data serialization at the end. I propose that one of the endpoints that gets tested has the web server sleep for `x` milliseconds and then return a simple payload. This will simulate database calls.
The added endpoint could literally just be `/sleep/:time_ms`. This way it is one single endpoint that can be tuned for any amount of delay (i.e. simple: 1-10 ms, normal: 10-20 ms, complex: 50-100 ms). Could even define categories based on local vs cloud db expected latency. Of course it doesn't need to get anywhere near that complicated, but it could enable users to test different delays for themselves.
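A sketch of the proposed `/sleep/:time_ms` endpoint, assuming a bare WSGI app for illustration (real implementations would use each framework's own router, and high-throughput ones would use a non-blocking sleep):

```python
import json
import re
import time

def app(environ, start_response):
    """WSGI sketch of GET /sleep/<ms>: block for <ms> ms, then return JSON."""
    match = re.fullmatch(r"/sleep/(\d+)", environ.get("PATH_INFO", ""))
    if match is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    delay_ms = int(match.group(1))
    time.sleep(delay_ms / 1000.0)  # stands in for database / upstream latency
    body = json.dumps({"slept_ms": delay_ms}).encode()
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]
```

Note that `time.sleep` blocks the worker; an async framework would await instead, which is exactly the difference this endpoint is meant to expose.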
`PUT` and `POST` should also have some sort of basic validation to ensure all expected keys exist within the received JSON.
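The key-presence check could stay very small; a sketch in Python, where the required key set is a made-up example schema:

```python
import json

REQUIRED_KEYS = {"id", "name"}  # hypothetical user schema, for illustration

def validate_user_payload(raw: bytes) -> bool:
    """Return True when the body is a JSON object containing every expected key."""
    try:
        data = json.loads(raw)
    except ValueError:  # covers malformed JSON (json.JSONDecodeError)
        return False
    return isinstance(data, dict) and REQUIRED_KEYS <= data.keys()
```

Keeping validation to key presence (not types or full schemas) limits how much application code leaks into the server comparison.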
I believe the purpose of these benchmarks is to compare servers. By adding additional non-server code you take away from the server comparison and start to get into how well someone writes an application, which is outside the scope of the server comparison and more in the realm of the applications, which can often be used with multiple servers.
To be precise the purpose is to compare performance of server-side frameworks.
There are two things to notice. Two things are required for me:
Having a delay to simulate a database connection, or using a custom emitter/parser (json), is, after thinking about it, not necessary (even kind of obvious).
For the database, as @OvermindDL1 said once, calls are mostly made asynchronously. By this I mean that having a delay will not, I think, change the comparison. In other terms, implementing a delay will not be valuable regarding our objective.
For the format (json), I'm not convinced that this will change the results regarding our objective. Plain text is enough for me
I am not sure you can have a realistic, real-life-like benchmark if you ignore both json and asynchronous performance. Even if database calls are generally handled asynchronously, some servers are synchronous and do not support this. Also, some of the servers will perform poorly when given a proper async task because their evented io engines are not good.
I think you will find that the higher-performance servers in the results do support asynchronous concurrent requests, and the testing tools do open 1000 connections and pipeline requests, so asynchronous request handling and concurrent requests are already measured without having to introduce a delay on responses, which would invalidate any latency measurements.
For JSON, most if not all servers accept text, whether that be JSON, XML, or just random strings. After that it is up to a JSON, XML, or other parser to extract data. That is not part of a server evaluation but of parser performance, which can be and is benchmarked separately.
Consider what would help the most when building a web application. There are choices for web server, databases, parsers, encoders, and more. If you try to compare every combination of all those features you would have millions of combinations to compare. Hardly reasonable. If instead you benchmark each component separately, you can determine which is the best in class for each component and put together your application accordingly. The benchmarks here are for servers. Adding additional components to the mix only obfuscates the results regarding the servers themselves.
We are on the same line @ohler55.
What about testing all HTTP verbs? Some routers could behave differently for `POST`, `PUT`...
That could be added. I guess it depends on how vital each verb is to applications. For example, PATCH can be used to modify but so can PUT in a REST model. Getting more esoteric, the CONNECT method is probably not supported by most of the server since it is really for a proxy and SSL tunnelling. Then TRACE for loop back testing is not really something an application needs but could be passed through to the underlying application code. It might not be as simple as it sounds at first blush to define something reasonable for each.
I think that
So it means that you should cover
wdyt ?
Sure, with a few suggestions to fix the HTTP method usage. PUT is usually used for creation so that should probably be on the list. GET should never modify and be read only. In addition, POST is used when modifying something on the server side and is an indication of the request being not idempotent, so a simple retrieve is probably not the right functionality. Something like incrementing a counter would be more appropriate, with the return value being a total. So something like this:
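The non-idempotent counter idea could be sketched like this in Python (in-process state and the route shape are assumptions; with multiple workers the counter would need shared storage, as discussed below):

```python
from threading import Lock

# Hypothetical handler for a non-idempotent POST: every call bumps a
# process-local counter and returns the running total.
_counter = 0
_lock = Lock()

def post_increment() -> int:
    """Handle POST /counter: increment the total and return it."""
    global _counter
    with _lock:  # keep the bump atomic across threads in one process
        _counter += 1
        return _counter
```

Calling it twice returns 1 then 2, so repeating the request visibly changes state, which is exactly what distinguishes it from an idempotent GET.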
There will be a few side effects to implementing the listed methods or verbs. If you check the backend state, then the use of workers will not be an option: either there is also a backend store, which for reasons mentioned earlier overshadows the server testing, or, with no backend store, it hides the fact that some servers support multiple workers. Basically some planning is needed to benchmark what you really care about.
The second side-effect of the change is that someone is going to have to go over all the existing code and add the new features. :-)
> If you check the backend state then the use of workers will not be an option unless there is also a backend store which for reasons mentioned earlier overshadows the server testing or if no backend store it hides the fact that some servers support multiple workers
You mention that the results could be misleading depending on how the servers are configured. I think so, but this is on purpose. What is in my mind :
servers
> The second side-effect of the change is that someone is going to have to go over all the existing code and add the new features. :-)
I'll handle it -> I'll ask for help from any authors / contributors since I do not code in `scala` (for example)
@ohler55
> PUT is usually used for creation so that should probably be on the list. GET should never modify and be read only. In addition, POST is used when modifying
there are RFCs that address this, saying what each method should do:
https://tools.ietf.org/html/rfc7231#page-24 https://tools.ietf.org/html/rfc2616#section-9.5
The use of each method is covered in the RFCs above, and we should follow them rather than rewrite these RFCs.
To say that POST is for update and PUT is for create goes against these RFCs, and web forms would all be wrong, wouldn't they?
POST and PUT semantics are determined by the server and usually depend on the Request-URI to determine functionality.
I recommend caution when making definitions; the the-benchmarker/web-frameworks project is a good guide for strategic decisions, and failing to follow standards can be a setback.
See a good example: https://www.yiiframework.com/doc/guide/2.0/en/rest-quick-start#trying-it-out
I stand corrected. The main point I was trying to make was that there were more HTTP methods to consider than just the 4 common ones, and that state when using multiple workers will need to be considered. Thank you for providing suggested behaviour for all the relevant methods.
`HEAD` is probably the most common verb; most browsers and connections submit a `HEAD` request to check viability for a lot of things. Some servers will optimize for `HEAD` calls or not, some will just call `GET` and drop the body, etc... Could be good to check for that too.
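The "call `GET` and drop the body" strategy mentioned above can be sketched in a few lines (the function shape here is made up for illustration; real servers hook this into their dispatch):

```python
def respond(method: str, render_get):
    """Serve HEAD by reusing the GET renderer and dropping the body.

    render_get is any zero-argument callable returning the GET body as bytes.
    """
    body = render_get()  # full GET work still happens, even for HEAD
    headers = {"Content-Length": str(len(body))}
    # HEAD must report the same headers GET would, but with an empty body.
    return (headers, b"") if method == "HEAD" else (headers, body)
```

An optimizing server would instead compute the headers without rendering the body at all, which is precisely the behavioural difference worth benchmarking.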
I think you will find most pass the request on to the implementation of the app since the server can't really guess what the app will do. I have no problem if you want to include it though. It's just not really a server test but an app implementation test.
Is there a warm-up phase? Java implementations need a warm-up phase to get more stable results. For example Jooby is sometimes between ranks 5..7, other times around 15...18.
Here are some details around JVM warm-up: https://stackoverflow.com/questions/36198278/why-does-the-jvm-require-warmup
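A warm-up phase in a benchmark driver usually just means running the workload for a while and discarding those results before measuring; a minimal Python sketch of that idea (the function names and windows are illustrative, not how this repo's tooling works):

```python
import time

def benchmark(fn, warmup_s: float = 2.0, measure_s: float = 5.0) -> float:
    """Run fn in a loop; discard a warm-up window, then report calls/second."""
    deadline = time.monotonic() + warmup_s
    while time.monotonic() < deadline:  # warm-up: results are thrown away
        fn()
    count = 0
    deadline = time.monotonic() + measure_s
    while time.monotonic() < deadline:  # measured window only
        fn()
        count += 1
    return count / measure_s  # rough throughput figure
```

For a JVM target the warm-up window lets the JIT compile the hot paths before any numbers are recorded, which is what stabilizes the rankings.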
not really in the scope of this PR, I think, but yes.
The same methodology as TechEmpower is used.
Hi,
The implementations here are very simple, this is on purpose.
The routes to implement here are :

- `GET` `/`, SHOULD return a successful status code and an empty string
- `GET` `/user/:id`, SHOULD return a successful status code and the id
- `POST` `/user`, SHOULD return a successful status code and an empty string

I propose to update the code base to have a test on all (most used) HTTP methods and `application/json` instead of `plain/text` :

- `GET` `/user/1`, SHOULD return a `json` representation of a user
- `POST` `/user`, SHOULD return a successful status code when sending a `json` representation of a user
- `PUT` `/user/1`, SHOULD return a successful status code when sending a complete `json` representation of a user
- `PATCH` `/user/1`, SHOULD return a successful status code when sending a partial `json` representation of a user
- `DELETE` `/user/1`, SHOULD return a successful status code

The idea is to be closest to a real-world representation
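For reference, the proposed routes can be sketched as a tiny in-memory router in Python (the `handle` signature, the user shape, and the dict store are all illustrative assumptions; each framework would implement this with its own routing and, with multiple workers, a shared store):

```python
import json
import re

# In-memory store; the seed user and field names are made up for illustration.
USERS = {1: {"id": 1, "name": "alice"}}

def handle(method: str, path: str, body: bytes = b""):
    """Dispatch the proposed test routes; returns (status_code, payload_str)."""
    user_path = re.fullmatch(r"/user/(\d+)", path)
    if method == "GET" and user_path:
        user = USERS.get(int(user_path.group(1)))
        return (200, json.dumps(user)) if user else (404, "")
    if method == "POST" and path == "/user":
        user = json.loads(body)          # full json representation of a user
        USERS[user["id"]] = user
        return 200, ""
    if method == "PUT" and user_path:
        USERS[int(user_path.group(1))] = json.loads(body)   # complete replace
        return 200, ""
    if method == "PATCH" and user_path:
        USERS.setdefault(int(user_path.group(1)), {}).update(json.loads(body))
        return 200, ""                   # partial representation merges
    if method == "DELETE" and user_path:
        USERS.pop(int(user_path.group(1)), None)
        return 200, ""
    return 404, ""
```

The point is that each verb exercises a distinct router path and a distinct JSON operation (emit, parse, replace, merge, delete) while staying small enough to port to every framework in the repo.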
Regards,