jcheron commented 4 years ago

First of all, I would like to salute the great work of the whole TechEmpower team:

For the integration of the new functionalities:
- the new Tags (TPR, Broken...)
- the composite score ranking
- the possibility to consult preliminary data during the run

For its daily work:
- review and validation of PRs, response to issues, management of permanent runs...

After waxing the boots of the whole team, I can get to the subject of this post: the filtering of the composite score is misleading and confusing.

Steps to reproduce behavior

This is just an example: let's say I want a ranking of the frameworks that have a Full ORM implementation in run f2c06cc9.

I apply the Full ORM filter and display the composite score:

Actual Behavior

Frameworks are filtered and only those with a full ORM implementation are kept, but the ranking does not give any indication about the performance of these frameworks with this full ORM implementation (excluding Micro and Raw scores).

So a framework with poor performance in this area may even appear at the top of the list (this is a possibility, not necessarily a reality).

Expected Behavior

If it is possible to use filtering when the composite score is active, the score calculation should take this filtering into account. For actix, for example, this would give:

Note: I have no problem with Rust and Actix, on the contrary; even with Full ORM, actix still performs very well.

And for the top 15 ranking (still with the Full ORM filter):

The composite score takes into account all implementations, and that's a good thing. If its calculation took into account the applied filters, it would be even better! This revised ranking gives a better picture of performance in the specific case selected (filtered).

Other details

With this request, I am not trying to promote one of my favorite frameworks, but rather to improve the readability and usability of the results, which benefits everyone.

It is possible that this functional change, which seems small but valuable, requires technical work whose extent I cannot estimate. I leave it to the TechEmpower team to judge.

bhauer commented 4 years ago

This is a great issue report, thank you! The diagrams are a delight.

You're correct, the composite scores naively/simply use the best performing permutation from a framework, in each test type, regardless of the filters that have been selected. It should be possible to apply the filters to down-select the possible permutations prior to finding those "bests" per framework. This should accomplish the behavior you expect.
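To make the difference concrete, here is a minimal sketch of the two behaviors described above. It is not the actual results-site code: the row layout, the names `fw_a` and `fw_b`, the numbers, and the toy scoring (each best result normalized to the top result per test type, then summed with equal weights) are all assumptions made for illustration.

```python
from collections import defaultdict

# Made-up rows, one per (permutation, test type). Names and numbers are illustrative only.
ROWS = [
    {"framework": "fw_a", "orm": "Raw",  "test": "db",      "rps": 100},
    {"framework": "fw_a", "orm": "Full", "test": "db",      "rps": 40},
    {"framework": "fw_a", "orm": "Raw",  "test": "fortune", "rps": 80},
    {"framework": "fw_a", "orm": "Full", "test": "fortune", "rps": 30},
    {"framework": "fw_b", "orm": "Full", "test": "db",      "rps": 60},
    {"framework": "fw_b", "orm": "Full", "test": "fortune", "rps": 50},
]

def best_per_framework(rows, keep=lambda row: True):
    """Best result per framework and test type, using only rows accepted by `keep`."""
    best = defaultdict(dict)
    for row in rows:
        if keep(row):
            tests = best[row["framework"]]
            tests[row["test"]] = max(tests.get(row["test"], 0), row["rps"])
    return best

def toy_composite(best):
    """Toy score (an assumption): sum of bests, each normalized to the top per test type."""
    top = defaultdict(int)
    for tests in best.values():
        for test, rps in tests.items():
            top[test] = max(top[test], rps)
    return {fw: sum(rps / top[test] for test, rps in tests.items())
            for fw, tests in best.items()}

def ranking(scores):
    """Framework names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

def is_full_orm(row):
    return row["orm"] == "Full"

# Behavior reported in the issue: bests are picked from all permutations.
unfiltered = toy_composite(best_per_framework(ROWS))
# Expected behavior: down-select permutations first, then pick the bests.
filtered = toy_composite(best_per_framework(ROWS, keep=is_full_orm))

print(ranking(unfiltered))  # ['fw_a', 'fw_b']
print(ranking(filtered))    # ['fw_b', 'fw_a']
```

In this toy data both frameworks have a Full ORM permutation, so both stay in the list either way. Only the filtered variant ranks them by their Full ORM results.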

I won't be able to get to this soon, but I will put it on my to-do list!

jcheron commented 4 years ago

Thank you for this answer! If I can relieve you of something in this to-do list, it will be a pleasure.

joanhey commented 4 years ago

The composite score is new and still a work in progress. The developers need more information to adjust their benchmark_config.json.

Missing a lot of platforms

After checking #6018: the Just platform will now appear in the composite score, but nodejs will not, even though it is a JS platform too.

Why? Because historically the platforms have used "None" as the framework value in benchmark_config.json. So it is not only nodejs that did not appear, but also PHP, Python, Crystal, Workerman, ....

Solution: Add the platform name as framework value.
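A small sketch of the effect, under the assumption that the composite score lists one row per framework value and skips "None". The entries below are made up for illustration; only the `framework`, `platform`, and `display_name` field names are taken from benchmark_config.json, and the helper names are hypothetical.

```python
# Illustrative test entries; the values are made up for this sketch.
TESTS = [
    {"display_name": "nodejs", "framework": "None", "platform": "nodejs"},
    {"display_name": "express", "framework": "express", "platform": "nodejs"},
]

def listed_names(tests):
    """Framework values that would get a row, assuming "None" is skipped."""
    return sorted({t["framework"] for t in tests if t["framework"] != "None"})

def use_platform_name(tests):
    """Proposed fix: replace a "None" framework value with the platform name."""
    return [
        {**t, "framework": t["platform"]} if t["framework"] == "None" else t
        for t in tests
    ]

print(listed_names(TESTS))                     # ['express']
print(listed_names(use_platform_name(TESTS)))  # ['express', 'nodejs']
```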

Mixing platforms and frameworks

Some frameworks use the same name in the config for the platform and the frameworks.

Example 1:

Beetlex uses the same framework value for both the platform and the full-stack framework, even though it uses the name beetlex for the full framework and beetlex-core for the platform.

So in the composite score both appear only once, as beetlex, with the values of the platform, since it is faster.

That is a problem for the core developers and users, as they can't easily compare the two.

Solution: If both a platform and a (micro or full) framework exist, change the framework name, possibly by adding a suffix such as core for platforms.
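The collapse can be sketched as follows, assuming the composite keeps the best score per framework value. The scores are made up, and `composite_rows` and `suffix_platforms` are hypothetical helpers, not part of the benchmark tooling.

```python
# Two entries sharing one framework value; the scores are invented for illustration.
ENTRIES = [
    {"framework": "beetlex", "classification": "Platform",  "score": 90},
    {"framework": "beetlex", "classification": "Fullstack", "score": 60},
]

def composite_rows(entries):
    """One row per framework value, keeping the best score (assumed behavior)."""
    rows = {}
    for e in entries:
        rows[e["framework"]] = max(rows.get(e["framework"], 0), e["score"])
    return rows

def suffix_platforms(entries, suffix="-core"):
    """Proposed fix: give platform entries their own framework value."""
    return [
        {**e, "framework": e["framework"] + suffix}
        if e["classification"] == "Platform" else e
        for e in entries
    ]

print(composite_rows(ENTRIES))                    # {'beetlex': 90}
print(composite_rows(suffix_platforms(ENTRIES)))  # {'beetlex-core': 90, 'beetlex': 60}
```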

Example 2:

ASP.NET Core C# has about 26 variants with the same framework value and appears only once in the composite score. The benchmark is a really good tool for optimizing the performance of the frameworks, so the developers can decide whether they want to separate the platform variants (rhtx, ado, ...), but doing so is always useful.

As a minimum, though, they need to separate the micro and full frameworks (MVC and Middleware) to help themselves and the users. The score will still show only the fastest variant of the micro framework, which is what we want in this score.

Solution: Add the framework to the platform (platform framework).

Conclusion

With these simple changes in benchmark_config.json we solve half of this issue, and the composite score will be more useful for everyone.

PS: It would be good if hovering over the name showed a tooltip with at least the language and the lowest classification (platform, micro, full).

joanhey commented 4 years ago

They will now also appear in the filters under both platforms and frameworks, but this already happens with others. As the first line of the benchmark says: "... platforms, full-stack frameworks, and micro-frameworks (collectively, "frameworks")."

Added PRs:

#6033 for Nodejs, so it can be compared with the Just platform

#6032 for Workerman and php-ngx

TechEmpower / FrameworkBenchmarks

Filtering and composite score #6024