arey / java-object-mapper-benchmark

JMH benchmark of Java object-to-object mapping frameworks

Reliable benchmarks without throttling side-effects #19

Open roookeee opened 5 years ago

roookeee commented 5 years ago

Most mappers (Orika, BULL, etc.) in this repository have had new releases since the last benchmark run in February. Let's get some up-to-date data and update the README's table and image.

roookeee commented 5 years ago

Anything you need here? I think you should be the one to run the benchmarks and create a new graph as you have the control over the environment then :)

arey commented 5 years ago

You're right. I will do the update soon.

arey commented 5 years ago

The latest version of BULL has been compiled for Java 11:

/Users/arey/Dev/GitHub/java-object-mapper-benchmark/src/main/java/com/javaetmoi/benchmark/mapping/mapper/bull/BullMapper.java:[3,24] cannot access com.hotels.beans.BeanUtils
[ERROR]   bad class file: /Users/arey/.m2/repository/com/hotels/beans/bull-bean-transformer/1.5.1/bull-bean-transformer-1.5.1.jar(com/hotels/beans/BeanUtils.class)

Any idea @fborriello ?

fborriello commented 5 years ago

@arey yes it is. If you need the Java 8 build, set the bull-version to 1.1.24. The only difference between the two versions is that the first one is compiled with Java 11 and the second one with Java 8.

arey commented 5 years ago

Done. MapStruct and Selma are now ahead of the manual version ...

roookeee commented 5 years ago

Hi @arey, that simply should not be the case. Look at my results on an i7-6700K:

MapperBenchmark.mapper       Manual  thrpt    5  43199115,871 ± 2958841,336  ops/s
MapperBenchmark.mapper    MapStruct  thrpt    5  41722179,500 ±  216444,962  ops/s
MapperBenchmark.mapper        Selma  thrpt    5  41194963,858 ±  860577,996  ops/s
MapperBenchmark.mapper      JMapper  thrpt    5  32720748,374 ±  737046,085  ops/s
MapperBenchmark.mapper        datus  thrpt    5   8457751,577 ± 1303780,678  ops/s
MapperBenchmark.mapper        Orika  thrpt    5   4082450,031 ±  116644,367  ops/s
MapperBenchmark.mapper  ModelMapper  thrpt    5    178733,642 ±    7162,971  ops/s
MapperBenchmark.mapper         BULL  thrpt    5    154111,877 ±    2714,244  ops/s

As you use an Intel CPU, please keep in mind that newer-generation consumer Intel CPUs (especially in notebooks) throttle aggressively, based not only on temperature but also on time (e.g. the maximum frequency is held for at most 10 seconds, after which you get throttling regardless of temperature). For stable results, please use a desktop PC with a CPU at a fixed clock speed, or something like this may happen again. It makes the results seem untrustworthy.

Kind regards, roookeee

filiphr commented 5 years ago

Does it make sense to run the benchmarks on a CI? For example on Azure Pipelines or Travis CI?

roookeee commented 5 years ago

That would surely make sense, as those should be properly cooled server CPUs which are not configured with short-term max boosting in mind. But you would lose control over which exact hardware the tests run on (which RAM and CPU)? (Correct me if Azure or Travis allows for this.)

arey commented 5 years ago

Thanks @roookeee for pointing this out. My MacBook Pro is a bit older (a mid-2014 model with an Intel Core i7) and I run the benchmark with the power adapter plugged in. I don't have a desktop (only at work, but it will be replaced by a virtual machine). On Windows 7, do you know if we could fix the clock speed?

roookeee commented 5 years ago

Can't help you with setting a fixed clock speed on Windows, sorry. Although imperfect, using VirtualBox or some other virtualization and starting a Linux distro might work, as limiting the CPU frequency in Linux is quite easy to achieve.
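Whichever way the clock gets pinned, it may be worth checking that it really stays put while the benchmark runs. A rough sketch in Java, assuming a Linux host that exposes the usual cpufreq sysfs file (it is often missing inside VMs, so this is not guaranteed to work under VirtualBox); the class name, the 30-second window and the one-second interval are arbitrary choices for illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CpuFreqProbe {

    // Linux cpufreq sysfs file for core 0, value in kHz.
    // Assumption: a Linux host with cpufreq enabled; absent elsewhere.
    static final Path CUR_FREQ =
            Paths.get("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");

    /** Spread between the highest and lowest sample, in percent of the highest. */
    static double spreadPercent(long[] khzSamples) {
        if (khzSamples.length == 0) {
            return 0.0;
        }
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;
        for (long sample : khzSamples) {
            max = Math.max(max, sample);
            min = Math.min(min, sample);
        }
        return (max - min) * 100.0 / max;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (!Files.isReadable(CUR_FREQ)) {
            System.out.println("cpufreq sysfs file not available on this system");
            return;
        }
        // Sample once per second while the benchmark runs in another process.
        long[] samples = new long[30];
        for (int i = 0; i < samples.length; i++) {
            String raw = new String(Files.readAllBytes(CUR_FREQ), StandardCharsets.US_ASCII);
            samples[i] = Long.parseLong(raw.trim());
            Thread.sleep(1000);
        }
        System.out.printf("frequency spread over %d samples: %.1f%%%n",
                samples.length, spreadPercent(samples));
    }
}
```

A spread close to zero suggests the clock held steady; a large spread means the run was probably affected by boosting or throttling and the numbers should be treated with care.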

wind57 commented 4 years ago

These 10-second and temperature arguments require a proper citation, IMO. I admit, this is the very first time I've heard about this. @roookeee

roookeee commented 4 years ago

https://en.wikichip.org/wiki/intel/thermal_velocity_boost

The exact number of bins and temperature depends on the processor (mobile, desktop \ 15W, 45W, 65W): For Coffee Lake H, TVB is +200 MHz if TCASE is at 50°C or lower and turbo power budget is available. For Coffee Lake R, Whiskey Lake U, and Comet Lake U , TVB is +100 MHz if TCASE is at 70°C or lower and turbo power budget is available.

History: TVB was introduced with Coffee Lake H mobile parts. TVB was added to some desktop parts for the first time in Coffee Lake R.

That's a good starting point for research. Intel themselves of course don't have public PDFs that read "yeah it will throttle in 10 secs because a notebook can't handle 20% power draw on all 8 cores".

The boost of 100-200 MHz may seem small, but remember that the overheating will throttle below base clock afterwards, which will be a decrease from e.g. 4.2 GHz to 2.8 GHz.

But there is a reason why people make fun of Intel's > 5 GHz claims that can only be held for a couple of seconds on non-water-cooled devices :)

arey commented 4 years ago

Do you think the benchmark launched by GitHub Actions is more stable than the one run on my MacBook? https://github.com/arey/java-object-mapper-benchmark/actions/runs/114037025

We have some information about the OS but none on the hardware: https://github.com/arey/java-object-mapper-benchmark/runs/704112001?check_suite_focus=true

roookeee commented 4 years ago

The handwritten mapper should be within 3-5% of the mapper frameworks. This means that the results are "bad" again, as Selma is outperforming everything when before it was not nearly the best :/
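To make the 3-5% rule of thumb concrete, here is a small sketch that applies it to the i7-6700K scores posted earlier in this thread. The class and method names are made up for illustration, and using 5% as a hard cut-off is only one reading of the rule:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThroughputGap {

    /** Percentage by which the candidate is slower than the baseline (negative if faster). */
    static double gapPercent(double baselineOpsPerSec, double candidateOpsPerSec) {
        return (baselineOpsPerSec - candidateOpsPerSec) / baselineOpsPerSec * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ops/s, copied from the i7-6700K run quoted in this thread.
        double manual = 43_199_115.871;
        Map<String, Double> generated = new LinkedHashMap<>();
        generated.put("MapStruct", 41_722_179.500);
        generated.put("Selma", 41_194_963.858);

        for (Map.Entry<String, Double> entry : generated.entrySet()) {
            double gap = gapPercent(manual, entry.getValue());
            // Assumption: anything more than 5% away from Manual, in either direction, is suspicious.
            boolean plausible = Math.abs(gap) <= 5.0;
            System.out.printf("%-10s gap to Manual: %.2f%%, plausible=%b%n",
                    entry.getKey(), gap, plausible);
        }
    }
}
```

On those numbers MapStruct is roughly 3.4% and Selma roughly 4.6% behind the handwritten mapper, which is the pattern expected from code generators. A run where a generated mapper lands well ahead of Manual fails the same check.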

filiphr commented 4 years ago

Currently the benchmark is executed in parallel. Every mapper is executed on its own. Perhaps that leads to inconsistent results. Maybe we need to try running everything one by one on the same machine, similar to how @arey executes the logic locally on his laptop.

And I agree with @roookeee the handwritten mapper should be close to MapStruct and Selma. Both MapStruct and Selma generate Java code which should lead to similar benchmark for them.

arey commented 4 years ago

In the GitHub actions documentation I found the VM resources of the runners:

Do you think we could try an ARM32 architecture, which would not be based on an Intel CPU? It requires a self-hosted runner.

roookeee commented 4 years ago

I would propose a sequential instead of a parallel build (as outlined by @filiphr). Furthermore, I don't think ARM is a big target of Java or represents a big part of server-side / client-side Java programs. Having 2 cores and running mappers in parallel is bound to affect performance, so just run them one after another.

arey commented 4 years ago

Ok I'm trying a sequential benchmark: https://github.com/arey/java-object-mapper-benchmark/actions/runs/139813857

filiphr commented 4 years ago

Having 2 cores and running mappers in parallel is bound to be affecting performance, so just do them one after another.

Just FYI, running in parallel means running on different runners. I doubt that actions are running at the same time on the same runner. However, GitHub must be doing something for resource sharing on their infrastructure.

filiphr commented 4 years ago

I just saw your commit @arey. What I meant by running sequentially was to run everything in one action. I think that running in different actions means that they are running on different runners, so there might be no difference between parallel or not.
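As an illustration of "everything in one action", a single job step could start the mappers back to back from one process, so they all share the same runner. This is only a sketch under assumptions: the jar path `target/benchmarks.jar` is the usual JMH uber-jar location and the `@Param` field name `type` is a guess, so both may differ in this repository. The mapper names are the ones from the results table above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class SequentialBenchmarkRun {

    // Assumptions: standard JMH uber-jar location and a @Param field called "type".
    static final String JAR = "target/benchmarks.jar";
    static final String PARAM = "type";

    /** Builds the JMH command line that benchmarks a single mapper. */
    static List<String> commandFor(String mapper) {
        // "-p name=value" is the JMH option that restricts a @Param to the given value.
        return Arrays.asList("java", "-jar", JAR, "MapperBenchmark", "-p", PARAM + "=" + mapper);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> mappers = Arrays.asList(
                "Manual", "MapStruct", "Selma", "JMapper", "datus", "Orika", "ModelMapper", "BULL");
        for (String mapper : mappers) {
            // waitFor() blocks, so the next mapper only starts once this one has finished.
            int exitCode = new ProcessBuilder(commandFor(mapper)).inheritIO().start().waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("Benchmark failed for " + mapper);
            }
        }
    }
}
```

Running the JMH jar once without `-p` also goes through all parameter values one after another; the loop above mainly makes the ordering explicit and keeps one log section per mapper.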

arey commented 4 years ago

You're right @filiphr the results are the same.