Few suggestions to increase reliability

franz1981 commented 3 weeks ago

There are a few things which can affect the reliability of the measurements here:

1) GC ergonomics: we need to hard-set the GC algorithm, otherwise the JVM ergonomics, which depend on the number of cores, could change it, impacting pretty much all metrics. And some of the metrics could be surprisingly different.
2) Use --cpu-set to establish a recommended number of cores (3 should be the minimum for JVM mode, given that JIT compilation at startup hits very badly).
3) Setting min/max heap size is a good idea, in general, but can be a double-edged sword too. The frequency of GCs and the lifetime of the live set (how long your objects really live) can lead to more frequent GCs that keep the RSS lower (the heap is expanded based on GC frequency vs the estimated additional memory required). My suggestions to do this right/reliably are not easy to summarize in a few lines, tbh, but you can check the heap required for a minimum number of GC cycles (forcing parallel or G1) for both applications, and use AlwaysPreTouch (it makes the RSS due to the heap always fixed and the same - but the framework which allocates too much, ehm ehm, will take a startup hit, because the memory is not enough and more GC activity is required). This will still make clear that, given the same memory constraints, one of the 2 frameworks is a clear winner.
4) The startup time is never the one reported by the application framework but the time to the first successful request, and to do it "right" you should have a sync && drop_caches option to make sure the OS page cache won't save you from really reading the files from disk (@edeandrea knows how to do it for MacOS too). In a real deployment with containers it will always happen (and it has some devastating impacts on laptops, you can try it) - and in theory with docker, too, but doing it regardless is the safer choice AFAIK.
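
To make the "time for the first successful request" idea concrete, here is a minimal Java sketch (Java 11+), not what the demo repository actually does. The class name `FirstRequestTimer`, the default URL and the 5 ms pause are assumptions made for illustration. The pause between probes is there so the service is not hammered, and the timer should be started at the same moment the application process is launched.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class FirstRequestTimer {

    /**
     * Polls the probe until it reports success.
     * Returns the elapsed nanoseconds, or -1 on timeout or interruption.
     */
    public static long nanosUntilFirstSuccess(BooleanSupplier probe, Duration pause, Duration timeout) {
        long start = System.nanoTime();
        while (System.nanoTime() - start < timeout.toNanos()) {
            if (probe.getAsBoolean()) {
                return System.nanoTime() - start;
            }
            try {
                // Short pause between probes so we do not flood the service.
                Thread.sleep(pause.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    /** Probe that succeeds once a GET on the URL answers 200; I/O errors mean "not ready yet". */
    public static BooleanSupplier httpOk(String url) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(200))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(1))
                .GET()
                .build();
        return () -> {
            try {
                return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode() == 200;
            } catch (java.io.IOException e) {
                return false; // connection refused, reset, timeout: keep polling
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        };
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; pass the real one as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        long nanos = nanosUntilFirstSuccess(httpOk(url), Duration.ofMillis(5), Duration.ofSeconds(30));
        System.out.println(nanos < 0
                ? "no successful request before the timeout"
                : "first successful request after " + nanos / 1_000_000 + " ms");
    }
}
```

The probe is passed in as a `BooleanSupplier` so the timing loop can be exercised without a running server. The measured value includes the polling granularity, so it is only as precise as the pause allows.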

cescoffier commented 3 weeks ago

1) I was thinking of using parallel.

2) We cannot set cpu-set as the number of cores associated with the GitHub agent is not fixed

3) I always do it to have the same value, which may be good or bad for all variants.

4) I do not compute the startup time. That's where we need specific infrastructure, as hammering the service will just overload the OS queues and report wrong numbers.

franz1981 commented 3 weeks ago

> We cannot set cpu-set as the number of cores associated with the GitHub agent is not fixed

So let's go with one core and accept the JIT slowness. A different number of cores affects how many spare CPU cycles can be used to handle GC (unless SerialGC is used, really) because the same CPUs are shared by application, GC and JIT tasks. That said, it should affect the numbers, but not the "trends", i.e. a framework which is slow to start because it allocates too much, uses reflection or just performs wasteful work will just become slower, especially relative to another which does all the other things right (guess who? ihih)
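
Since the runner's core count is not fixed, one cheap safeguard is to log what the JVM actually ended up with on each run. The following Java sketch uses only the standard `java.lang.management` API; the class name `JvmErgonomicsReport` and the output format are made up for illustration. Collector names differ per GC (for example, names starting with "G1" or "PS"), so no specific output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmErgonomicsReport {

    /** Names of the collectors this JVM is really running with. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** One-line summary of the settings that ergonomics may have chosen for us. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "cpus=" + rt.availableProcessors()
                + " maxHeapMiB=" + rt.maxMemory() / (1024 * 1024)
                + " collectors=" + collectorNames()
                + " jvmArgs=" + ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Storing this line next to each measurement makes it easy to spot runs where the collector or the visible CPU count changed between executions.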

> I always do it to have the same value, which may be good or bad for all variants.

Without AlwaysPreTouch, the actual heap used depends too much on the GC, and the RSS as a consequence. My suggestion is to limit the heap size AND use AlwaysPreTouch, sizing it to match the minimum heap required to run the cheaper framework decently, e.g. test quarkus with X MB max heap AND AlwaysPreTouch and observe the minimum value which doesn't affect the load, then use that value for the other framework - and have fun :D
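
A rough Java sketch of the launch commands for such a sweep, assuming the parallel collector mentioned earlier in the thread. `-XX:+UseParallelGC`, `-Xms`, `-Xmx` and `-XX:+AlwaysPreTouch` are standard HotSpot options; the class name `FixedHeapCommand`, the `app.jar` path and the heap sizes are placeholders, not values from the repository.

```java
import java.util.List;

public class FixedHeapCommand {

    /** Builds a java command line with a pinned GC and a fixed, pre-touched heap. */
    public static List<String> command(String jar, int heapMiB) {
        return List.of(
                "java",
                "-XX:+UseParallelGC",    // pin the collector instead of trusting ergonomics
                "-Xms" + heapMiB + "m",  // min == max so the heap never resizes
                "-Xmx" + heapMiB + "m",
                "-XX:+AlwaysPreTouch",   // touch every heap page at startup so heap RSS is fixed
                "-jar", jar);
    }

    public static void main(String[] args) {
        // Hypothetical sweep: shrink the heap until the load test starts to suffer,
        // then reuse the last good value for the other framework.
        for (int heapMiB : new int[] {512, 256, 128, 64}) {
            System.out.println(String.join(" ", command("app.jar", heapMiB)));
        }
    }
}
```

The list form can be handed directly to `ProcessBuilder` if the sweep is automated rather than run by hand.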

> I do not compute the startup time. That's where we need specific infrastructure, as hammering the service will just overload the OS queues and report wrong numbers.

:+1:

cescoffier / spring-to-quarkus-demo

Few suggestions to increase reliability #14