Open msmith-techempower opened 4 years ago
It turns out that `openjdk:11.0.3-jre-stretch` and `openjdk:11.0.3-jdk-stretch` are not the same underlying JRE (thanks to @nbrady-techempower for finding these):
`openjdk:11.0.3-jre-stretch`:
`openjdk:11.0.3-jdk-stretch`:
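
A minimal sketch of how the comparison between the two images could be repeated, assuming Docker is available and both tags can still be pulled. The helper names `image_java_version` and `compare_java_versions` are hypothetical and not part of the benchmark toolset.

```shell
# Hypothetical helper: print the `java -version` output of a Docker image.
# `java -version` writes to stderr, hence the 2>&1 redirect.
image_java_version() {
  docker run --rm "$1" java -version 2>&1
}

# Hypothetical helper: report whether two version strings are identical.
compare_java_versions() {
  if [ "$1" = "$2" ]; then
    echo "same"
  else
    echo "different"
  fi
}

# Example usage (requires Docker, so it is left commented out here):
# compare_java_versions \
#   "$(image_java_version openjdk:11.0.3-jre-stretch)" \
#   "$(image_java_version openjdk:11.0.3-jdk-stretch)"
```

This only compares the version banner; two images with the same banner could still differ in other ways.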
`gemini` is not leaking connections in its `plaintext` test (thanks to @michaelhixson for finding this):
Repro steps:
1. `tfb --test gemini --mode debug`
2. `docker ps | grep gemini` to find the container id
3. `docker exec -it <container-id> bash`
4. Inside the `gemini-mysql` container, run `watch 'ss -tan | wc -l'` to continuously print out the total number of connections
5. From another bash session on the host, run `docker run --rm techempower/tfb.wrk wrk -H 'Host: host.docker.internal' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 15 -c 512 --timeout 8 -t 8 "http://host.docker.internal:8080/update?queries=20"`
6. In the terminal running the `watch` command, watch the number of connections climb continuously
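
The `wc -l` count above includes the header row and sockets in every state. A minimal sketch of a per-state breakdown, which may help tell established connections apart from sockets lingering in other states; `count_by_state` is a hypothetical helper and assumes the state is the first column of `ss -tan` output.

```shell
# Hypothetical helper: tally TCP sockets by state from `ss -tan` style
# output read on stdin. The first line is assumed to be the header row.
count_by_state() {
  awk 'NR > 1 && NF > 0 { counts[$1]++ }
       END { for (s in counts) print s, counts[s] }' | sort
}

# Example usage inside the container (left commented out here):
# while true; do ss -tan | count_by_state; echo "---"; sleep 2; done
```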
The same pattern shows up for php and nginx, and perhaps for more languages.
https://tfb-status.techempower.com/timeline/php/plaintext https://tfb-status.techempower.com/timeline/nginx/plaintext
There is also a big drop on 18 June, 2019. I thought it was due to the CVE-2019-1147x patches (but those are only for Microsoft systems).
The Framework Timeline is a really good tool :wave:. It would be even better with annotations marking the big changes in the benchmark.
I have been investigating a strange problem for some time. After checking the Timeline, curiously, it also starts on 18 June, 2019.
In the latest runs Kumbiaphp-raw is slower than Kumbiaphp with the ORM. That does not make any sense, and I think it affects plain PHP as well.
Fortunes Test | Round 18 | Current runs |
---|---|---|
PHP | 129,288 | 95,832 |
Kumbiaphp raw | 90,377 | 73,245 |
Kumbiaphp orm | 76,710 | 73,752 |
https://tfb-status.techempower.com/timeline/php/fortune https://tfb-status.techempower.com/timeline/kumbiaphp-raw/fortune https://tfb-status.techempower.com/timeline/kumbiaphp/fortune
It should be impossible for the raw version to be slower than the ORM version, yet that is the case in all the runs after 18 June.
I had suspected a bad PHP stack config, but after reading this issue I think it may be a problem with the benchmark stack. I'll investigate that problem further.
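
To put numbers on the table above, here is a small sketch that computes the percentage change between the Round 18 figure and the more recent one. `pct_change` is a hypothetical helper name; the inputs are the values from the table.

```shell
# Hypothetical helper: percentage change from an earlier value ($1)
# to a later value ($2), rounded to one decimal place.
pct_change() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 129288 95832   # PHP:           -25.9
pct_change 90377 73245    # Kumbiaphp raw: -19.0
pct_change 76710 73752    # Kumbiaphp orm: -3.9
```

By this measure the raw variants lost far more than the ORM variant, which fits the idea that the faster tests are hit hardest.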
@joanhey Below is the graph for Kumbiaphp, for reference, and it does indeed see that dip on June 18, 2019. Curiously, it seems to recover on Nov 20, 2019.
I have edited the original post to indicate that on Jun 18, 2019, @nbrady-techempower applied the Spectre/Meltdown kernel patches, and we believe that those account for the dip.
Yes, it recovers on Nov 20, like plain PHP. But I can't understand the reason. There were no changes in the nginx config or the PHP code, and no new minor versions (PHP 7.3.x or nginx). In Jan 2020 it switched to PHP 7.4, and we can see a small rise.
Curiously, nginx alone drops on Nov 20, 2019.
I believe we have an answer to that now.
Nov 20 is when we switched back from CentOS to Ubuntu, and we did not apply [this iptables rule](https://news.ycombinator.com/item?id=20205566) which was previously applied on the CentOS install.
That dip from Jun 18 to Nov 20 appears to be directly related to that particular rule being in place.
I think there should be a timeline with all of those changes in one place: a chronological history of the changes on a web page.
https://github.com/TechEmpower/tfb-status/issues/21 Yes, I want that.
I was troubleshooting what I believed to be a performance degradation in `Gemini` (and spent a lot of time doing so) when I believe I came to the realization that it is a problem not in Gemini proper. This issue will lay out all the information we have gathered.
For those unfamiliar, it is my pleasure to introduce the Framework Timeline which graphs the continuous benchmark results over time. This tool is great for illustrating the arguments that I will be laying out. This link is to the `plaintext` results for `gemini`.
The following is an annotated graph from `gemini`'s Framework Timeline:
Our best guess is that this is a dip from #4850 which changed the base image of many Java test implementations. The timing lines up pretty much exactly, though it is a bit of a mystery as to why moving from `openjdk-11.0.3-jre-slim` to `openjdk-11.0.3-jdk-slim` would have a performance impact.
Found an email chain wherein @nbrady-techempower confirmed that he once again applied Spectre/Meltdown patches and an `iptables` rule from this
`gemini` on Citrine (Ubuntu) - roughly 1.2M `plaintext` RPS
`gemini` on Citrine (Ubuntu) - roughly 700K `plaintext` RPS
The following shows the data table for Servlet frameworks written in Java for Round 18 published July 9, 2019 which is between number 6 and 7 on the above graph.
Comparing that with the data table for the same test implementations from the run completed on April 1, 2020 which is the last graphed day (as of this writing) on `gemini`'s Framework Timeline.
This shows degradation across the board for Java applications, but some are impacted more than others.
For comparison, the following is `servlet`'s `plaintext` Framework Timeline:
We merged in some updates to Gemini today which included updating the Java base image to `openjdk-11.0.7-slim` which should be the same as `openjdk-11.0.7-jdk-slim`. So, if there was some weirdness with `openjdk-11.0.3-jdk-slim` from #4850 then the next run will show improved `plaintext` numbers for Gemini.
However, that may be unrelated, so other tests I will probably do in the next hour or two:
[ ] - Downgrade `tapestry` to `openjdk:11.0.3-jre-stretch` which was the version prior to #4850
[ ] - Upgrade `wicket` to `openjdk:11.0.7-slim` which would eliminate any question if `gemini` improves and `wicket` improves
[X] - Verify versions of `openjdk:11.0.3-jre-stretch` and `openjdk:11.0.3-jdk-stretch` have the same underlying JRE (see below)
[X] - Verify `gemini` plaintext are not leaking connections (see below)