VladRodionov opened 5 months ago
This feature is essential for any application which works with off heap memory directly.
Will add unit tests.
Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.
Project coverage is 57.88%. Comparing base (c76455c) to head (b904897). Report is 10 commits behind head on master.
:exclamation: Current head b904897 differs from pull request most recent head c75f02f. Please upload reports for the commit c75f02f to get more accurate results.
Sure, will add test this weekend. Thank you for the review @luben
These binary files should not be checked in git - I re-build them on each supported platform for each release.
Files have been removed.
Basing anything off sun.misc.Unsafe behavior is not a good idea anymore. It has never been a supported API, its use has been discouraged for years, and it is now being formally deprecated.
There are 2 active JEPs that are almost done with their implementations and rollout in OpenJDK.
It's a long way until all Java code with direct sun.misc.Unsafe access is ported to JDK 21+ (Java FFM); meanwhile, we need to support at least JDK 11+. Performance-wise, Unsafe is still the champion, at least for direct memory access.
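For readers unfamiliar with the pattern being debated, here is a minimal sketch (not code from this patch; the class and method names are made up) of how libraries typically obtain `sun.misc.Unsafe` reflectively and read/write off-heap memory directly:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeRoundTrip {
    // Write a long to malloc-style native memory and read it back.
    static long roundTrip(long value) throws Exception {
        // The canonical (unsupported) way to get the Unsafe singleton.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long addr = unsafe.allocateMemory(8); // raw allocation, outside the Java heap
        try {
            unsafe.putLong(addr, value);      // direct store to the native address
            return unsafe.getLong(addr);      // direct load from the native address
        } finally {
            unsafe.freeMemory(addr);          // manual free, like C's free()
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Long.toHexString(roundTrip(0xCAFEBABEL))); // prints "cafebabe"
    }
}
```

This compiles on JDK 8 through recent JDKs (with warnings on newer ones), which is exactly the compatibility window being argued about.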
Performance-wise, Unsafe no longer wins. Eclipse Jetty removed Unsafe a few years ago, and the various performance metrics have improved.
Jetty? Can it handle 500K+ RPS out of the box? I really doubt it :). FFM is finally on par with JNI or slightly better, but for direct memory access and manipulation of bits and bytes outside the Java heap, Unsafe is the champ. And you missed my requirements: JDK 11+ support (actually Java 8+). Per a 2024 Java ecosystem report, almost 30% are still using Java 8, and the rest are on Java 11 and Java 17, all of which lack FFM support.
500K+ requests per second is not hard to do. You have to be mindful of network saturation in regards to request/response size and optional http details.
This has been done on an official release of Eclipse Jetty 10, and Jetty 11, and Jetty 12 servers (all of which do not have Unsafe operations anymore).
The setup is as follows: the http and server modules are usually sufficient to hit 400K/second. You can cross the 500K/second threshold by turning off various HTTP features (for example, turn off production of the Server header and the Date header). This setup results in sub-50-byte requests and sub-200-byte responses (or about 120 bytes on the network for a request, and 280 bytes for a response), which is only really useful for load testing the server for requests-per-second and latency metrics.
When I monitor (with something like Wireshark) with one client to confirm the setup, I look at the total bytes on the network, wanting something under 400 bytes per request/response exchange and no FIN (we should be using persistent connections).
Hitting 510k requests per second is very attainable on a 10GbE network against a Jetty Server with a decent networking interface (some crappy 10GbE interfaces cannot get close to even 20% saturation).
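A quick back-of-envelope check of the bandwidth claim above (assuming the ~400 bytes per exchange figure stated; the class below is illustrative, not from either codebase):

```java
public class RpsBandwidth {
    // Fraction of a 10 Gb/s link consumed by a given request rate
    // and total bytes on the wire per request/response exchange.
    static double linkUtilization(long rps, long bytesPerExchange) {
        double bitsPerSecond = rps * bytesPerExchange * 8.0;
        return bitsPerSecond / 10e9; // 10GbE
    }

    public static void main(String[] args) {
        // 510K RPS at ~400 bytes per exchange: 1.632 Gb/s, ~16% of the link,
        // so the 10GbE network itself is not the bottleneck in this setup.
        System.out.printf("utilization = %.1f%%%n",
                linkUtilization(510_000, 400) * 100);
    }
}
```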
Java 8 went EOSL in many contexts already (e.g., Google Cloud dropped it in Jan 2024). Many Java 11 providers have it going EOSL at the end of this year too (e.g., Red Hat in October, Google in December).
https://medium.com/deno-the-complete-reference/netty-vs-jetty-hello-world-performance-e9ce990a9294
That is far from 510K RPS. Maybe it's attainable, maybe it's not; not, I presume. Any Java network server that utilizes any type of thread-pool executor will be handicapped by significant thread context-switch overhead. You are free to share links confirming that 510K RPS is attainable for Jetty. I have not managed to find any proof of that statement; on the contrary, I found many benchmarks with abysmal performance and latency numbers. I am the developer of Memcarrot, a memcached-compatible caching server written in Java with a heavy dosage of sun.misc.Unsafe. All memory management is manual (malloc(), free()). The server can run in less than 100MB of Java heap while storing hundreds of millions of cached objects. Below are yesterday's test results (the standard testing tool memtier_benchmark was used):
```
parallels@ubuntu-linux-22-04-02-desktop:~$ memtier_benchmark -p 11211 -P memcache_text --test-time=100
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 100 secs]  0 threads: 53594005 ops, 535766 (avg: 535922) ops/sec, 15.35MB/sec (avg: 15.31MB/sec), 0.37 (avg: 0.37) msec latency

4         Threads
50        Connections per thread
100       Seconds

ALL STATS
============================================================================================================================
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
----------------------------------------------------------------------------------------------------------------------------
Sets        48721.12          ---          ---         0.37491         0.35900         0.66300         0.93500      3325.15
Gets       487201.32       634.00    486567.32         0.37314         0.35900         0.66300         0.94300     12355.39
Waits           0.00          ---          ---             ---             ---             ---             ---          ---
Totals     535922.44       634.00    486567.32         0.37331         0.35900         0.66300         0.94300     15680.54
```
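As a sanity check on the table above, the Totals row is simply the per-type sum, which also matches the average ops/sec reported on the run line (illustrative snippet, not part of either project):

```java
public class MemtierTotals {
    // Sum the per-type ops/sec rows; this should reproduce the Totals row.
    static double total(double sets, double gets, double waits) {
        return sets + gets + waits;
    }

    public static void main(String[] args) {
        // 48721.12 (Sets) + 487201.32 (Gets) + 0.00 (Waits) = 535922.44
        System.out.printf("total = %.2f ops/sec%n",
                total(48721.12, 487201.32, 0.00));
    }
}
```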
This is 535K RPS with p99.9 latency of less than 1ms. These numbers are within 5% of native memcached. The test was run on a Mac Studio M1 (64GB RAM).
Other benchmark results (memory consumption, surprise, surprise) are here: https://github.com/carrotdata/membench
Memcarrot will be released next week. sun.misc.Unsafe made it possible. This is why we need direct access to off-heap memory, and I am not sure that the code can be rewritten with the FFM API.
> https://medium.com/deno-the-complete-reference/netty-vs-jetty-hello-world-performance-e9ce990a9294
An unconfigured Jetty, tested on the same machine: that person just tested the performance of their localhost network stack, nothing else. That is a horrible set of tests and doesn't test the performance of Jetty. They used jetty-maven-plugin:run, whose configuration is aimed at developer needs, not performance. The configuration they used also did zero tuning of the HTTP exchange. I bet their Jetty server was barely being used; they simply couldn't generate enough load (a super common scenario when attempting to load test on the same machine).
> Any Java network server that utilizes any type of thread-pool executor will be handicapped by significant thread context-switch overhead.
Jetty doesn't use the native JVM thread-pool executors; it has its own, plus an EatWhatYouKill model that minimizes thread context switching. We even see improvements in CPU caching with this model.
When we participated in the TechEmpower benchmarks years ago (back in the Jetty 10.0.0 days) we were consistently in the top 5%, and once we learned the tricks of those above us we could easily get into the top 3%, but those tricks did not represent real-world scenarios.
> I am the developer of Memcarrot, a memcached-compatible caching server written in Java with a heavy dosage of sun.misc.Unsafe. All memory management is manual (malloc(), free()). The server can run in less than 100MB of Java heap while storing hundreds of millions of cached objects. Below are yesterday's test results (the standard testing tool memtier_benchmark was used):
Congrats, that's a really fantastic outcome.
Anyway, this has devolved into a totally different set of arguments. Do what you want; it is your repo, after all.
Eclipse Jetty just has to monitor how the new JVMs react to our usage of the current state of zstd-jni. (So far it looks like we will have to, at a minimum, document the demands zstd-jni puts on ByteBufferPool implementations, and the JVM command-line switches necessary to allow zstd-jni to function.)
This PR introduces support for handling native memory buffers that are allocated using the sun.misc.Unsafe.allocateMemory API. With this update, it is now possible to compress and decompress data between two native memory buffers, as well as transfer data from a byte array to native memory and vice versa.
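The byte-array-to-native-memory transfer half of that description can be sketched with plain JDK Unsafe alone (the zstd-jni entry points added by this PR are not reproduced here; `NativeCopy` and `roundTrip` are made-up names for illustration):

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

public class NativeCopy {
    // Copy a byte[] into memory from Unsafe.allocateMemory and back again,
    // the kind of transfer this PR wires up around compression calls.
    static byte[] roundTrip(byte[] src) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe u = (Unsafe) f.get(null);

        long addr = u.allocateMemory(src.length);
        try {
            // heap array -> native memory (null base means absolute address)
            u.copyMemory(src, Unsafe.ARRAY_BYTE_BASE_OFFSET, null, addr, src.length);
            // ...a compress/decompress call taking raw addresses would go here...
            // native memory -> fresh heap array
            byte[] dst = new byte[src.length];
            u.copyMemory(null, addr, dst, Unsafe.ARRAY_BYTE_BASE_OFFSET, src.length);
            return dst;
        } finally {
            u.freeMemory(addr);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] in = "zstd".getBytes();
        System.out.println(Arrays.equals(in, roundTrip(in))); // prints "true"
    }
}
```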