This project provides integration modules for the invesdwin-context module system.
Releases and snapshots are deployed to this maven repository:
https://invesdwin.de/repo/invesdwin-oss-remote/
Dependency declaration:
<dependency>
<groupId>de.invesdwin</groupId>
<artifactId>invesdwin-context-integration-ws</artifactId>
<version>1.0.2</version><!---project.version.invesdwin-context-integration-parent-->
</dependency>
The core integration module resides in the invesdwin-context repository; you can read more about it there. The modules here build on top of it to provide more specific integration functionality. For instance, the AMQP client connection can be configured via the following system properties:
de.invesdwin.context.integration.amqp.AmqpClientProperties.HOST=localhost
# -1 means to use the default port
de.invesdwin.context.integration.amqp.AmqpClientProperties.PORT=-1
de.invesdwin.context.integration.amqp.AmqpClientProperties.USER=guest
de.invesdwin.context.integration.amqp.AmqpClientProperties.PASSWORD=guest
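For example, these defaults could be overridden programmatically before the application bootstraps; the sketch below just sets the documented system properties (putting them into the $HOME/.invesdwin/system.properties file or a distribution module works just as well):

```java
public final class AmqpConfigExample {
    public static void main(final String[] args) {
        // override the documented AMQP connection defaults via system properties
        System.setProperty("de.invesdwin.context.integration.amqp.AmqpClientProperties.HOST", "broker.example.org");
        System.setProperty("de.invesdwin.context.integration.amqp.AmqpClientProperties.PORT", "5672");
        // ... then start your invesdwin-context based application as usual
    }
}
```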
Spring Batch job definitions can be placed at /META-INF/batch/ctx.batch.*.xml so they get automatically discovered during application bootstrap. The IJobService provides convenience methods for running and managing these jobs. The invesdwin-context-integration-batch-admin module embeds the spring-batch-admin web frontend into your application. It provides a convenient UI to manually fiddle around with your jobs on an ad-hoc basis (it uses the context path root at <WEBSERVER_BIND_URI>/ of your embedded web server).
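As an illustration, a job defined in one of those ctx.batch XML files could reference a tasklet like the following minimal sketch (standard spring-batch Tasklet API; the class name is made up):

```java
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

// referenced as a step tasklet from a /META-INF/batch/ctx.batch.*.xml job definition
public class HelloWorldTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(final StepContribution contribution, final ChunkContext chunkContext) {
        System.out.println("running inside an automatically discovered spring-batch job");
        return RepeatStatus.FINISHED;
    }
}
```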
Annotate your spring controllers with @Controller for them to be automatically available under the context path <WEBSERVER_BIND_URI>/spring-web/... of your embedded web server. This module also adds support for a web service registry via the IRegistryService
class which connects as a client to your own registry server (read more on that in the WS-Registry topic later). Lastly this module also adds support for SOAP web services via spring-ws. This allows for easy use of enterprise web services as spring integration channels (the context path is <WEBSERVER_BIND_URI>/spring-web/...
of your embedded web server). You can register your own web services in the registry server via the classes RestWebServicePublication
for REST and XsdWebServicePublication
for SOAP respectively. The class RegistryDestinationProvider
can be used to bind to an external web service that is looked up via the registry server. The following system properties are available to change your registry server URL and the credentials for your web service security (message encryption is disabled per default since you should rather use an SSL-secured web server (maybe even a proxy) for this, as most other methods are slower and more complicated in comparison):
# it makes sense to put these settings into the $HOME/.invesdwin/system.properties file so they apply to all processes (alternatively override these in your distribution modules)
de.invesdwin.context.integration.ws.IntegrationWsProperties.REGISTRY_SERVER_URI=http://please.override.this/spring-web
de.invesdwin.context.integration.ws.IntegrationWsProperties.WSS_USERNAMETOKEN_USER=invesdwin
de.invesdwin.context.integration.ws.IntegrationWsProperties.WSS_USERNAMETOKEN_PASSWORD=invesdwin
de.invesdwin.context.integration.ws.IntegrationWsProperties.SPRING_WEB_USER=invesdwin
de.invesdwin.context.integration.ws.IntegrationWsProperties.SPRING_WEB_PASSWORD=invesdwin
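Coming back to the @Controller mechanism above, here is a minimal sketch of such a controller (class name and mapping are made-up examples using standard Spring MVC annotations):

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.ResponseBody;

// with the mapping below this would be reachable under <WEBSERVER_BIND_URI>/spring-web/status
@Controller
public class StatusController {
    @GetMapping("/status")
    @ResponseBody
    public String status() {
        return "OK";
    }
}
```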
Annotate your @Named spring beans additionally with @Path to expose them via JAX-RS as RESTful web services (the context path is <WEBSERVER_BIND_URI>/jersey/... of your embedded web server).
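A minimal sketch of such a JAX-RS resource (names are made up; whether the javax or jakarta namespaces apply depends on your dependency versions):

```java
import javax.inject.Named;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// an @Named spring bean that is additionally annotated with @Path,
// so it gets exposed below <WEBSERVER_BIND_URI>/jersey/...
@Named
@Path("/status")
public class StatusResource {
    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String status() {
        return "OK";
    }
}
```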
Alternatively, web services can be registered via an IContextLocation for the invesdwin-context application bootstrap (the context path is <WEBSERVER_BIND_URI>/cxf/... of your embedded web server).

To run your own registry server, deploy it (e.g. via the invesdwin-context-integration-ws-registry-dist distribution module) and make sure all clients are configured to the appropriate server via the de.invesdwin.context.integration.ws.IntegrationWsProperties.REGISTRY_SERVER_URI system property. The registry server is available at the context path <WEBSERVER_BIND_URI>/spring-web/registry/ of your embedded web server.

The local and remote Maven repositories can be configured via the following system properties:
de.invesdwin.context.integration.maven.MavenProperties.LOCAL_REPOSITORY_DIRECTORY=${user.home}/.m2/repository
de.invesdwin.context.integration.maven.MavenProperties.REMOTE_REPOSITORY_1_URL=http://invesdwin.de/artifactory/invesdwin-oss-remote
# add more remote repositories by incrementing the id (though keep them without gaps)
#de.invesdwin.context.integration.maven.MavenProperties.REMOTE_REPOSITORY_2_URL=
Use AffinityProperties.setEnabled(boolean) to enable/disable the affinity during runtime without having to remove the affinity setup code during testing. Please read the original documentation of the library to learn how to adjust your threads to actually apply the affinity.

ConfiguredJPPFClient
provides access to submitting jobs and among other things a JPPFProcessingThreadsCounter
which gives reliable information about how many drivers, nodes and processing threads are available in your grid. Also you can override any JPPF internal properties via the system properties override mechanisms provided by invesdwin-context. Though the only properties you should worry about changing would be the following ones:
# replace this with a secret token for preventing public access
de.invesdwin.context.integration.jppf.JPPFClientProperties.USERNAME_TOKEN=invesdwin
# enable local execution, disable it if you have a node running on this computer
jppf.local.execution.enabled=true
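A task submitted to the grid could look like this minimal sketch (assuming JPPF's standard AbstractTask API; the class name is made up and submission itself would go through ConfiguredJPPFClient as described above):

```java
import org.jppf.node.protocol.AbstractTask;

// executed on a JPPF node (or locally if local execution is enabled)
public class HelloGridTask extends AbstractTask<String> {
    @Override
    public void run() {
        setResult("hello from " + Thread.currentThread().getName());
    }
}
```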
Use the @JPPFServerTest annotation to enable a server in your unit tests. You need at least one server instance deployed in your grid. Though if multiple ones are deployed, they will add the benefit of improved reliability and more dynamic load balancing, while also allowing more flexible control over the topology, since not all clients or nodes might be able to reach each other when multiple networks and firewalls are involved. In that case server instances can operate as bridges. The interesting properties here are:
# a local node reduces the overhead significantly, though you should disable it if a node is deployed on the same server or when the server should not execute jobs
jppf.local.node.enabled=true
# per default a random port will be picked
jppf.server.port=0
jppf.management.port=0
Use the @JPPFNodeTest annotation to enable a node in your unit tests. The node will cycle through the servers that were returned by a ws-registry lookup and connect to the first one that can be reached properly. If you want to reduce the overhead of the remote classloader a bit, you can either deploy whole jars with your jobs (via the JPPF ClassPath mechanism) or deploy the most common classes directly in your node's JVM classpath. If you require a spring context in your tasks, you have to initialize and tear it down inside the task properly. If you require access to invesdwin modules that are accessible via the MergedContext, then you have to deploy them alongside your node module or use an isolated classloader in your task which initializes its own context.

Nodes will publish their metadata to an FTP server (those modules will be explained later) so that counting nodes and processing threads is a bit more reliable and works even with offline nodes (nodes that are behind firewalls and thus invisible through JMX or which are explicitly in offline mode). These counts can then be used for manually splitting workloads over multiple jobs. For example, when you want to reduce the JPPF overhead (of remote classloading and context setup) or even the context switches of threads inside the nodes, you can submit jobs that have only one task which contains a chunked bit of work. This chunked bit of work is split among the available threads in the node inside the task (manually by your code, see the sketch below) and then processed chunk-wise without a thread needing to pick its next work item, since all of them were given to it in the beginning. The task waits for all the work to be finished in order to compress and upload the results as a whole (which also saves time when uploading results). Since JPPF nodes can only process one job at a time (though multiple tasks in a job are handled more efficiently than multiple jobs; this limitation exists due to potential issues with static fields and native libraries you might encounter when mixing various different tasks without knowing what they might do), this gives you more fine-grained control to extract the most performance from your nodes depending on your workload.

When you submit lots of jobs from your client, it will make sure that there are enough connections available for each job by scaling connections up dynamically within limits (because each job requires its own connection in JPPF). This solves some technical limitations of JPPF for the benefit of staying flexible.
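The chunking idea can be illustrated with plain JDK concurrency (this is not JPPF API, just the pattern that your single task would implement internally to spread its chunk of work over the node's threads):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ChunkedWorkExample {
    public static void main(final String[] args) throws Exception {
        final int threads = Runtime.getRuntime().availableProcessors();
        final ExecutorService executor = Executors.newFixedThreadPool(threads);
        final List<Callable<Long>> slices = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int offset = i;
            slices.add(() -> {
                long sum = 0;
                // each thread processes its pre-assigned slice of the workload
                for (long j = offset; j < 1_000_000; j += threads) {
                    sum += j;
                }
                return sum;
            });
        }
        long total = 0;
        // wait for all slices so the results can be compressed/uploaded as a whole
        for (final Future<Long> slice : executor.invokeAll(slices)) {
            total += slice.get();
        }
        executor.shutdown();
        System.out.println("total=" + total);
    }
}
```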
The FtpFileChannel provides a way to transmit the information for an FTP transfer to another party by serializing the channel object. This is useful for JPPF because you can get faster result upload speeds by uploading to FTP and sending the FtpFileChannel
as a result of the computation to the client, which then downloads the results from the FTP server. By utilizing AsyncFileChannelUpload
you can even make this process asynchronous which enables the JPPF computation node to continue with the next task while uploading the results in parallel. The client which utilizes AsyncFileChannelDownload
will wait until the upload is finished by the node to start the download. Timeouts and retries are applied to handle any failures that might happen. Though if this fails completely, the client should resubmit the job. The following system properties are available to configure the FTP credentials (you can override the FtpFileChannel.login()
method to use different credentials; FTP server discovery is supposed to happen via FtpServerDestinationProvider
as a ws-registry lookup):
de.invesdwin.context.integration.ftp.FtpClientProperties.USERNAME=invesdwin
de.invesdwin.context.integration.ftp.FtpClientProperties.PASSWORD=invesdwin
Use @FtpServerTest to enable the server in your unit tests. The following system properties are available:
de.invesdwin.context.integration.ftp.server.FtpServerProperties.PORT=2221
de.invesdwin.context.integration.ftp.server.FtpServerProperties.MAX_THREADS=200
# set to clean the server directory regularly of old files, keep empty or unset to disable this feature
de.invesdwin.context.integration.ftp.server.FtpServerProperties.PURGE_FILES_OLDER_THAN_DURATION=1 DAYS
Similarly, a WebdavFileChannel is available for use in AsyncFileChannelUpload and AsyncFileChannelDownload. The following system properties are available to configure the WebDAV credentials (you can override the WebdavFileChannel.login()
method to use different credentials; WebDAV server discovery is supposed to happen via WebdavServerDestinationProvider
as a ws-registry lookup):
de.invesdwin.context.integration.webdav.WebdavClientProperties.USERNAME=invesdwin
de.invesdwin.context.integration.webdav.WebdavClientProperties.PASSWORD=invesdwin
Use @WebserverTest when using invesdwin-context-webserver to enable the server in your unit tests (the context path is <WEBSERVER_BIND_URI>/webdav/ of your embedded web server). The following system properties are available:
# set to clean the server directory regularly of old files, keep empty or unset to disable this feature
de.invesdwin.context.integration.webdav.server.WebdavServerProperties.PURGE_FILES_OLDER_THAN_DURATION=1 DAYS
ProvidedMpiAdapter.getProvidedInstance()
automatically detects the environment (FastMPJ, MPJExpress, OpenMPI, MVAPICH2) depending on which adapter modules are on the classpath. The actual MPI classes are provided by the runtime into which the jobs have been launched, and the provider will determine the correct adapter to use. Thus it is fine to have all submodules on the classpath to make the binding decision at runtime automagically. The IMpiAdapter is an abstraction for staying flexible about which implementation to bind to.
Tests for these adapters reside in invesdwin-context-integration-mpi-test. There we create a Fat-Jar from the running process, invoke the FastMPJ/MPJExpress/OpenMPI/MVAPICH2 launcher to execute the packaged job on a local cluster and wait for the job to finish. For Hadoop/YARN we test with a local Docker instance and provide an example for a Jailbreak to upgrade job executions from Java 8 or 11 (supported by Hadoop as of Q1 2023) to Java 17 (required by Spring 6 and Jakarta).

Quote from golang: Do not communicate by sharing memory; instead, share memory by communicating.
The abstraction in invesdwin-context-integration-channel has its origin in the invesdwin-context-persistence-timeseries module. Though it can be used for inter-network, inter-process and inter-thread communication. The idea is to use the channel implementation that best fits the task at hand (heap/off-heap, by-reference/by-value, blocking/non-blocking, bounded/unbounded, queued/direct, durable/transient, publish-subscribe/peer-to-peer, unicast/multicast, client/server, master/slave) with the transport that is most efficient or useful at that spot. To switch to a different transport, simply replace the channel implementation without having to modify the code that uses the channel. Best effort was made to keep the implementations zero-copy and zero-allocation (where possible). If you find ways to further improve the speed, let us know!
One use case is running an ATimeSeriesUpdater in a separate process. Though this creates the problem of how to build an IPC (Inter-Process-Communication) mechanism that does not slow down the data access too much. As a solution, this module provides a unidirectional channel (you should use one channel for requests and a separate one for responses on a peer-to-peer basis) for blazing fast IPC on a single computer with the following tools:
PipeSynchronousReader and PipeSynchronousWriter provide implementations for classic FIFO/Named Pipes. SynchronousChannels.createNamedPipe(...) creates pipe files with the command line tool mkfifo/mknod on Linux/MacOSX and returns false when it fails to create one. On Linux you have to make sure to open each pipe with a reader/writer in the correct order on both ends, or else you will get a deadlock because opening blocks. The correct way to do this is to first initialize the request channel on both ends, then the response channel (or vice versa). SynchronousChannels.isNamedPipeSupported(...) tells you if named pipes are supported so you can fall back to a different implementation, e.g. on Windows (though you could write your own mkfifo.exe/mknod.exe implementation and put it into the PATH to make this work with Windows named pipes; at least currently there is no command line tool directly available for this on Windows, aside from makepipe which might be bundled with some versions of SQL Server, but we did not try to implement this because there is a better alternative to named pipes below that works on any operating system).

MappedSynchronousReader
and MappedSynchronousWriter
provide channel implementations that are based on memory mapped files (inspired by Chronicle-Queue and MappedBus while making it even simpler for maximum speed). This implementation has the benefit of being supported on all operating systems. It has less risk of accidentally blocking when things go really wrong because it is non-blocking by nature. And this implementation is slightly faster than Named Pipes for the workloads we have tested, but your mileage might vary and you should test it yourself. Anyway you should still keep communication to a minimum, thus making chunks large enough during iteration to reduce synchronization overhead between the processes, for this a bit of fine tuning on your communication layer might be required.
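As a reminder of the underlying primitive (plain NIO, not this module's API), two processes mapping the same file share a memory region like this; the channel implementations add the synchronization protocol on top:

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public final class MappedMemoryExample {
    public static void main(final String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("channel.bin", "rw")) {
            final MappedByteBuffer buffer = file.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buffer.putInt(0, 42); // visible to another process mapping the same file
            System.out.println("read back: " + buffer.getInt(0));
        }
    }
}
```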
SynchronousChannels.getTmpfsFolderOrFallback() gives the /dev/shm folder when available or the temp directory as a fallback, e.g. on Windows. Using tmpfs could improve the throughput a bit, since the OS will then not even try to flush the memory mapped file to disk.

NativeSocketSynchronousReader
and NativeSocketSynchronousWriter
provide implementations for TCP/IP socket communication for when you actually have to go across computers on a network. If you want to use UDP/IP you can use the classes NativeDatagramSynchronousReader
and NativeDatagramSynchronousWriter
, but beware that packets can get lost when using UDP, so you need to implement retry behaviours on your client/server. It is not recommended to use these implementations for IPC on localhost, since they are quite slow in comparison to other implementations. There are also Netty variants of these channels so you can use Epoll, KQueue and IOUring if you want. The goal of these implementations is to use as little Java code as possible for lowering latency with zero garbage and zero copy. They are suitable to be accelerated using special network cards that support kernel bypass with OpenOnload or its potentially faster alternatives. This might lead to an additional 60%-80% latency reduction. Aeron could also be used with the native MediaDriver together with kernel bypass for reliable low latency UDP transport. The Java process talks to the MediaDriver via shared memory, which removes JNI overhead (at the cost of thread context switching). Specialized high performance computing (HPC) hardware (buzzwords like Infiniband, RMA, RDMA, PGAS, Gemini/Aries, Myrinet, NVLink, iWarp, RoCE, OpenFabrics, ...) can be integrated with channels for OpenMPI, MPJExpress, JUCX and DISNI. IPoIB provides a kernel driver for sockets over Infiniband, but this is slower due to requiring syscalls like normal sockets. Kernel bypass latency reductions for Infiniband can only be achieved by our specialized transport integrations.

For server loads it might be better to use the standard asynchronous handler threading model of Netty via NettySocketAsynchronousChannel
or NettyDatagramAsynchronousChannel
using an IAsynchronousHandler
. This should scale to thousands of connections with high throughput. The SynchronousChannels are designed for high performance peer-to-peer connections, but they are also non-blocking and multiplexing capable. This should provide similar scalability to Netty without restrictions on the underlying transport implementation. Our unified channel abstraction makes higher level code (like protocol and topology) fully reusable.

QueueSynchronousReader
and QueueSynchronousWriter
provide implementations for non-blocking java.util.Queue
usage for inter-thread communication inside the JVM. The classes BlockingQueueSynchronousReader
and BlockingQueueSynchronousWriter
provide an alternative for java.util.concurrent.BlockingQueue
implementations like SynchronousQueue
which do not support non-blocking usage. Be aware that not all queue implementations are thread safe, so you might want to synchronize the channels via SynchronousChannels.synchronize(...)
. The various Disruptor implementations (LMAX, Conversant) might work better for high concurrency loads, though in our tests AgronaOneToOneQueue/AgronaManyToOneQueue/AgronaManyToManyQueue seem to perform best for inter-thread communication.

Waiting for messages on these channels can be done via ASpinWait
. It first spins when things are rolling and falls back to a fast sleep interval when communication has cooled down (similar to what java.util.concurrent.SynchronousQueue
does between threads). The actual timings can be fully customized. To keep CPU usage to a minimum, spins can be disabled entirely in favour of waits/sleeps. To use this class, just override the isConditionFulfilled()
method by calling reader.hasNext().
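As a rough illustration of the inter-thread case with plain JDK classes (not this module's API), the blocking queue based channels conceptually wrap an exchange like this:

```java
import java.util.concurrent.SynchronousQueue;

public final class InterThreadExample {
    public static void main(final String[] args) throws InterruptedException {
        final SynchronousQueue<String> queue = new SynchronousQueue<>();
        final Thread producer = new Thread(() -> {
            try {
                queue.put("hello"); // blocks until the consumer takes the message
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        System.out.println(queue.take()); // prints "hello"
        producer.join();
    }
}
```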
Old Benchmarks (2016, Core i7-4790K with SSD, Java 8):
DatagramSocket (loopback) Records: 111.01/ms in 90.078s => ~60% slower than Named Pipes
ArrayDeque (synced) Records: 127.26/ms in 78.579s => ~50% slower than Named Pipes
Named Pipes Records: 281.15/ms in 35.568s => using this as baseline
SynchronousQueue Records: 924.90/ms in 10.812s => ~3 times as fast as Named Pipes
LinkedBlockingQueue Records: 1,988.47/ms in 5.029s => ~7 times as fast as Named Pipes
Mapped Memory Records: 3,214.40/ms in 3.111s => ~11 times as fast as Named Pipes
Mapped Memory (tmpfs) Records: 4,237.29/ms in 2.360s => ~15 times as fast as Named Pipes
New Benchmarks (2021, Core i9-9900k with SSD, Java 17, mitigations=off; best in class marked by *):
Network NettyDatagramOio (loopback) Records: 1.00/s => ~99.99% slower (threading model unsuitable)
Network NettySocketOio (loopback) Records: 1.01/s => ~99.99% slower (threading model unsuitable)
Network NettyUdt (loopback) Records: 51.83/s => ~99.95% slower
Network JNetRobust (loopback) Records: 480.42/s => ~99.54% slower
Network BarchartUdt (loopback) Records: 808.25/s => ~99.24% slower
Network AsyncNettyUdt (loopback) Records: 859.93/s => ~99.19% slower
Network BidiNettyUdt (loopback) Records: 5.27/ms => ~95% slower
Network NngTcp (loopback) Records: 13.33/ms => ~87% slower
Network BidiMinaSctpApr (loopback) Records: 27.43/ms => ~74% slower
Thread NngInproc Records: 28.93/ms => ~73% slower
Network NettySocketIOUring (loopback) Records: 30.49/ms => ~71% slower
Network MinaSocketNio (loopback) Records: 32.41/ms => ~70% slower
Network BidiMinaSocketApr (loopback) Records: 32.96/ms => ~69% slower
Network NettySocketNio (loopback) Records: 34.74/ms => ~67% slower
Network CzmqTcp (loopback) Records: 35.24/ms => ~67% slower
Network JnanomsgTcp (loopback) Records: 35.90/ms => ~66% slower
Network AsyncMinaSctpApr (loopback) Records: 37.99/ms => ~64% slower (using async handlers for servers)
Network BidiMinaSocketNio (loopback) Records: 38.21/ms => ~64% slower
Network BidiBarchartUdt (loopback) Records: 38.40/ms => ~64% slower
Network JzmqTcp (loopback) Records: 39.37/ms => ~63% slower
Network NettyDatagramIOUring (loopback) Records: 40.24/ms => ~62% slower
Process JeromqIpc Records: 41.13/ms => ~61% slower
Network BidiNettyDatagramIOUring (loopback) Records: 41.96/ms => ~61% slower
Network BidiNettySocketNio (loopback) Records: 43.18/ms => ~59% slower
Network JeromqTcp (loopback) Records: 43.88/ms => ~59% slower
Process CzmqIpc Records: 45.13/ms => ~58% slower
Network FastMPJ (loopback) Records: 45.61/ms => ~57% slower
Network BidiNettyDatagramNio (loopback) Records: 47.84/ms => ~55% slower
Network NettyDatagramNio (loopback) Records: 49.04/ms => ~54% slower
Network MPJExpress (loopback) Records: 49.45/ms => ~54% slower
Network AsyncMinaSocketNio (loopback) Records: 53.45/ms => ~50% slower (using async handlers for servers)
Network AsyncNettySocketIOUring (loopback) Records: 54.06/ms => ~49% slower (using async handlers for servers)
Network BidiNettySocketHadroNio (loopback) Records: 55.29/ms => ~48% slower
Process JzmqIpc Records: 55.35/ms => ~48% slower
Network BidiMinaDatagramNio (loopback) Records: 55.51/ms => ~48% slower (unreliable)
Network AsyncNettySocketOio (loopback) Records: 58.76/ms => ~45% slower (using async handlers for servers)
Network AsyncNettyDatagramIOUring (loopback) Records: 59.29/ms => ~44% slower (unreliable; using async handlers for servers)
Network NettySocketHadroNio (loopback) Records: 59.44/ms => ~44% slower
Process JnanomsgIpc Records: 61.36/ms => ~42% slower
Network KryonetTcp (loopback) Records: 62.71/ms => ~41% slower
Network HadroNioSocketChannel (loopback) Records: 65.63/ms => ~38% slower
Network AsyncNettySocketNio (loopback) Records: 65.86/ms => ~38% slower (using async handlers for servers)
Network BidiSctpChannel (loopback) Records: 66.23/ms => ~38% slower
Network BidiNativeSctpChannel (loopback) Records: 70.97/ms => ~33% slower
Network BidiHadroNioSocketChannel (loopback) Records: 71.77/ms => ~33% slower
Network AsyncMinaDatagramNio (loopback) Records: 73.88/ms => ~31% slower (unreliable; using async handlers for servers)
Network AsyncNettyDatagramOio (loopback) Records: 74.31/ms => ~30% slower (unreliable; using async handlers for servers)
Network AsyncNettySocketHadroNio (loopback) Records: 74.81/ms => ~30% slower (using async handlers for servers)
Network AsyncNettyDatagramNio (loopback) Records: 75.58/ms => ~29% slower (unreliable; using async handlers for servers)
Network BlockingBidiDisniActive (SoftiWarp) Records: 78.47/ms => ~26% slower
Network SctpChannel (loopback) Records: 81.78/ms => ~23% slower
Network BlockingDisniActive (SoftiWarp) Records: 84.22/ms => ~21% slower
Network BlockingDisniActive (SoftRoCE) Records: 84.76/ms => ~20% slower (unreliable, crashes after ~1 million records)
Network BidiNettySocketHadroNio (SoftRoCE) Records: 84.88/ms => ~20% slower (unreliable)
Network NativeSctpChannel (loopback) Records: 86.52/ms => ~19% slower
Network BidiNativeMinaSctpApr (loopback) Records: 87.81/ms => ~18% slower
Network KryonetUdp (loopback) Records: 87.85/ms => ~18% slower
Network BlockingBidiDisniActive (SoftRoCE) Records: 88.53/ms => ~17% slower (unreliable, crashes after ~1 million records)
Network NettySocketEpoll (loopback) Records: 91.64/ms => ~14% slower
Network NettySocketHadroNio (SoftRoCE) Records: 95.12/ms => ~11% slower (unreliable, crashes after ~500k records)
Network JucxTag (loopback, PEH) Records: 97.50/ms => ~8% slower (PeerErrorHandling, supports Infiniband and other specialized hardware)
Network BidiJucxTag (loopback, PEH) Records: 98.78/ms => ~7% slower (PeerErrorHandling)
Network JucxStream (loopback, PEH) Records: 99.88/ms => ~6% slower (PeerErrorHandling)
Network BidiJucxStream (loopback, PEH) Records: 101.25/ms => ~5% slower (PeerErrorHandling)
Network BidiHadroNioSocketChannel (SoftRoCE) Records: 104.23/ms => ~2% slower (unreliable)
Network HadroNioSocketChannel (SoftRoCE) Records: 105.47/ms => ~1% slower (unreliable)
Network BlockingNioSocket (loopback) Records: 106.51/ms => using this as baseline
Network AsyncNettySocketHadroNio (SoftRoCE) Records: 109.32/ms => ~3% faster (unreliable)
Network BlockingHadroNioSocket (loopback) Records: 112.12/ms => ~5% faster
Network BidiNettySocketEpoll (loopback) Records: 114.18/ms => ~7% faster
Network BidiNettyDatagramEpoll (loopback) Records: 114.47/ms => ~7% faster (unreliable)
Network EnxioSocketChannel (loopback) Records: 115.72/ms => ~9% faster
Thread JnanomsgInproc Records: 117.81/ms => ~11% faster
Network NettyDatagramEpoll (loopback) Records: 119.89/ms => ~13% faster (unreliable)
Network BlockingNioDatagramSocket (loopback) Records: 124.49/ms => ~17% faster (unreliable)
Network BidiBlockingNioDatagramSocket (loopback) Records: 127.86/ms => ~20% faster (unreliable)
Thread CzmqInproc Records: 132.52/ms => ~24% faster
Network NioSocketChannel (loopback) Records: 136.40/ms => ~28% faster
Network ChronicleNetwork (loopback) Records: 136.54/ms => ~28% faster (using UnsafeFastJ8SocketChannel)
Network NativeNettySocketEpoll (loopback) Records: 137.21/ms => ~29% faster
Network JucxStream (SoftRoCE, PEH) Records: 137.22/ms => ~29% faster (PeerErrorHandling, unreliable, crashes after ~5 million records)
Network NativeSocketChannel (loopback) Records: 138.75/ms => ~30% faster (faster than NativeNetty because no JNI?)
Network AsyncNettyDatagramEpoll (loopback) Records: 139.22/ms => ~31% faster (unreliable; using async handlers for servers)
Network BidiDisniActive (SoftiWarp) Records: 146.68/ms => ~38% faster
Network BidiDisniPassive (SoftiWarp) Records: 150.68/ms => ~41% faster
Network AsyncNettySocketEpoll (loopback) Records: 153.22/ms => ~44% faster (using async handlers for servers)
Network BidiBlockingHadroNioSocket (loopback) Records: 155.16/ms => ~46% faster
Network BidiBlockingNioSocket (loopback) Records: 158.88/ms => ~49% faster
Process ChronicleQueue Records: 164.71/ms => ~55% faster
Process ChronicleQueue (tmpfs) Records: 164.79/ms => ~55% faster
Network DisniActive (SoftiWarp) Records: 182.08/ms => ~71% faster
Network DisniPassive (SoftiWarp) Records: 186.40/ms => ~75% faster
Network BidiJucxStream (SoftRoCE, PEH) Records: 189.48/ms => ~78% faster (PeerErrorHandling, unreliable, crashes after ~5 million records)
Network BidiNativeMinaSocketApr (loopback) Records: 200.83/ms => ~88% faster
Network BidiJucxTag (SoftRoCE, PEH) Records: 203.85/ms => ~91% faster (PeerErrorHandling, unreliable, crashes after ~5 million records)
Network BidiEnxioSocketChannel (loopback) Records: 205.65/ms => ~93% faster
Network BidiNioSocketChannel (loopback) Records: 208.31/ms => ~95% faster
Network BidiChronicleNetwork (loopback) Records: 212.36/ms => ~99% faster (using UnsafeFastJ8SocketChannel)
Network* BidiNativeSocketChannel (loopback) Records: 212.72/ms => ~99% faster
Network JucxTag (SoftRoCE, PEH) Records: 214.42/ms => ~2 times as fast (PeerErrorHandling, unreliable, crashes after ~5 million records)
Network* AeronUDP (loopback) Records: 216.57/ms => ~2 times as fast (reliable and supports multicast)
Network BidiDisniActive (SoftRoCE) Records: 232.49/ms => ~2.2 times as fast (unreliable, crashes after ~1 million records)
Network DisniActive (SoftRoCE) Records: 243.85/ms => ~2.3 times as fast (unreliable, crashes after ~1 million records)
Network BidiDisniPassive (SoftRoCE) Records: 245.15/ms => ~2.3 times as fast (unreliable, crashes after ~1 million records)
Network BidiNativeNettyDatagramEpoll (loopback) Records: 259.74/ms => ~2.4 times as fast (unreliable)
Network NativeNettyDatagramEpoll (loopback) Records: 263.63/ms => ~2.5 times as fast (unreliable)
Network BidiNioDatagramChannel (loopback) Records: 277.80/ms => ~2.6 times as fast (unreliable)
Network NioDatagramChannel (loopback) Records: 286.81/ms => ~2.7 times as fast (unreliable)
Network BidiNativeDatagramChannel (loopback) Records: 294.35/ms => ~2.8 times as fast (unreliable; faster than NativeNetty because no JNI?)
Thread BidiMinaVmPipe Records: 298.17/ms => ~2.8 times as fast
Network MinaNativeDatagramChannel (loopback) Records: 306.79/ms => ~2.9 times as fast (unreliable)
Network NativeDatagramChannel (loopback) Records: 308.10/ms => ~2.9 times as fast (unreliable; faster than NativeNetty because no JNI?)
Network DisniPassive (SoftRoCE) Records: 320.18/ms => ~3 times as fast (unreliable, crashes after ~1 million records)
Thread JeromqInproc Records: 323.88/ms => ~3 times as fast
Process NamedPipe (Streaming) Records: 351.87/ms => ~3.3 times as fast
Thread MinaVmPipe Records: 372.10/ms => ~3.5 times as fast
Process UnixDomainSocket Records: 429.44/ms => ~4 times as fast (requires Java 16 for NIO support)
Process NativeUnixDomainSocket Records: 433.28/ms => ~4.1 times as fast (requires Java 16 for NIO support)
Process BidiUnixDomainSocket Records: 451.21/ms => ~4.2 times as fast (requires Java 16 for NIO support)
Process BidiNativeUnixDomainSocket Records: 460.59/ms => ~4.3 times as fast (requires Java 16 for NIO support)
Process NamedPipe (FileChannel) Records: 528.44/ms => ~5 times as fast
Network BidiJucxTag (noPEH) Records: 536.55/ms => ~5 times as fast (NoPeerErrorHandling, most likely using shared memory)
Network JucxStream (noPEH) Records: 538.19/ms => ~5 times as fast (NoPeerErrorHandling, most likely using shared memory)
Network JucxTag (noPEH) Records: 539.70/ms => ~5.1 times as fast (NoPeerErrorHandling, most likely using shared memory)
Process MPJExpress (shared memory) Records: 550.26/ms => ~5.2 times as fast
Network BidiJucxStream (noPEH) Records: 553.75/ms => ~5.2 times as fast (NoPeerErrorHandling, most likely using shared memory)
Process NamedPipe (Native) Records: 603.46/ms => ~5.7 times as fast
Thread LockedReference Records: 931.74/ms => ~8.7 times as fast
Process OpenMPI (shared memory) Records: 1,024.63/ms => ~9.6 times as fast (supports Infiniband and other specialized hardware)
Process Jocket Records: 1,204.82/ms => ~11.3 times as fast (unstable; deadlocks after 2-3 million messages; their tests show ~1,792.11/ms which would be ~16.8 times as fast; had to test on Java 8)
Thread LinkedBlockingDeque Records: 1,520.45/ms => ~14.3 times as fast
Thread ArrayBlockingQueue Records: 1,535.72/ms => ~14.4 times as fast
Thread LinkedBlockingQueue Records: 1,716.12/ms => ~16.1 times as fast
Thread ArrayDeque (synced) Records: 1,754.14/ms => ~16.4 times as fast
Thread SynchronousQueue Records: 2,207.46/ms => ~20.7 times as fast
Thread SynchronizedReference Records: 2,240.90/ms => ~21 times as fast
Thread LmaxDisruptorQueue Records: 2,710.39/ms => ~25.4 times as fast
Thread LinkedTransferQueue Records: 2,812.62/ms => ~26.4 times as fast
Thread ConcurrentLinkedDeque Records: 2,844.95/ms => ~26.7 times as fast
Process Blocking Mapped Memory Records: 3,140.80/ms => ~29.5 times as fast
Process Blocking Mapped Memory (tmpfs) Records: 3,229.97/ms => ~30.3 times as fast
Thread LmaxDisruptor Records: 3,250.13/ms => ~30.5 times as fast
Thread ConversantDisruptorConcurrent Records: 3,389.03/ms => ~31.8 times as fast
Thread ConversantDisruptorBlocking Records: 3,498.95/ms => ~32.85 times as fast
Thread AgronaManyToOneRingBuffer Records: 3,510.74/ms => ~32.96 times as fast
Thread ConversantPushPullBlocking Records: 3,628.18/ms => ~34 times as fast
Thread JctoolsSpscLinkedAtomic Records: 3,762.94/ms => ~35.3 times as fast
Thread ConversantPushPullConcurrent Records: 3,777.29/ms => ~35.5 times as fast
Thread JctoolsSpscLinked Records: 3,885.46/ms => ~36.5 times as fast
Thread AtomicReference Records: 3,964.79/ms => ~37.2 times as fast
Thread NettySpscQueue Records: 3,995.21/ms => ~37.5 times as fast
Thread NettyMpscQueue Records: 4,086.30/ms => ~38.4 times as fast
Thread NettyFixedMpscQueue Records: 4,141.13/ms => ~38.88 times as fast
Thread AgronaManyToOneQueue Records: 4,178.33/ms => ~39.2 times as fast
Process AeronIPC Records: 4,209.11/ms => ~39.5 times as fast
Thread AgronaBroadcast Records: 4,220.66/ms => ~39.6 times as fast
Thread AgronaManyToManyQueue Records: 4,235.49/ms => ~39.8 times as fast
Thread JctoolsSpscArray Records: 4,337.64/ms => ~40.7 times as fast
Thread JctoolsSpscAtomicArray Records: 4,368.72/ms => ~41 times as fast
Thread* AgronaOneToOneQueue Records: 4,466.68/ms => ~42 times as fast (works with plain objects)
Thread VolatileReference Records: 4,733.50/ms => ~44.4 times as fast
Thread AgronaOneToOneRingBuffer (SafeCopy) Records: 4,831.85/ms => ~45.3 times as fast
Thread AgronaOneToOneRingBuffer (ZeroCopy) Records: 4,893.33/ms => ~45.9 times as fast
Process Mapped Memory Records: 6,521.46/ms => ~61.2 times as fast
Process* Mapped Memory (tmpfs) Records: 6,711.41/ms => ~63 times as fast
Encryption (StreamEncryptionChannelFactory for e.g. AES) and verification (StreamVerifiedEncryptionChannelFactory for e.g. AES+HMAC; StreamVerifiedChannelFactory for checksums, digests, macs, or signatures without encryption) can be added to the communication. There is also a HandshakeChannelFactory with providers for secure key exchanges using DH, ECDH, JPake, or SRP6 to negotiate the encryption and verification channels automatically. Additionally there is a TLS and DTLS provider using SSLEngine (JDK and Netty(-TcNative)) that can be used with any underlying transport (not only TCP or UDP). Here are some benchmarks with various security providers (2022, Core i9-12900HX with SSD, Java 17; handshake not included in the measured time; measuring records or round trips per millisecond, so multiply by 2 to get the messages per millisecond):

| | JDK17 | Conscrypt | AmazonCorretto | BouncyCastle | WildflyOpenSSL | Netty-TcNative | Commons-Crypto |
---|---|---|---|---|---|---|---|
BidiNativeSocketChannel (loopback, baseline) | 366.90/ms | - | - | - | - | - | - |
AES128CTR | 294.94/ms | 262.69/ms | 299.63/ms | 185.91/ms | - | - | - |
AES256CTR | 288.61/ms | 260.01/ms | 298.26/ms | 172.17/ms | - | - | - |
AES128CTR+HMAC256 | 223.90/ms | 87.72/ms | 208.22/ms | 109.66/ms | - | - | - |
AES256CTR+HMAC256 | 219.93/ms | 88.50/ms | 211.86/ms | 106.32/ms | - | - | - |
AES128CTR+HMAC512 | 146.79/ms | 61.19/ms | 139.74/ms | 94.58/ms | - | - | - |
AES256CTR+HMAC512 | 147.28/ms | 60.66/ms | 140.93/ms | 90.18/ms | - | - | - |
AES128+GCM | 215.47/ms | 203.45/ms | Input too short - need tag | 149.60/ms | - | - | - |
AES256+GCM | 210.28/ms | 196.47/ms | Input too short - need tag | 135.45/ms | - | - | - |
TLS3.0 | 181.08/ms | Garbled Bytes | PKIX Validation Failed | 108.52/ms | 101.24/ms | Conscrypt Garbled Bytes; OpenSSL Binding Outdated | OpenSSL Binding Outdated |
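
For reference, the AES+HMAC rows above correspond conceptually to an encrypt-then-MAC scheme like the following plain JCE sketch (this is not the module's API, just the underlying primitives):

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public final class EncryptThenMacExample {
    public static void main(final String[] args) throws Exception {
        final KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        final SecretKey aesKey = keyGen.generateKey();
        final byte[] macKey = new byte[32];
        final byte[] iv = new byte[16];
        final SecureRandom random = new SecureRandom();
        random.nextBytes(macKey);
        random.nextBytes(iv);

        // encrypt the payload with AES in CTR mode
        final Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, aesKey, new IvParameterSpec(iv));
        final byte[] encrypted = cipher.doFinal("hello channel".getBytes(StandardCharsets.UTF_8));

        // then authenticate the ciphertext with HMAC-SHA256 (encrypt-then-MAC)
        final Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        final byte[] signature = mac.doFinal(encrypted);

        System.out.println(encrypted.length + " ciphertext bytes, " + signature.length + " MAC bytes");
    }
}
```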
SECURITY: You can apply extended security features by utilizing the invesdwin-context-security-web module, which allows you to define your own rules as described in the spring-security namespace configuration documentation. Just define your rules in an <http> tag for a specific context path in your own custom spring context xml and register it for the invesdwin application bootstrap to be loaded via an IContextLocation bean, as normally done.
DEPLOYMENT: The above context paths are the defaults for the embedded web server as provided by the invesdwin-context-webserver
module. If you do not want to deploy your web services, servlets or web applications in an embedded web server, you can create distributions that repackage them as WAR files (modifications via maven-overlays or just adding your own web.xml) to deploy in a different server under a specific context path. Or you could even repackage them as an EAR file and deploy them in your application server of choice. Or if you like you could roll your own alternative embedded web server that meets your requirements. Just ask if you need help with any of this.
If you need further assistance or have some ideas for improvements and don't want to create an issue here on GitHub, feel free to start a discussion in our invesdwin-platform mailing list.