InfluxDB is a high-performance time-series database. You need a Java driver to match.
When jamming tens-of-thousands of metrics into InfluxDB per minute, you can't afford Stop-The-World garbage collection times reaching into the hundreds (or thousands) of milliseconds. And you can't afford to senselessly burn CPU cycles.
influx4j is wickedly fast. 10 times faster than the official driver. influx4j is a CPU miser. 10 times less CPU consumption persisting a datapoint than the official driver. influx4j generates ZERO garbage from Point to Protocol. Infinitely less garbage than the official driver.
As measured by the JMH benchmark, included in this project, comparing influx4j with the official driver, Point-to-protocol ...
Driver | Points Produced (approx.) | Points/ms (approx.) | Garbage Produced | Avg Garbage Creation Rate | G1 Garbage Collections |
---|---|---|---|---|---|
influx4j | 192 million | 4267 | zero | zero | zero |
influxdb-java | 18 million | 406 | 334.64 GB | 6.17 GB/sec | 766 |
Zero garbage means the JVM interrupts your performance-critical code less. [1] The extreme efficiency of the Point-to-protocol buffer serialization pipeline means you burn 10x less CPU producing the same number of points compared to the official driver.
[1] Note: While influx4j generates zero garbage, your application and its associated libraries likely generate garbage that will still require collection.
PointFactory
Towards the goal of zero-garbage, influx4j employs a pooling scheme for Point instances, such that Point objects are recycled within the system. This pool is contained within a factory for producing Points: PointFactory.
The first thing your application will need to do is to configure and create a PointFactory. The only configuration options are the initial size of the pool and the maximum size of the pool.
Your application can create multiple PointFactory instances, or a singleton one; it's up to you. All methods on the PointFactory are thread-safe, so no additional synchronization is required.
A PointFactory with a default initial size of 128 Point objects and maximum size of 512 Point objects can be constructed like so:
PointFactory pointFactory = PointFactory.builder().build();
And here is a PointFactory created with custom configuration:
PointFactory pointFactory =
    PointFactory.builder()
        .initialSize(1000)
        .maximumSize(8000)
        .build();
The maximumSize should be tuned to be somewhat larger than the maximum number of points your application generates per second, assuming the default connection "auto-flush" interval of one second.
The total memory consumed by the pool will be determined by the "high water mark" of usage. Keep this in mind when setting the maximumSize. You can force the pool to empty by calling the flush() method, but doing so turns the pooled contents into garbage.
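The sizing advice above can be turned into a rough rule of thumb. The sketch below is only an illustration: the class name, the method name, and the headroom factor are hypothetical and are not part of influx4j. It assumes that the number of Points in flight is roughly the peak insertion rate multiplied by the auto-flush period.

```java
/** Hypothetical helper for estimating a pool size; not part of influx4j. */
final class PoolSizingSketch {

    /**
     * Estimates a maximumSize from the peak insertion rate and the auto-flush period.
     *
     * @param peakPointsPerSecond highest expected number of points created per second
     * @param autoFlushPeriodMs   auto-flush period of the connection, in milliseconds
     * @param headroom            safety factor, e.g. 1.25 for 25% extra capacity
     */
    static int suggestMaximumSize(long peakPointsPerSecond, long autoFlushPeriodMs, double headroom) {
        // Points created between two flushes are all "in flight" at the same time.
        double pointsPerFlush = peakPointsPerSecond * (autoFlushPeriodMs / 1000.0);
        return (int) Math.ceil(pointsPerFlush * headroom);
    }

    public static void main(String[] args) {
        // 6000 points/sec with the default one-second auto-flush and 25% headroom.
        System.out.println(suggestMaximumSize(6000, 1000, 1.25));
    }
}
```

The result could then be passed to maximumSize(...) on the PointFactory builder; treat it as a starting point for tuning rather than a guarantee.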
- If the internal pool is empty when a Point is requested, a new Point object will be allocated.
- If the pool is already at its maximum size when a Point is returned, that Point will be discarded for garbage collection. Therefore, in order to avoid garbage generation, the maximum size should be set based on your application's insertion rate and the configured auto-flush rate (see below).
- The pool can be emptied by calling the flush() method on the PointFactory instance, but it is not recommended.

You can obtain Points from the PointFactory that you simply throw away, without damaging the pool. For example, if your code may throw an exception after creating a Point, but before persisting it, you need not worry about recycling the Point via try-finally logic etc. Just don't make a habit of casually throwing away Points; after all, decreasing garbage is one of the goals of the library.
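To make the pool behavior concrete, here is a minimal sketch of a bounded object pool with the same two properties: allocation when the pool is empty, and discarding when it is full. BoundedPoolSketch and its methods are hypothetical names used for illustration only; this is not influx4j's actual implementation, which may differ considerably.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical bounded object pool; an illustration, not influx4j's implementation. */
final class BoundedPoolSketch<T> {
    private final ArrayDeque<T> idle = new ArrayDeque<>();
    private final int maximumSize;
    private final Supplier<T> allocator;

    BoundedPoolSketch(int maximumSize, Supplier<T> allocator) {
        this.maximumSize = maximumSize;
        this.allocator = allocator;
    }

    /** Reuses an idle object if one exists, otherwise allocates a new one. */
    synchronized T acquire() {
        T pooled = idle.pollFirst();
        return (pooled != null) ? pooled : allocator.get();
    }

    /** Returns true if the object was retained, false if the pool was full and it was dropped. */
    synchronized boolean release(T object) {
        if (idle.size() >= maximumSize) {
            return false; // not retained: the object is left for the garbage collector
        }
        idle.addFirst(object);
        return true;
    }

    /** Number of objects currently waiting in the pool. */
    synchronized int idleCount() {
        return idle.size();
    }
}
```

An object that is acquired and never released simply never re-enters the pool, which is why throwing a Point away does no damage beyond the garbage it creates.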
Point
Once you have a PointFactory instance, you are ready to create Point instances to persist. The Point class implements a builder-like pattern.
Example of creating a Point for a measurement named "consumerPoll123":
PointFactory pointFactory = ...
Point point = pointFactory
    .createPoint("consumerPoll123")
    .tag("fruit", "apple")
    .field("yummy", true)
    .field("score", 9.5d)
    .timestamp();
The timestamp can also be specified explicitly:
Point point = pointFactory
    .createPoint("consumerPoll123")
    .tag("fruit", "banana")
    .field("yummy", false)
    .field("score", 5.0d)
    .timestamp(submissionTS, TimeUnit.MILLISECONDS);
Note that while a TimeUnit may be specified on the Point, the ultimate precision of the persisted timestamp will be determined by the precision specified in the connection information (see below for details about connection parameters). The TimeUnit specified on the Point timestamp will automatically be converted to the precision of the connection.
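As an illustration of what such a unit conversion involves, the sketch below uses the standard java.util.concurrent.TimeUnit.convert method. PrecisionSketch and toConnectionPrecision are hypothetical names, and whether influx4j converts timestamps in exactly this way (for example, truncating rather than rounding) is an assumption, not something stated here.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of timestamp precision conversion; not influx4j code. */
final class PrecisionSketch {

    /** Converts a timestamp from the unit given on the Point to the connection's unit. */
    static long toConnectionPrecision(long timestamp, TimeUnit pointUnit, TimeUnit connectionUnit) {
        // TimeUnit.convert truncates when going from a finer to a coarser unit.
        return connectionUnit.convert(timestamp, pointUnit);
    }

    public static void main(String[] args) {
        long submissionTS = 1_500L; // milliseconds
        System.out.println(toConnectionPrecision(submissionTS, TimeUnit.MILLISECONDS, TimeUnit.SECONDS));
        System.out.println(toConnectionPrecision(submissionTS, TimeUnit.MILLISECONDS, TimeUnit.NANOSECONDS));
    }
}
```

The practical consequence is that a coarse connection precision discards sub-unit detail: a millisecond timestamp written through a SECOND-precision connection keeps only whole seconds.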
Point Accessors
Point contains field() methods for the following Java types: String, Long, Double, Boolean. Tag values, as per the InfluxDB specification, must be strings.
Point also contains read accessors, such as String stringField(String field), but it is important to note that influx4j is optimized for write performance, and there is overhead involved in these field accessors due to a linear (O(n)) scan of the relevant field type.
If the number of fields of a given type is small, the overhead will not be too great. Also, if only some fields need to be read from a Point before insertion, then you can improve performance by adding those fields first, ensuring that they will be among the first to be scanned by the linear search.
This linear scan behavior is also true of the String tag(String tagName) accessor.
If the order of fields is always consistent, you can eliminate read-accessor overhead by using the accessors that accept an integer index, such as String stringField(int index). This will access the Nth String field -- not the Nth field added to the Point; i.e. the index is specific to the field type.
Lastly, it should be noted that the read-accessors return objects, such as Long, Double, Boolean, etc., because the accessed field may not exist -- and therefore null must be returned. The implication is that the JVM must perform an auto-boxing operation, with the associated overhead (incl. garbage).
Note that the Point class is not involved in the querying of InfluxDB, so the above caveats for read-accessors only apply to points that will be written.
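The difference between name-based and index-based accessors is easier to see in a toy model. The sketch below stores fields in per-type lists, so a name lookup scans only fields of that type and an index refers to the Nth field of that type. FieldStoreSketch is a hypothetical class written for this illustration; it does not reflect how influx4j stores fields internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-type field storage; an illustration, not influx4j's Point. */
final class FieldStoreSketch {
    private final List<String> stringNames = new ArrayList<>();
    private final List<String> stringValues = new ArrayList<>();
    private final List<String> longNames = new ArrayList<>();
    private final List<Long> longValues = new ArrayList<>();

    FieldStoreSketch field(String name, String value) {
        stringNames.add(name);
        stringValues.add(value);
        return this;
    }

    FieldStoreSketch field(String name, long value) {
        longNames.add(name);
        longValues.add(value); // boxed here only to keep the sketch short
        return this;
    }

    /** Linear O(n) scan over the String fields only; returns null if absent. */
    String stringField(String name) {
        int index = stringNames.indexOf(name);
        return (index >= 0) ? stringValues.get(index) : null;
    }

    /** Direct access to the Nth String field, regardless of fields of other types. */
    String stringField(int index) {
        return stringValues.get(index);
    }

    /** Returns a boxed Long so that a missing field can be reported as null. */
    Long longField(String name) {
        int index = longNames.indexOf(name);
        return (index >= 0) ? longValues.get(index) : null;
    }
}
```

In this model, a String field added after a long field is still String field 0, which mirrors the per-type indexing described above.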
Point Copying
It is quite common to have a set of measurements which share a common set of tags, and which are produced at the same time for insertion into InfluxDB. The Point class provides a copy() method that makes this more efficient, both in terms of execution time and code brevity.
Copying a Point:
Point point1 = pointFactory
    .createPoint("procStats")
    .tag("dataCenter", "Tall Pines")
    .tag("hostId", "web.223")
    .field("cpuUsage", hostCpu)
    .field("memTotal", hostMemTotal)
    .field("memFree", hostMemFree)
    .timestamp();
Point point2 = point1
    .copy("netStats")
    .field("inOctets", hostInOctets)
    .field("outOctets", hostOutOctets);
There are several important things to note about the copy() method:
- The measurement name of the new Point is passed to the copy() method.
- Tags are copied: point2 will also contain the "dataCenter" and "hostId" tags from point1.
- The timestamp of the source Point is copied (retained).
- The copy, point2 in the example above, is a Point like any other, and therefore additional tags and fields may be added, and the timestamp changed/updated via the standard methods.

An instance of InfluxDB represents a connection to the database. Similar to the PointFactory, a Builder is used to configure and create an instance of InfluxDB.
A simple example construction via the Builder is shown here:
InfluxDB influxDB = InfluxDB.builder()
    .setConnection("127.0.0.1", 8086, InfluxDB.Protocol.HTTP)
    .setUsername("mueller")
    .setPassword("gotcha")
    .setDatabase("example")
    .build();
:point_right: Note that while InfluxDB.Protocol.UDP is defined, UDP is currently not supported by the driver.
The following configuration parameters are supported by the InfluxDB.Builder:
:cd: setDatabase(String database)
The name of the InfluxDB database that Point instances will be inserted into.
:bust_in_silhouette: setUsername(String username)
The username used to authenticate to the InfluxDB server.
:key: setPassword(String password)
The password used to authenticate to the InfluxDB server.
:stopwatch: setRetentionPolicy(String retentionPolicy)
The name of the retention policy to use.
:loop: setConsistency(Consistency consistency)
The consistency setting of the connection. One of: InfluxDB.Consistency.ALL, InfluxDB.Consistency.ANY, InfluxDB.Consistency.ONE, InfluxDB.Consistency.QUORUM.
:clock3: setPrecision(Precision precision)
The precision of timestamps persisted through the connection. One of: InfluxDB.Precision.NANOSECOND, InfluxDB.Precision.MICROSECOND, InfluxDB.Precision.MILLISECOND, InfluxDB.Precision.SECOND, InfluxDB.Precision.MINUTE, InfluxDB.Precision.HOUR.
:toilet: setAutoFlushPeriod(long periodMs)
The auto-flush period of the connection. Point objects that are persisted via the write(Point point) method are not written immediately; they are queued for writing asynchronously. The auto-flush period defines how often queued points are written (flushed) to the connection. The default value is one second (1000ms), and the minimum value is 100ms.
setThreadFactory(ThreadFactory threadFactory)
An optional ThreadFactory used to create the auto-flush background thread.
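The interplay between the auto-flush period and the ThreadFactory can be pictured with a small queue-and-timer sketch. Everything in it is hypothetical (the AutoFlushSketch class, its constructor, and the choice to reject periods below 100ms rather than clamp them); it only illustrates the general pattern of queueing writes and flushing them periodically on a background thread, and is not influx4j's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical queue-and-flush connection; an illustration, not influx4j code. */
final class AutoFlushSketch<T> implements AutoCloseable {
    static final long MINIMUM_PERIOD_MS = 100; // minimum period stated in the text above

    private final ConcurrentLinkedQueue<T> pending = new ConcurrentLinkedQueue<>();
    private final Consumer<List<T>> sink;
    private final ScheduledExecutorService scheduler;

    AutoFlushSketch(long periodMs, ThreadFactory threadFactory, Consumer<List<T>> sink) {
        if (periodMs < MINIMUM_PERIOD_MS) {
            // Assumption: this sketch rejects short periods; a real driver might clamp instead.
            throw new IllegalArgumentException("period must be at least " + MINIMUM_PERIOD_MS + "ms");
        }
        this.sink = sink;
        // The supplied ThreadFactory creates the single background flush thread.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(threadFactory);
        this.scheduler.scheduleAtFixedRate(this::flush, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    /** Queues the item; nothing is sent until the next flush. */
    void write(T item) {
        pending.add(item);
    }

    /** Drains everything queued so far and hands it to the sink as one batch. */
    synchronized void flush() {
        List<T> batch = new ArrayList<>();
        T item;
        while ((item = pending.poll()) != null) {
            batch.add(item);
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
    }

    /** Stops the timer and flushes whatever is still queued. */
    @Override
    public void close() {
        scheduler.shutdownNow();
        flush();
    }
}
```

A custom ThreadFactory is typically used to name the flush thread or mark it as a daemon so that it does not keep the JVM alive.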
Point
Writing a Point is simple. There is only one method: write(Point point).
Point point = pointFactory.createPoint("survey")
    .tag("fruit", "apple")
    .field("yummy", true)
    .timestamp();
influxDB.write(point);
See the InsertionTest for example usage, until I have time to write full docs.