Open shoffmeister opened 2 months ago
Thanks @shoffmeister. I believe some of the overhead is from going through the DuckDB JDBC driver. I'll try to look for perf optimizations when I get a chance.
FWIW, I just noticed incorrect reporting from me: this was a single-broker, 12-partition configuration. The kwack process was limited to consuming about 120% CPU (while the broker was bored).

I'll run this against a multi-broker cluster and see whether this would scale out to 300%-ish CPU (which would then use 3 of the available 16 CPUs).
Given a three-node Kafka cluster in KRaft mode, official Apache Kafka 3.8 "native" images (i.e. GraalVM), the performance characteristics of kwack do not change.

On the kwack process, CPU maxes out at 120%. The Kafka brokers themselves are very bored. My (virtual) box has excess physical memory left.

Screenshot from running btop:

Partial screenshot from visualvm:
The current implementation does

```java
sql = "INSERT INTO '" + topic + "' VALUES (" + String.join(",", paramMarkers) + ")";
PreparedStatement stmt = stmts.computeIfAbsent(sql, s -> {
    // ... (prepares and caches one PreparedStatement per distinct INSERT)
});
```
Perhaps the Appender could be useful - see https://duckdb.org/docs/api/java.html#appender
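For illustration, a minimal sketch of what an Appender-based insert path could look like with the DuckDB JDBC driver; the table name and columns here are made up, not kwack's actual schema:

```java
import java.sql.DriverManager;
import java.sql.Statement;

import org.duckdb.DuckDBAppender;
import org.duckdb.DuckDBConnection;

// Minimal sketch of the DuckDB JDBC Appender; table name and columns
// are illustrative assumptions, not kwack's actual schema.
public class AppenderSketch {
    public static void main(String[] args) throws Exception {
        try (DuckDBConnection conn =
                (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:")) {
            try (Statement s = conn.createStatement()) {
                s.execute("CREATE TABLE records (key VARCHAR, value VARCHAR)");
            }
            // One appender per table; rows go straight into DuckDB's buffers,
            // bypassing per-row SQL parsing and parameter binding.
            try (DuckDBAppender appender =
                    conn.createAppender(DuckDBConnection.DEFAULT_SCHEMA, "records")) {
                for (int i = 0; i < 1_500_000; i++) {
                    appender.beginRow();
                    appender.append("key-" + i);
                    appender.append("value-" + i);
                    appender.endRow();
                }
            } // close() flushes any remaining buffered rows
        }
    }
}
```

The point of the Appender is that it skips the per-row INSERT machinery entirely, which is where the prepared-statement path above spends much of its time.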
I have created a very simple Python script which consumes my experimental topic (see above) with maximum concurrency. Using that script, dumping (key, value) of those 1.5 million Kafka records into DuckDB rows wholesale as raw strings takes 17 seconds.
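The experiment itself was in Python; a rough Java/JDBC equivalent of that raw-string insert, assuming standard JDBC batching in the DuckDB driver and an illustrative table layout and batch size, might look like:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

// Rough Java/JDBC equivalent of the Python experiment: insert (key, value)
// pairs wholesale as raw strings, batching to amortize per-statement cost.
// Table layout, record contents, and batch size are assumptions.
public class RawInsertSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:")) {
            try (Statement s = conn.createStatement()) {
                s.execute("CREATE TABLE raw (key VARCHAR, value VARCHAR)");
            }
            try (PreparedStatement ps =
                    conn.prepareStatement("INSERT INTO raw VALUES (?, ?)")) {
                for (int i = 0; i < 1_500_000; i++) {
                    ps.setString(1, "key-" + i);
                    ps.setString(2, "{\"some_field\": " + i + "}");
                    ps.addBatch();
                    if (i % 10_000 == 0) {
                        ps.executeBatch(); // flush a batch of buffered rows
                    }
                }
                ps.executeBatch();
            }
        }
    }
}
```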
I would guess that performance functionally comparable to what kwack does could go up to 25 seconds. The increase would be due to parsing value as JSON, and pumping all record data into a fitting table structure.
Case in point: it seems as if table insert performance in DuckDB is dominated by the amount of data written. Converting to JSON and then inserting only a single field from that JSON comes in at 15 seconds, two seconds faster than the raw string insert (the raw string has about 5100 characters).
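The script did that conversion in the Python client; an alternative way to express the same single-field experiment, sketched here assuming the raw table from the previous snippet and a hypothetical field name, is to let DuckDB do the extraction:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical variant of the single-field experiment: keep the raw value
// strings in DuckDB and let json_extract_string pull out one field.
// Table and field names are assumptions, not kwack's schema.
public class JsonExtractSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
             Statement s = conn.createStatement()) {
            s.execute("CREATE TABLE raw (key VARCHAR, value VARCHAR)");
            s.execute("INSERT INTO raw VALUES ('k1', '{\"some_field\": \"v1\"}')");
            // Materialize a narrow table holding only the extracted field.
            s.execute("CREATE TABLE slim AS SELECT key, "
                + "json_extract_string(value, '$.some_field') AS some_field "
                + "FROM raw");
        }
    }
}
```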
FWIW, this is meant as a very naïve sanity check on performance potential, not meant to criticize kwack.
The Appender functionality exposed by the JDBC driver is brutally fast.
Two challenges:

- support for complex types, struct et al.

Final update for now, here: I think the current public wisdom on Appender is collected around https://discord.com/channels/909674491309850675/1148659944669851849/1284527414524772384
https://sourcegraph.com/github.com/duckdb/duckdb/-/blob/test/api/capi/test_capi_data_chunk.cpp is the best documentation available, and https://github.com/Giorgi/DuckDB.NET/blob/develop/DuckDB.NET.Data/DuckDBAppender.cs / https://github.com/Giorgi/DuckDB.NET/blob/develop/DuckDB.NET.Data/Internal/Reader/StructVectorDataReader.cs show how data chunking can be done.

Fundamentally, a bit of uncharted territory :)
Thanks @shoffmeister. There is another issue which might impede progress: it seems that the kwack tests hang when upgrading to 1.1.0. I've narrowed it down to a change in https://github.com/duckdb/duckdb-java/commit/55a5d7bc57f0a6894f8bb7b31084a74c9b42a34e (the previous commit, https://github.com/duckdb/duckdb-java/commit/53fdd8396e3fbd539ee99865e4ebf912545c3d99, works fine) but have not gotten any further.
The deadlock causing the kwack tests to hang has been identified as https://github.com/duckdb/duckdb-java/issues/101
I am experimenting with a single-partition Kafka topic, on a local Kafka broker, containing synthetic test data (following a complex schema) at the scale of 1.5 million records.
I notice a large performance difference between

- using kwack to ingest from the Kafka topic, given a simple JSON Schema ("kwack"), and
- dumping the topic content to a file, followed by loading it with DuckDB's read_json_auto ("read_json_auto").

Results: the read_json_auto path is dramatically faster (at 6.6 GB RSS). Is that the "price to pay" for native Kafka interconnect?
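For context, a minimal sketch of what the "read_json_auto" side of this comparison could look like through the same JDBC driver, assuming the topic has already been dumped to a newline-delimited JSON file (records.jsonl is an assumed name):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch of the "read_json_auto" scenario: the Kafka topic has already been
// dumped to records.jsonl (assumed name), one JSON document per line.
public class ReadJsonAutoSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
             Statement s = conn.createStatement()) {
            // DuckDB infers the table schema from the JSON documents themselves.
            s.execute("CREATE TABLE records AS "
                + "SELECT * FROM read_json_auto('records.jsonl')");
        }
    }
}
```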