tuor713 / trino-kdb

Trino plugin for kdb+
Apache License 2.0

Trino Plugin for kdb+

A Trino plugin for kdb+, currently in beta. Feedback (bug reports, feature requests, questions) is welcome!

Sample catalog definition:

connector.name=kdb
kdb.host=localhost
kdb.port=8000

Features

The plugin currently supports:

Fine Print

Configuration Options

Settings that can be used in the catalog file:

Config                                 Description
kdb.host                               Hostname of the KDB server
kdb.port                               Port of the KDB server
kdb.user                               (Optional) User for authenticating with the KDB server
kdb.password                           (Optional) Password for authenticating with the KDB server
page.size                              (Optional) Page size, in number of rows, for data retrieved from KDB (default: 50,000)
use.stats                              (Optional) Use stats for KDB, either pre-generated or calculated on the fly (see dynamic.stats) (default: true)
dynamic.stats                          (Optional) Generate stats on the fly. Note that this can slow down query planning for large tables (default: false)
kdb.metadata.refresh.interval.seconds  (Optional) Refresh interval, in seconds, for KDB metadata (default: 3600 = 1 hour)
push.down.aggregation                  (Optional) Enable aggregation push down (default: true)
virtual.tables                         (Optional) Treat all tables as virtual, i.e. not supporting features such as direct select [x] queries (default: false)
insert.function                        (Optional) Function used to insert data into KDB tables (default: insert)
push.down.like                         (Optional, experimental) Push down like filters (default: false)
kdb.extra.credential.user              (Optional) Extra credential key for session level credentials: user
kdb.extra.credential.password          (Optional) Extra credential key for session level credentials: password
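
As a rough sketch of how the optional settings combine with the required ones, the shell snippet below writes a fuller catalog file. It assumes Trino's usual etc/catalog layout, where the file name (kdb.properties here) becomes the catalog name. CATALOG_DIR is a hypothetical variable, and the option values are illustrative only, not recommendations.

```shell
# Assumption: Trino reads catalog files from etc/catalog under its install directory.
# CATALOG_DIR is a hypothetical variable; point it at your own etc/catalog.
CATALOG_DIR="${CATALOG_DIR:-./etc/catalog}"
mkdir -p "$CATALOG_DIR"

# The file name (kdb.properties) determines the catalog name (kdb).
cat > "$CATALOG_DIR/kdb.properties" <<'EOF'
connector.name=kdb
kdb.host=localhost
kdb.port=8000
# Optional settings; the values below are examples only
page.size=10000
use.stats=true
dynamic.stats=false
kdb.metadata.refresh.interval.seconds=600
push.down.aggregation=true
EOF
```

In a typical static-catalog setup, Trino has to be restarted before it picks up a new or changed catalog file.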

Session Property overrides

Property               Description
push_down_aggregation  Session override for catalog property push.down.aggregation
use_stats              Session override for catalog property use.stats
dynamic_stats          Session override for catalog property dynamic.stats
page_size              Session override for catalog property page.size
virtual_tables         Session override for catalog property virtual.tables
insert_function        Session override for catalog property insert.function
push_down_like         Session override for catalog property push.down.like
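
A minimal sketch of applying such overrides, assuming the catalog is named kdb and that the properties take the integer and boolean values shown: in Trino, catalog session properties are set with SET SESSION and addressed as catalog.property. The snippet writes the statements to a file. SQL_FILE is a hypothetical variable, and the values are examples only.

```shell
# Assumption: the catalog is named kdb (i.e. defined in kdb.properties).
# SQL_FILE is a hypothetical variable naming the file to write.
SQL_FILE="${SQL_FILE:-./kdb_session.sql}"

# Catalog session properties are addressed as <catalog>.<property>.
cat > "$SQL_FILE" <<'EOF'
SET SESSION kdb.page_size = 10000;
SET SESSION kdb.push_down_aggregation = false;
SET SESSION kdb.use_stats = true;
SHOW SESSION LIKE 'kdb.%';
EOF

# With the Trino CLI installed and a server running, the file could be run as:
#   trino --file "$SQL_FILE"
```

The overrides last only for the session in which they are set. The catalog file values remain the defaults for every other session.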

Pre-Generated Stats

Stats generation is still quite raw and can take an extremely long time for partitioned tables. As an alternative, stats can be precomputed and stored in two tables:

.trino.stats:([table: `symbol$()] rowcount: `long$())

.trino.colstats:([table: `symbol$(); column: `symbol$()] 
  distinct_count: `long$(); 
  null_fraction: `double$();
  size: `long$();
  min_value: `double$();
  max_value: `double$())

Building

This library depends on javakdb, which can be built from its GitHub sources. To build the plugin:

mvn package

Copy the resulting shaded jar into a kdb subdirectory of Trino's plugin directory (${TRINO_HOME}/plugin/kdb/ in a default installation).

The unit tests currently require a local instance of KDB running at port 8000.