An alternative implementation of JDBC Bridge for ClickHouse.
Named Query
Besides data source, you can define named queries in configuration as well.
Named data source - config/datasources/ch.json
{
"ch": {
"type": "jdbc",
"jdbcUrl": "jdbc:clickhouse://localhost/system",
"dataSource": {
"user": "default",
"password": "",
},
"columns": [
{ "name": "instance_id", "type": "Int32", "value": "0", "nullable": false }
],
"parameters": {
"max_rows": 1000,
"fetch_size": 200
}
}
}
Note: type
, columns
and parameters
are optional.
Named query - config/queries/test-query.json
{
"test-query": {
"query": "select * from test_table",
"columns": [
{
"name": "column1",
"type": "UInt32",
"nullable": false
},
{
"name": "column2",
"type": "String",
"nullable": true
},
{
"name": "column3",
"type": "Decimal",
"nullable": true,
"precision": 10,
"scale": 2
}
],
"parameters": {
"max_rows": 10
}
}
}
Note: like named data source, paramters
is not mandatory. columns
is optional too but it's highly recommended, as it prevents runtime type inferring which could be slow.
Saved query - scripts/tests/test-query.sql
select * from test_table
With above configuration setup, you should be able to run the following queries in ClickHouse:
-- adhoc query
select * from jdbc('ch', 'select * from test_table');
-- named query
select * from jdbc('ch', 'test-query');
-- saved query
select * from jdbc('ch', 'scripts/tests/test-query.sql');
Query Parameter
You can put query parameters like max_rows
either in config or in your query.
-- get the first 10 rows(based on above configuration)
select * from jdbc('ch', 'test-query')
-- replace all null values
select * from jdbc('ch?null_as_default=true', 'test-query')
-- get a specific row
select * from jdbc('ch?max_rows=3&offset=2', 'select * from test_table order by column1 desc')
-- retrieve meta data for defining a named query
select * from jdbc('ch?debug=true', 'select * from test_table limit 1')
SRV Record Support
If you're using Consul or any other DNS server with SRV record support, you probably want to use service name instead of hostname/IP and port number combination when defining a datasource. To do that, assuming mysql.service.dc1.consul
is the service name pointing to 127.0.0.1:3306
, you can use any of below format instead of jdbc:mysql://127.0.0.1:3306/test
:
jdbc:mysql://{{ mysql.service.dc1.consul }}/test
jdbc:mysql://{{ host:mysql.service.dc1.consul }}/test
jdbc:mysql://{{ host:mysql.service.dc1.consul }}:{{ port:mysql.service.dc1.consul }}/test
Multiple Types of Data Sources
In addition to JDBC, clickhouse-datasource-bridge
is extensible to support arbitrary data sources.
-- run named query in pre-defined database
select * from jdbc('ch', 'test-query')
-- get list of jobs from Jenkins view
select * from jdbc('jenkins:https://builds.apache.org/', 'jobs')
Based on Vert.x
Eclipse Vert.x is event driven and non blocking. It also makes clickhouse-datasource-bridge
easy to config and scale.
# run with default configuration and JDBC drivers
docker run --rm -it -p 9019:9019 zhicwu/clickhouse-datasource-bridge
# run with custom configuration and JDBC drivers
docker run --rm -v `pwd`/config:/app/config -v `pwd`/drivers:/app/drivers -it -p 9019:9019 zhicwu/clickhouse-datasource-bridge
ClickHouse client version 19.11.8.46 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 19.11.8 revision 54423.
ch-server :) select * from jdbc('ch', 'select 1')
SELECT *
FROM jdbc('ch', 'select 1')
┌─1─┐
│ 1 │
└───┘
1 rows in set. Elapsed: 0.120 sec.
ch-server :) select * from jdbc('jdbc:ch', 'select 1')
SELECT *
FROM jdbc('jdbc:ch', 'select 1')
┌─1─┐
│ 1 │
└───┘
1 rows in set. Elapsed: 0.074 sec.
ch-server :) select * from jdbc('jdbc://ch', 'select 1')
SELECT *
FROM jdbc('jdbc://ch', 'select 1')
┌─1─┐
│ 1 │
└───┘
1 rows in set. Elapsed: 0.084 sec.
ch-server :) select * from jdbc('jdbc://ch?debug=true', 'select 1')
SELECT *
FROM jdbc('jdbc://ch?debug=true', 'select 1')
┌─datasource─┬─type─┬─query────┬─parameters─────────────────────────────────────┐
│ ch │ jdbc │ select 1 │ fetch_size=1000&max_rows=0&offset=0&position=0 │
└────────────┴──────┴──────────┴────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.052 sec.
ch-server :) select * from jdbc('jdbc:clickhouse://localhost/system?user=default&password=', 'select 1')
SELECT *
FROM jdbc('jdbc:clickhouse://localhost/system?user=default&password=', 'select 1')
┌─1─┐
│ 1 │
└───┘
1 rows in set. Elapsed: 0.156 sec.
Below is the directory structure used in docker image. By default, all configuration files must be put under config
directory and JDBC drivers under drivers
.
/app
|
|--- config
| |
| |--- datasources
| | |
| | |--- test-mysql.json
| |
| |--- queries
| | |
| | |--- test-query.json
| |
| |--- httpd.json
| |--- vertx.json
| |--- server.json
|
|--- drivers
| |
| |--- some-shaded-jdbc-driver.jar
|
|--- clickhouse-datasource-bridge.jar
Usually you don't need any of them but you can surely customize as needed.
File | Reloadable (Y/N) | Description |
---|---|---|
vertx.json | N | Vertx configuration, check here for more. |
server.json | N | Server configuration. |
httpd.json | N | Http server configuration, check here for more. |
datasources/*.json | Y | Named data sources. |
queries/*.json | Y | Named queries. |
vertx.json
This configuration file will be only read during server starting and it is NOT reloadable afterwards.
{
"maxWorkerExecuteTime": 300,
"maxWorkerExecuteTimeUnit": "SECONDS",
"workerPoolSize": 10
}
server.json
This configuration file will be only read during server starting and it is NOT reloadable afterwards.
{
"serverPort": 8080,
"requestTimeout": 5000,
"queryTimeout": 60000
}
datasources/named-data-source.json
datasources/named-query.json
Issue | Workaround | Remark |
---|---|---|
query timed out | server-side: 1) increase timeout in datasource configuration; 2) increase max_execution_timeout in server.json ; client-side: 1) increase clickhouse-jdbc-driver timeout |