ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0

Refactor ClickHouseStatement interface for data insertion #384

Open alex-krash opened 5 years ago

alex-krash commented 5 years ago

Goal: reduce the number of public proprietary methods exposed by ru.yandex.clickhouse.ClickHouseStatement.

Current situation: the interface exposes many data-manipulation methods that differ by:

  1. Input format
  2. Additional configuration params

Proposal: create a single method that returns a builder for data manipulation. The design is inspired by Spark's DataFrameWriter/DataFrameReader: https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/DataFrameWriter.html

Example showing all the configurable options:

ClickHouseStatement sth;

sth
   .write()
   .withDbParams(Map<ClickHouseQueryParam, String> dbParams) // optional
   .withExternalData(List<ClickHouseExternalData> data) // optional
   .format(ClickHouseFormat.CSV)
   .input(new FileInputStream("filename"))
   .table("my_table")
// or specify SQL
// .sql("INSERT INTO my_table (X,Y,Z) VALUES")
   .send(); // terminal operation, performs data insertion

For binary formats that require a callback:

sth.write().send("INSERT INTO my_table (x) VALUES ", new ClickHouseStreamCallback() {
    @Override
    public void writeTo(ClickHouseRowBinaryStream stream) throws IOException {
        // write rows to the stream here
    }
}, ClickHouseFormat.RowBinary);

Possible shortcuts for sending the data:

sth
   .write()
   .send("INSERT INTO my_table VALUES", InputStream stream, ClickHouseFormat.CSV);

sth
   .write()
   .sendToTable("my_table", InputStream stream, ClickHouseFormat.CSV);
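
To make the shape of the proposal concrete, here is a minimal sketch of such a builder. Everything in it is hypothetical: the class name InsertWriterSketch, the use of plain strings instead of ClickHouseFormat, and the rule that caller-supplied SQL is used verbatim are assumptions for illustration, not existing clickhouse-java API. Optional settings and external data are left out, and the terminal operation returns the statement text instead of talking to a server.

```java
import java.io.InputStream;

// Hypothetical sketch only: InsertWriterSketch is not part of clickhouse-java.
final class InsertWriterSketch {
    private String format;
    private String table;
    private String sql;
    private InputStream input;

    InsertWriterSketch format(String format) { this.format = format; return this; }
    InsertWriterSketch input(InputStream input) { this.input = input; return this; }

    // table(...) and sql(...) are alternatives; the last one called wins.
    InsertWriterSketch table(String table) { this.table = table; this.sql = null; return this; }
    InsertWriterSketch sql(String sql) { this.sql = sql; this.table = null; return this; }

    // Terminal operation. A real implementation would stream `input` to the
    // server; this sketch only validates state and returns the statement text.
    String send() {
        if (input == null) {
            throw new IllegalStateException("input(...) is required");
        }
        if (sql != null) {
            return sql; // assumption: caller-supplied SQL is used verbatim
        }
        if (table == null || format == null) {
            throw new IllegalStateException("table(...) and format(...) are required");
        }
        return "INSERT INTO " + table + " FORMAT " + format;
    }

    // Shortcut expressed as a thin delegation to the builder steps.
    String sendToTable(String table, InputStream stream, String format) {
        return table(table).input(stream).format(format).send();
    }
}
```

With this shape, the shortcuts add no behavior of their own: each one is only a fixed sequence of builder calls followed by send(), so the public surface stays small.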

@den-crane, @filimonov, you are an inexhaustible source of ideas; please share your opinion.

den-crane commented 5 years ago

I like this:

   .write()
   .send("INSERT INTO my_table VALUES", InputStream stream, ClickHouseFormat.CSV);

It should also support the equivalent of these clickhouse-client options: --format_csv_delimiter=";" --query="INSERT INTO test_table_log FORMAT CSVWithNames"

  .write()
  .withDbParams((new dbParams).add(format_csv_delimiter, ";").add())
  .input(new FileInputStream("filename"))
  .sql("INSERT INTO my_table (X,Y,Z) FORMAT CSVWithNames")
  .send();
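
For reference, ClickHouse's HTTP interface accepts settings as URL query parameters next to the query itself, so the request above could plausibly end up as a URL like the one this sketch builds. The class and method names (InsertUrlSketch, buildUrl) and the localhost:8123 endpoint are assumptions for illustration; this is not how the driver is known to build its requests.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical sketch only: not part of clickhouse-java.
final class InsertUrlSketch {

    // Appends the query and each setting as URL-encoded query parameters.
    static String buildUrl(String baseUrl, String query, Map<String, String> settings) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("?query=").append(encode(query));
        for (Map.Entry<String, String> e : settings.entrySet()) {
            url.append('&').append(encode(e.getKey()))
               .append('=').append(encode(e.getValue()));
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(ex);
        }
    }
}
```

Calling buildUrl("http://localhost:8123/", "INSERT INTO test_table_log FORMAT CSVWithNames", settings) with format_csv_delimiter set to ";" yields http://localhost:8123/?query=INSERT+INTO+test_table_log+FORMAT+CSVWithNames&format_csv_delimiter=%3B, with the CSV data itself going in the POST body.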