This is an extension for Tableau Desktop / Tableau Server that simplifies connecting Tableau to ClickHouse and extends support for standard Tableau functionality when working with ClickHouse (compared to generic ODBC/JDBC).
Requirements
Tableau Desktop

Place clickhouse-jdbc-0.4.6-shaded.jar in:
macOS: ~/Library/Tableau/Drivers
Windows: C:\Program Files\Tableau\Drivers

Download clickhouse-jdbc.taco from the Releases page and place it in:
macOS: ~/Documents/My Tableau Repository/Connectors
Windows: C:\Users\[Windows User]\Documents\My Tableau Repository\Connectors
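The locations above (the ones under My Tableau Repository) can be double-checked with a small script. The following is only a sketch: the helper name `install_targets` and the `"macos"` / `"windows"` labels are hypothetical, and the paths are copied from the list above rather than read from Tableau itself.

```python
from pathlib import PurePosixPath, PureWindowsPath

DRIVER_JAR = "clickhouse-jdbc-0.4.6-shaded.jar"
CONNECTOR_TACO = "clickhouse-jdbc.taco"


def install_targets(system: str, home: str) -> dict:
    """Return the expected driver and connector file locations.

    `system` is "macos" or "windows" (labels chosen for this sketch);
    `home` is the user's home directory.
    """
    if system == "macos":
        base = PurePosixPath(home)
        return {
            "driver": base / "Library/Tableau/Drivers" / DRIVER_JAR,
            "connector": base / "Documents/My Tableau Repository/Connectors" / CONNECTOR_TACO,
        }
    if system == "windows":
        return {
            # The driver folder is machine-wide, not per user.
            "driver": PureWindowsPath(r"C:\Program Files\Tableau\Drivers") / DRIVER_JAR,
            "connector": PureWindowsPath(home) / "Documents" / "My Tableau Repository" / "Connectors" / CONNECTOR_TACO,
        }
    raise ValueError(f"unsupported system: {system}")


for name, path in install_targets("macos", "/Users/alice").items():
    print(name, "->", path)
```

On a real machine the returned paths could be passed to `pathlib.Path(...).exists()` to confirm that both files are in place before starting Tableau.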
Tableau Prep

Place clickhouse-jdbc-0.4.6-shaded.jar in:
macOS: ~/Library/Tableau/Drivers
Windows: C:\Program Files\Tableau\Drivers

Download clickhouse-jdbc.taco from the Releases page and place it in:
macOS: ~/Documents/My Tableau Prep Repository/Connectors
Windows: C:\Users\[Windows User]\Documents\My Tableau Prep Repository\Connectors
Tableau Server

Place clickhouse-jdbc-0.4.6-shaded.jar in:
Linux: /opt/tableau/tableau_driver/jdbc
Windows: C:\Program Files\Tableau\Drivers

On Linux, create the driver directory if it does not exist yet:
sudo mkdir -p /opt/tableau/tableau_driver/jdbc

Copy the driver into it, replacing [/path/to/file] with the path and [driver file name] with the name of the driver you downloaded:
sudo cp [/path/to/file/][driver file name].jar /opt/tableau/tableau_driver/jdbc

Set the file permissions, replacing [driver file name] with the name of the driver you downloaded:
sudo chmod 755 /opt/tableau/tableau_driver/jdbc/[driver file name].jar

Download clickhouse-jdbc.taco from the Releases page and place it into these folders on each node:
Linux: /opt/tableau/connectors
Windows: C:\Program Files\Tableau\Connectors

Restart Tableau Server:
tsm restart
If the Set Session ID checkbox on the Advanced tab is enabled (it is by default), you can set session-level settings using
SET my_setting=value;
In 99% of cases you don't need the Advanced tab. For the remaining 1%, the following settings are available:

socket_timeout is already specified. This parameter may need to be changed if some extracts take a very long time to update. Its value is specified in milliseconds. The rest of the parameters can be found here; add them in this field, separated by commas.

One field passes values to the custom_http_params parameter of the driver. For example, this is how session_id is specified when the Set Session ID checkbox is activated.

The type mapping field takes values such as UInt256=java.lang.Double,Int256=java.lang.Double. Read more about mapping in the corresponding section.

A further field accepts other driver parameters, for example jdbcCompliance. Be careful: parameter values in this field must be URL-encoded. If custom_http_params or typeMappings is passed both in this field and in the preceding two fields of the Advanced tab, the values from the preceding two fields take priority.

The Set Session ID checkbox generates a session_id with a timestamp and a pseudo-random number in the format "tableau-jdbc-connector-{timestamp}-{number}".
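Because parameter values passed through the Advanced tab must be URL-encoded, it can help to encode them with a script instead of by hand. The sketch below uses Python's standard `urllib.parse.quote`. The helper name `encode_param` is hypothetical, and the `name=value` layout is an assumption about what the field expects, not something taken from the connector.

```python
from urllib.parse import quote


def encode_param(name: str, value: str) -> str:
    """Return `name=value` with the value percent-encoded.

    safe="" makes quote() escape every reserved character,
    including '=', ',' and '/'. The name=value layout is an
    assumption made for this sketch.
    """
    return f"{name}={quote(value, safe='')}"


print(encode_param("typeMappings", "UInt256=java.lang.Double,Int256=java.lang.Double"))
# typeMappings=UInt256%3Djava.lang.Double%2CInt256%3Djava.lang.Double
```

Encoding only the value keeps the `=` that separates name and value intact, while the `=` and `,` inside the mapping list are escaped.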
By default, the driver displays fields of types UInt64, Int128, and (U)Int256 as strings, but it only displays them that way; it does not convert them. This means that a calculated field like the following produces an error:
LEFT([myUInt256], 2) // Error!
To work with large integer fields as strings, explicitly wrap the field in the STR() function:
LEFT(STR([myUInt256]), 2) // Works well!
However, such fields are most often used to count unique values (IDs such as Watch ID or Visit ID in Yandex.Metrika) or as a Dimension that sets the level of detail of a visualization, and both uses work well.
COUNTD([myUInt256]) // Works well too!
The data preview (View data) of a table with UInt64 fields no longer produces an error.
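Mapping large integers to java.lang.Double, as in the type mapping example given for the Advanced tab, can lose precision. The sketch below illustrates this outside Tableau. It relies only on a general property of 64-bit floating-point numbers and says nothing about how the connector or driver is implemented. The helper name `survives_double` is hypothetical.

```python
UINT64_MAX = 2**64 - 1  # largest UInt64 value


def survives_double(n: int) -> bool:
    """True if n is unchanged after a round trip through a 64-bit float."""
    return int(float(n)) == n


print(survives_double(2**53))       # True: still exactly representable
print(survives_double(UINT64_MAX))  # False: rounded to 2**64
print(float(UINT64_MAX) == float(UINT64_MAX - 1))  # True: two distinct IDs collapse
```

If such fields hold IDs that are counted with COUNTD(), distinct values that collapse into the same double would be counted once, which is one reason to think twice before overriding the default string display.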
MEDIAN_EXACT() and PERCENTILE_EXACT() (based on quantileExact()()).

ClickHouse has many more functions for data analysis than Tableau supports. For convenience, the connector adds new functions that are available in Live mode when creating Calculated Fields. The Tableau interface does not allow descriptions to be added to these functions, so they are described here.
-If Aggregation Combinator (added in v0.2.3): allows Row-Level Filters directly in an Aggregate Calculation. The SUM_IF(), AVG_IF(), COUNT_IF(), MIN_IF() and MAX_IF() functions have been added.

BAR([my_int], [min_val_int], [max_val_int], [bar_string_length_int]) (added in v0.2.1): Forget about boring bar charts! Use the BAR() function instead (equivalent of bar() in ClickHouse). For example, this calculated field returns nice bars as a String:

BAR([my_int], [min_val_int], [max_val_int], [bar_string_length_int]) + " " + FORMAT_READABLE_QUANTITY([my_int])

== BAR() ==
██████████████████▊ 327.06 million
█████ 88.02 million
███████████████ 259.37 million
COUNTD_UNIQ([my_field]) (added in v0.2.0): Calculates the approximate number of different values of the argument. Equivalent of uniq(). Much faster than COUNTD().

DATE_BIN('day', 10, [my_datetime_or_date]) (added in v0.2.1): Equivalent of toStartOfInterval() in ClickHouse. Rounds a Date or Date & Time down to the given interval, for example:

== my_datetime_or_date == | == DATE_BIN('day', 10, [my_datetime_or_date]) ==
28.07.2004 06:54:50 | 21.07.2004 00:00:00
17.07.2004 14:01:56 | 11.07.2004 00:00:00
14.07.2004 07:43:00 | 11.07.2004 00:00:00
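The rounding in the DATE_BIN example can be reproduced with a short sketch. It assumes that the 10-day bins are counted from the Unix epoch (1970-01-01), which is consistent with the three rows above, and it ignores time zones. The function name `date_bin_days` is hypothetical and is not part of the connector.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def date_bin_days(n_days: int, value: datetime) -> datetime:
    """Round `value` down to the start of its n-day bin.

    Bins are aligned to the Unix epoch (an assumption of this sketch).
    """
    days_since_epoch = (value - EPOCH).days
    return EPOCH + timedelta(days=days_since_epoch - days_since_epoch % n_days)


for text in ("28.07.2004 06:54:50", "17.07.2004 14:01:56", "14.07.2004 07:43:00"):
    value = datetime.strptime(text, "%d.%m.%Y %H:%M:%S")
    print(text, "|", date_bin_days(10, value).strftime("%d.%m.%Y %H:%M:%S"))
```

Under that assumption, 21.07.2004 is day 12620 since the epoch and 11.07.2004 is day 12610, so both are multiples of 10.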
FORMAT_READABLE_QUANTITY([my_integer]) (added in v0.2.1): Returns a rounded number with a suffix (thousand, million, billion, etc.) as a string. Useful for making big numbers readable by humans. Equivalent of formatReadableQuantity().

FORMAT_READABLE_TIMEDELTA([my_integer_timedelta_sec], [optional_max_unit]) (added in v0.2.1): Accepts a time delta in seconds. Returns the time delta broken down into (year, month, day, hour, minute, second) as a string. optional_max_unit is the maximum unit to show. Acceptable values: seconds, minutes, hours, days, months, years. Equivalent of formatReadableTimeDelta().

GET_SETTING([my_setting_name]) (added in v0.2.1): Returns the current value of a custom setting. Equivalent of getSetting().

HEX([my_string]) (added in v0.2.1): Returns a string containing the argument's hexadecimal representation. Equivalent of hex().

KURTOSIS([my_number]): Computes the sample kurtosis of a sequence. Equivalent of kurtSamp().

KURTOSISP([my_number]): Computes the kurtosis of a sequence. Equivalent of kurtPop().

MEDIAN_EXACT([my_number]) (added in v0.1.3): Exactly computes the median of a numeric data sequence. Equivalent of quantileExact(0.5)(...).

MOD([my_number_1], [my_number_2]): Calculates the remainder after division. If the arguments are floating-point numbers, they are first converted to integers by dropping the decimal portion. Equivalent of modulo().

PERCENTILE_EXACT([my_number], [level_float]) (added in v0.1.3): Exactly computes the percentile of a numeric data sequence. The recommended level range is [0.01, 0.99]. Equivalent of quantileExact()().

PROPER([my_string]) (added in v0.2.5): Converts a text string so that the first letter of each word is capitalized and the remaining letters are lowercase. Spaces and non-alphanumeric characters such as punctuation also act as separators. For example:

PROPER("PRODUCT name") => "Product Name"
PROPER("darcy-mae") => "Darcy-Mae"
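The capitalization rule described for PROPER() can be mimicked with a regular expression. This is only an illustration of the described behavior, not the connector's implementation. It assumes that a "word" is a run of ASCII letters and digits, and the function name `proper` is hypothetical.

```python
import re


def proper(text: str) -> str:
    """Capitalize the first letter of each alphanumeric run, lowercase the rest.

    Anything that is not an ASCII letter or digit acts as a separator
    and is left unchanged (an assumption of this sketch).
    """
    return re.sub(r"[A-Za-z0-9]+", lambda m: m.group(0).capitalize(), text)


print(proper("PRODUCT name"))  # Product Name
print(proper("darcy-mae"))     # Darcy-Mae
```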
RAND() (added in v0.2.1): Returns an integer (UInt32) number, for example 3446222955. Equivalent of rand().

RANDOM() (added in v0.2.1): The unofficial Tableau RANDOM() function, which returns a float between 0 and 1.

RAND_CONSTANT([optional_field]) (added in v0.2.1): Produces a constant column with a random value. Similar to a {RAND()} Fixed LOD, but faster. Equivalent of randConstant().

REAL([my_number]): Casts the field to float (Float64). Details here.

SHA256([my_string]) (added in v0.2.1): Calculates the SHA-256 hash of a string and returns the resulting set of bytes as a string (FixedString). Convenient to use with the HEX() function, for example, HEX(SHA256([my_string])). Equivalent of SHA256().

SKEWNESS([my_number]): Computes the sample skewness of a sequence. Equivalent of skewSamp().

SKEWNESSP([my_number]): Computes the skewness of a sequence. Equivalent of skewPop().

TO_TYPE_NAME([field]) (added in v0.2.1): Returns a string containing the ClickHouse type name of the passed argument. Equivalent of toTypeName().

TRUNC([my_float]): The same as the FLOOR([my_float]) function. Equivalent of trunc().

UNHEX([my_string]) (added in v0.2.1): Performs the opposite operation of HEX(). Equivalent of unhex().

The connector is being tested with the TDVT framework and currently maintains a 97% coverage ratio.
Originally developed by ANALYTIKA PLUS