ClickHouse / clickhouse-java

ClickHouse Java Clients & JDBC Driver
https://clickhouse.com
Apache License 2.0
1.45k stars 537 forks source link

leading slash was added/displayed to the beginning of the field of IPv4 date type #1307

Closed patrickdung closed 1 year ago

patrickdung commented 1 year ago

Describe the bug

Backend is Clickhouse Using helm chart

customConfig:
  data_dir: /vector-data-dir
  api:
    enabled: true
    address: 127.0.0.1:8686
    playground: false
  sources:
    source_k8s_logs:
      type: kubernetes_logs
      pod_annotation_fields:
        pod_ip: kubernetes_pod_ip
        pod_ips: kubernetes_pod_ips
        pod_name: kubernetes_pod_name
        pod_namespace: kubernetes_pod_namespace
        pod_node_name: kubernetes_pod_node_name
 sinks:
   vector_to_clickhouse:
      type: clickhouse
      inputs:
        - source_k8s_logs
      endpoint: http://ch1.ch.svc.cluster.local:8123
      healthcheck:
        enabled: false
      database: logdb
      table: vector_k8s_logs_v2
      auth:
        strategy: basic
        user: logdb
        password: removed
      skip_unknown_fields: true
      buffer:
        max_events: 5000
      batch:
        timeout_secs: 40
      encoding:
        timestamp_format: rfc3339

Here's CH schema

CREATE TABLE IF NOT EXISTS
  logdb.vector_k8s_logs
(
   `timestamp` String,
   `parsed_timestamp` DateTime64(9) DEFAULT
      parseDateTime64BestEffort(timestamp, 9) CODEC(Delta(2), ZSTD(1)),
   `file` String,
   `kubernetes_pod_ip` IPv4,
   `kubernetes_pod_ips` Array(IPv4),
   `kubernetes_pod_labels` Map(LowCardinality(String), String),
   `kubernetes_pod_name` LowCardinality(String),
   `kubernetes_pod_namespace` LowCardinality(String),
   `kubernetes_pod_node_name` LowCardinality(String),
   `message` String,
   `source_type` LowCardinality(String),
   `stream` Enum('stdout', 'stderr')
)
ENGINE MergeTree
PARTITION BY toDate(parsed_timestamp)
ORDER BY parsed_timestamp;

The record should be fine when I use CH cli to view it. When I use DBeaver via JDBC http, for kubernetes_pod_ip and kubernetes_pod_ips, there is a leading slash. image

If I use SQLworkBench, the leading slash exists for kubernetes_pod_ips (Array of IPv4).. image

For CH, if I use String data type instead of IPv4, it are not leading slash.

Steps to reproduce

  1. insert data from vector to CH
  2. view the data via CH JDBC driver

Expected behaviour

No leading slash is added

Code example

See above

Error log

Configuration

Environment

ClickHouse server

zhicwu commented 1 year ago

Thanks for the report @patrickdung.

In JDBC driver, IPv4 is mapped to java.net.Inet4Address and similarly java.net.Inet6Address for IPv6. Database client usually don't recognize them, hence toString() was called for rendering and that's where the leading slash came from. If you prefer String data type, you may change your query to something like select toIPv4('127.0.0.1')::String, toIPv6('::1')::String.