StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

When using the JDBC catalog to query a column containing Chinese characters, the column length is calculated incorrectly. #46625

Open DeH40 opened 3 months ago

DeH40 commented 3 months ago

Steps to reproduce the behavior (Required)

  1. CREATE MYSQL TABLE

    CREATE TABLE `mysql_table_for_test` (
    `varchar_column` varchar(20) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

  2. INSERT DATA INTO MYSQL TABLE

    INSERT INTO test.mysql_table_for_test (varchar_column) VALUES ('这里只有几个汉字');

  3. CREATE JDBC CATALOG

    CREATE EXTERNAL CATALOG jdbc_catalog_test1
    PROPERTIES
    (
    "type"="jdbc",
    "user"="user",
    "password"="password",
    "jdbc_uri"="jdbc:mysql://host:port",
    "driver_url"="http://host:port/mysql-connector-java-8.0.28.jar",
    "driver_class"="com.mysql.cj.jdbc.Driver"
    );

  4. QUERY VIA JDBC CATALOG

    SELECT varchar_column FROM jdbc_catalog_test1.test.mysql_table_for_test;

Expected behavior (Required)

The query returns the data.

Real behavior (Required)

The query fails with an exception: [42000][1064] Value length exceeds limit on column[varchar_column], max length is [20], value is [这里只有几个汉字]

StarRocks version (Required)

3.1.2-4f3a2ee

ShaoxunLi commented 2 months ago

In StarRocks, the length of a VARCHAR is measured in bytes, but in MySQL it is measured in characters.
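
To make the mismatch concrete, here is a minimal sketch of the arithmetic in Python. It does not reproduce the actual StarRocks check; it assumes the value is encoded as UTF-8, and `fits` is a hypothetical helper used only for illustration.

```python
# The value inserted in the reproduction steps of this issue.
value = "这里只有几个汉字"
declared_length = 20  # from varchar(20) in the MySQL DDL

# MySQL interprets VARCHAR(n) as a limit on the number of characters.
char_count = len(value)

# A byte-based limit counts the encoded bytes instead (UTF-8 assumed here).
byte_count = len(value.encode("utf-8"))


def fits(length: int, limit: int) -> bool:
    """Hypothetical helper: does a measured length fit within the limit?"""
    return length <= limit


print(char_count, fits(char_count, declared_length))  # 8 True
print(byte_count, fits(byte_count, declared_length))  # 24 False
```

Each of the 8 Chinese characters takes 3 bytes in UTF-8, so the same value is 8 characters but 24 bytes. It fits in varchar(20) when counted in characters, yet exceeds a limit of 20 when counted in bytes, which matches the error reported above.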

DeH40 commented 2 months ago

> The length of varchar in StarRocks is in bytes, but in MySQL it is in characters.

Got it. Are you planning to fix the problem caused by this difference in how VARCHAR length is calculated?