yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.93k stars 1.06k forks source link

[YCQL] Compatibility mode to implicitly CAST JSONB data (0x0080) to TEXT for non-Yugabyte-aware Cassandra clients #19712

Open nodakai opened 11 months ago

nodakai commented 11 months ago

Jira Link: DB-8517

Description

Most existing Cassandra clients will close connections when they encounter the 0x0080 type code for JSONB. While it's possible for queries to CAST the column AS TEXT, numerous user-friendly UI applications don't do this automatically in their quick table browsing view.

Proposal: we can add a new flag --ycql_treat_jsonb_as_text to TServer which enables the compatibility scheme below:

For YCQL-aware clients to opt-in to enable the "JSONB as JSONB" mode (status quo, disabling the compatibility scheme above), we can probably utilize OPTIONS / SUPPORTED / STARTUP messages:

  1. YCQL-aware clients should make sure to send OPTIONS
  2. YCQL TServer with --ycql_pretend_jsonb_as_text should include JSONB_COLUMNS=1 in the SUPPORTED message sent back as a reply
    1. If so, the client should add a non-standard option ENABLE_JSONB_COLUMNS=1 to the STARTUP message
    2. Then querying JSONB columns return data with 0x0080

This scheme can be extended to support more sophisticated encoding of JSONB like MessagePack in the future

Issue Type

kind/new-feature

Warning: Please confirm that this issue does not contain any sensitive information

ddorian commented 11 months ago

Hi @nodakai

What tools did you have a problem with?

nodakai commented 11 months ago

Jetbrains IDEs, for example (this article https://www.jetbrains.com/help/datagrip/apache-cassandra.html is about DataGrip which is tailored for database management but other IDEs of theirs have equal capability in this context). They appear to use a JDBC adapter https://github.com/DataGrip/cassandra-jdbc-driver built on top of (an older version of) the official Cassandra Java driver which doesn't implement JDBC.

(There's a known issue with the keyspace metadata incompatibility https://docs.yugabyte.com/preview/troubleshoot/nodes/trouble-common/#servererror-server-error-unknown-keyspace-cf-pair-system-peers-v2 So even my proposed workaround won't suffice for enhancing YCQL compatibility with JetBrains IDEs.)

I'm also considering these alternative GUIs https://dbschema.com/jdbc-driver/cassandra.html https://gist.github.com/attila123/d52d521a0f25fc912ec86a77c7c0a30d utilizing an independent OSS Cassandra JDBC driver developed by the DbSchema team.

Additionally, I'm exploring the integration of Trino with YCQL https://trino.io/docs/current/connector/cassandra.html Swapping Trino's default Cassandra driver for the YCQL-aware driver by Yugabyte seems like a complex endeavor (I haven't tried it yet). In this case, users could, in theory, be taught to adjust their queries to return texts (using ->> in stead of -> at the last level), but this workaround isn't user-friendly, especially for data scientists during the exploratory phase of their work. Unlike with PostgreSQL, forgetting to use the workaround could lead to a connection drop.

ddorian commented 11 months ago

I agree for GUI apps.

Additionally, I'm exploring the integration of Trino with YCQL

But for tools like Trino, you want to use our driver because it's cluster-aware (knows where the tablets are located) to make efficient queries & inserts.