cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.08k stars 3.8k forks source link

sql: CockroachDB's extended literal formats are not compatible with postgres #26128

Open knz opened 6 years ago

knz commented 6 years ago

In PostgreSQL, byte array literals use a different syntax, either:

@nvanbenschoten @bdarnell what do you think? We need to support the BIT type properly ultimately. I am tempted to set up a plan to deprecate the current literal format in 2.1 so that it becomes available for BIT in 2.2. Thoughts?

Jira issue: CRDB-5687

bdarnell commented 6 years ago

Sigh. Yeah, I guess we'll need to change this (over a couple of release cycles).

nvanbenschoten commented 6 years ago

We need to support the BIT type properly ultimately. I am tempted to set up a plan to deprecate the current literal format in 2.1 so that it becomes available for BIT in 2.2.

No objection from me, although I'm interested in the migration plan.

knz commented 6 years ago

@nvanbenschoten basically add a new syntax and a big flashy warning with the old one in 2.1; in 2.2 make the old syntax opt-in via a compat flag, and in 2.3 remove.

bdarnell commented 6 years ago

I think it might be nice to go through a cycle in which the b and x syntaxes are disallowed by default, instead of going straight from one meaning to another (to take advantage of our no-version-skip policy).

This ensures that even if users don't notice the warning they'll get a clear breakage instead of a change in behavior. We could make the new syntax available on an opt-in basis earlier than 2.3 if it's high enough priority.

knz commented 6 years ago

I was more thinking of this:

nvanbenschoten commented 6 years ago

Isn't make the old syntax opt-in the warning we want for 2.1? That will certainly be noticed by users.

knz commented 6 years ago

A warning can be ignored. Changing the thing to become opt-in is intrusive. I think we want a version in-between the current behavior and the one where the change is intrusive to warn users. Unless you think we can skip that step?

nvanbenschoten commented 6 years ago

It seems to me that opting in (through a cluster setting, I presume) is roughly equivalent to acknowledging the warning, so it really isn't much more intrusive.

Unless you think we can skip that step?

I'll defer to Ben's judgment on that.

knz commented 6 years ago

Ok we have a way forward which I am likely to explore for 2.1:

So we can distinguish on the case of the "b" and properly fix #20991 and this one in one fell swoop.

knz commented 6 years ago

Basically what I am proposing is to distinguish on the case of b in 2.1 to add proper support for pg's BIT, but also deprecate the small b for byte arrays in 2.1 (to be removed fully in 2.2).

knz commented 6 years ago

I have tried it and it works pretty well! :tada:

see #28807.

jordanlewis commented 5 years ago

@knz after #28807 does this issue change?

knz commented 5 years ago

So here's the current status:

Syntax PostgreSQL CockroachDB Advertised feature in CockroachDB?
B'...' bit array using binary digits bit array using binary digits yes (since 19.1 - NB this used to have a different meaning prior to 2.1)
X'...' bit array using hex digits byte array using hex digits yes (since 1.1)
b'...' bit array using binary digits byte array literal, literal bytes yes (since 1.0)
x'...' bit array using hex digits byte array literal, literal bytes no

My opinions:

1) I am sad that we have a divergence. 2) I think it would be good to have at least one literal syntax for bit arrays using hex digits 3) that syntax should be using either x'...' or X'...' to have at least partial overlap with pg compat 4) a naive choice would be to say "since X'...' is documented already but x'...' is not, it's better to use x'...' this way we inconvenience fewer crdb users". The choice is naive because I actually think neither X'...' nor x'...' is actually used. 5) Let's add telemetry to find out. 6) I'd like to point out that for consistency it would be nice to have some symmetry, I would like to say for example "Use the capitals B and X for bit arrays, and use the lowercase b and x for byte arrays." After all we're all about making data easy.

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!