Open knz opened 6 years ago
Sigh. Yeah, I guess we'll need to change this (over a couple of release cycles).
We need to support the BIT type properly ultimately. I am tempted to set up a plan to deprecate the current literal format in 2.1 so that it becomes available for BIT in 2.2.
No objection from me, although I'm interested in the migration plan.
@nvanbenschoten basically add a new syntax and a big flashy warning with the old one in 2.1; in 2.2 make the old syntax opt-in via a compat flag, and in 2.3 remove.
I think it might be nice to go through a cycle in which the b
and x
syntaxes are disallowed by default, instead of going straight from one meaning to another (to take advantage of our no-version-skip policy).
This ensures that even if users don't notice the warning they'll get a clear breakage instead of a change in behavior. We could make the new syntax available on an opt-in basis earlier than 2.3 if it's high enough priority.
I was more thinking of this:
Isn't make the old syntax opt-in
the warning we want for 2.1? That will certainly be noticed by users.
A warning can be ignored. Changing the thing to become opt-in is intrusive. I think we want a version in-between the current behavior and the one where the change is intrusive to warn users. Unless you think we can skip that step?
It seems to me that opting in (through a cluster setting, I presume) is roughly equivalent to acknowledging the warning, so it really isn't much more intrusive.
Unless you think we can skip that step?
I'll defer to Ben's judgment on that.
Ok we have a way forward which I am likely to explore for 2.1:
we support and document b'...'
(small b) for byte arrays; we happen to support B'....'
for them but that's an accident and not documented.
postgres supports and documents B'....'
(capital B) for bit literals; they do also support b'...'
for them but this is nowhere to be found in their docs.
So we can distinguish on the case of the "b" and properly fix #20991 and this one in one fell swoop.
Basically what I am proposing is to distinguish on the case of b
in 2.1 to add proper support for pg's BIT, but also deprecate the small b for byte arrays in 2.1 (to be removed fully in 2.2).
I have tried it and it works pretty well! :tada:
see #28807.
@knz after #28807 does this issue change?
So here's the current status:
Syntax | PostgreSQL | CockroachDB | Advertised feature in CockroachDB? |
---|---|---|---|
B'...' |
bit array using binary digits | bit array using binary digits | yes (since 19.1 - NB this used to have a different meaning prior to 2.1) |
X'...' |
bit array using hex digits | byte array using hex digits | yes (since 1.1) |
b'...' |
bit array using binary digits | byte array literal, literal bytes | yes (since 1.0) |
x'...' |
bit array using hex digits | byte array literal, literal bytes | no |
My opinions:
1) I am sad that we have a divergence.
2) I think it would be good to have at least one literal syntax for bit arrays using hex digits
3) that syntax should be using either x'...'
or X'...'
to have at least partial overlap with pg compat
4) a naive choice would be to say "since X'...'
is documented already but x'...'
is not, it's better to use x'...'
this way we inconvenience fewer crdb users". The choice is naive because I actually think neither X'...'
nor x'...'
is actually used.
5) Let's add telemetry to find out.
6) I'd like to point out that for consistency it would be nice to have some symmetry, I would like to say for example "Use the capitals B
and X
for bit arrays, and use the lowercase b
and x
for byte arrays." After all we're all about making data easy.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
format
x'aaaaaaa'
format
b'aaaaaaa'
In PostgreSQL, byte array literals use a different syntax, either:
e'ab\nc'::BYTEA
(escaped string, the same is achieved currently withb'...'
in crdb)'\xAAAA'::BYTEA
(i.e. string starting with\x
; the same is achieved currently withx'...'
in crdb)@nvanbenschoten @bdarnell what do you think? We need to support the BIT type properly ultimately. I am tempted to set up a plan to deprecate the current literal format in 2.1 so that it becomes available for BIT in 2.2. Thoughts?
Jira issue: CRDB-5687