knz opened 6 years ago
@awoods187 I'd like to work to address this over the summer (next milestone or the one after that). This would solve a large class of compatibility problems.
/cc @andy-kimball
`bit` incorrectly has a maximum size (and its underlying implementation is currently incorrect). A fix is planned in the 2.2 time frame, but it may be considered a bugfix and backported to 2.1 (with a documented known limitation at the 2.1 release).

Here's a micro-RFC that I'd like to get feedback on before starting on the implementation.
Updated 2018-12-05
We currently assume that `INT` really means `INT8`; however, this is inconsistent with PostgreSQL and various (Java) ORMs that assume `INT` means `INT4`. We want a transition path such that `INT` means `INT4` in CockroachDB.
- Tables created with `INT --> INT8` semantics on 2.1 will work correctly on 2.2 and also on 2.Next with only a flag change.
- Perform the `INT --> INTx` mapping as close to parse time as possible, to avoid needing to plumb type-mapping logic down into all of the various `ColumnType` support functions or schema introspection tables.
- Handle unsized `INT` values in as few places as possible, to minimize cognitive load and change-sprawl.
- Add a session variable `default_int_size` whose valid values are 8 (the default) and 4.
- SQL parsed from user input containing `INT` or `INTEGER` reflects the `default_int_size` setting by substituting `INT4` or `INT8` as appropriate.
  - Exception: When a column's default is set to an INT8-returning function, e.g. `unique_rowid()`, it should be an `INT8`.
- Add an upgrade ratchet for `TableDescriptors` to ensure that any `SemanticType=INT` columns with unspecified `Width` are reset to `Width=64`.
- User SQL parsed with `INT := INT4` will simply have `INT4` in the parse tree. Other SQL not coming in from user input will continue to use `INT := INT8` to preserve the historical behavior.
- `pg_attribute` should return correct `atttypid` values, instead of always returning 20 (`INT8`) for all int types.
- Update `information_schema.columns` to remove `character_maximum_length` for int types, which pg leaves unpopulated. The docs on `ColumnType.MaxCharacterLength()` are inconsistent with the implementation.
- ~~The `information_schema.columns.crdb_sql_type` column should return pre-2.2 values until the `INT --> INT8` upgrade ratchet has fired.~~
- Update the `information_schema` and `pg_class` logic tests to match pgsql for e.g. `create table x(a int, aa integer, b int4, c int8, d bigint, e int2, f smallint);` when `default_int_size=4`, and document differences when in 8-byte mode.
- Ensure that casts such as `'1'::INT` result in the correct data width when `default_int_size` is set.
- Treat a `ColumnType` with `Width == 0` as an error (e.g. `types.go: FromColumnType()`).
- Remove `ColumnType.VisibleType` values where possible, since we can always back them out of the non-zero `ColumnType.Width`.

There shouldn't be any need to alter the `ColumnType` message. The only observable change for a 2.1 node is that 2.2 gateways would never create a `ColumnType { SemanticType: INT, Width: 0 }`; the width would always be set to some explicit value.
If a column is created on a 2.1 gateway in the mixed-mode case, the upgrade ratchet will update the `TableDescriptor` to act as though it had been created with `INT --> INT8` semantics on a 2.2 gateway.
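To illustrate the intent, here is a hypothetical session under the proposed behavior (a sketch only; it assumes `default_int_size` is implemented as described above, including the `unique_rowid()` exception):

```sql
-- Hypothetical session, assuming the proposed default_int_size variable.
SET default_int_size = 4;

CREATE TABLE t (
    a INT,                        -- substituted to INT4 at parse time
    b INT DEFAULT unique_rowid()  -- exception: stays INT8, because the
                                  -- default is an INT8-returning function
);

-- SHOW CREATE would then report a as INT4 and b as INT8.
```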
LGTM
> Exception: When a column's default is set to an INT8-returning function, e.g. unique_rowid(), it should be an INT8.

When this rule is invoked we should emit a warning.

I recommend you file separate issues linked to this one for the various required bug fixes.
> Exception: When a column's default is set to an INT8-returning function, e.g. unique_rowid(), it should be an INT8.

I would restrict that to the expression `unique_rowid()` specifically, not just any INT expression.
LGTM otherwise! Thanks for the analysis.
@bdarnell

> When this rule is invoked we should emit a warning.

We don't have infrastructure for pgwire warnings yet.
Personally for UX I would prefer introducing table/column comments (#19472) and put the warning in a comment instead.
Hello, thought I would chime in with my experiences with this as a new cockroachdb user.
I was trying to experiment with cockroachdb instead of postgres with a webapp I'm making, and ran into the problem of INT not being equivalent to INT4, specifically in the postgres SERIAL type.
The app is written with Rust, and using the diesel ORM so the types are pretty strictly enforced.
This seems like a very common setup in postgres/sql so would make migration easier... Here's a search on github showing over 200k instances of this:
https://github.com/search?q=%22+SERIAL+PRIMARY+KEY%22&type=Code
I'm wondering if there will ever be a way to have a SERIAL column in cockroachdb that is an INT4? Seems weird to advertise postgres compatibility when this isn't supported.
edit: my post came off a bit too negative. Now that I think about this some more it isn't too bad. I would just have to migrate my db to use bigserial and change my app code from i32 to i64 in a couple places. The downsides of that are that the ids in my urls are a bit longer (not that big of a deal), and that I 'waste' 4 extra bytes per entry. I have fewer than a million rows so that change is pretty insignificant.

edit2: ran into another problem now. My ORM assumes that the returned value is the width of the data type. So I have an INT4 stored and it gets returned to me from cockroach as an INT8... erasing some type data I had, and taking up more space. I don't need 64 bits to represent a number between 1-100.
@nocduro we are working on it.
In the meantime, please know that we have introduced a partial compatibility mode in CockroachDB, so you can use INT4 with a sequence for SERIAL. See for yourself:
```
root@127.24.137.227:42763/defaultdb> set experimental_serial_normalization = sql_sequence;
SET

root@127.24.137.227:42763/defaultdb> create table t(x serial4);
CREATE TABLE

root@127.24.137.227:42763/defaultdb> show create t;
  table_name |                     create_statement
+------------+-----------------------------------------------------------+
  t          | CREATE TABLE t (
             |     x INT4 NOT NULL DEFAULT nextval('t_x_seq1':::STRING),
             |     FAMILY "primary" (x, rowid)
             | )
(1 row)
```
The gist of this is that you need to 1) set `experimental_serial_normalization` and 2) use INT4, SERIAL4 etc. explicitly (the word "INT" without a size will still automatically translate to INT8).
Regarding your final note:
> edit2: ran into another problem now. My ORM assumes that the returned value is the width of the data type. So I have an INT4 stored and it gets returned to me from cockroach as an INT8...
Please provide more information. If this is a bug we need to fix it.
Status update:
Right now, I’ve explored three different ways of getting this INT change in:
1. In `opt`, when converting from `coltypes.T -> types.T -> coltypes.T`. This breaks queries such as `SELECT a::INT FROM (SELECT '4294967296' as a)` when in INT4-mode: the optimizer unwinds this to a `DInt`, and we lose both the cast to what should be INT4 and the range-check error.
2. In the `Scanner` type, which implements `sqlLexer`. The `sql.y` grammar is changed so that receiving an INT token delegates to the `Scanner.defaultIntType` field. The scanner lives in `conn`, which adds the generated AST to a buffer to be drained by `connExecutor`; I was running into problems finding a path from `conn` to the `SessionData` owned by `connExecutor`.
3. In `connExecutor`. The `tree.Statement` and `tree.Expr` types are currently hit-or-miss in terms of being able to run a visitor over them.

Based on a chat yesterday with @jordanlewis, I think option 2 is once again viable since we have identified a way to get to the `SessionData` from `conn`.
Regardless of implementation complexity, option 2 would be the obvious approach if we could make a knife-edge cutover: we want to make `INT` behave like `FLOAT`, in which we assign an explicit size to the type whenever you use it.
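For comparison, `FLOAT` already works this way today: the unsized name is resolved to an explicitly sized type as soon as the statement is processed. A quick sketch (display output may vary by version):

```sql
CREATE TABLE f (x FLOAT);
SHOW CREATE f;
-- The column is reported as FLOAT8; the unsized spelling "FLOAT"
-- does not survive past parsing.
```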
Open question:
For the record my initial attempt went towards solution 2, then I got stuck in a different place than the one you identify. So my suspicion is that there will be multiple obstacles there.
For solution 1 you can look at https://github.com/bobvawter/cockroach/pull/1 - I think we can push forward there.
I do think we need to spend time on solution 3 no matter what. I'd just like that work direction to be decoupled from our roadmap goals so it does not become time-pressured.
Meeting notes about solution 2.
The setting needs to be plumbed into:

- `ParseOne` and the 10+ callers of `Parse` (including `pgwire`); this is what is mentioned in the comment above, https://github.com/cockroachdb/cockroach/issues/26925#issuecomment-442054112 (or we remove `coltypes.Int` altogether)
- `cockroach sqlfmt`
- `SHOW SYNTAX`

For restore + import + `SHOW SYNTAX` we need to plumb the parameter for correctness.
So for solution 2, the overall action item list is:

- For the plumbing into pgwire specifically: `handleParse`.
About solution 1:

- `CAST`, `ANNOTATE_TYPE` and `IS OF TYPE` -- all these go through a single path of `TypeCheck`, where the naked INT could be eliminated (at that point).
- `MakeColumnDefDescs`.
- For `CAST`, which needs to keep the result of the desugaring (ANNOTATE_TYPE and IS OF TYPE are eliminated during type checking), we need to keep the result of the transform.
- In `opt` code the result must be translated to the scalar expression for `CastExpr` - see the change in https://github.com/bobvawter/cockroach/pull/1/files

@knz what's the status of this item? Can we get some sort of checkbox for what has already been addressed?
Hey @knz - quick question. Customer just asked about whether or not this issue would address https://github.com/jOOQ/jOOQ/issues/8478. It looks like JOOQ's problem has to do with strings and names and I don't see them mentioned here.
@awoods187 I'll need to build an explanatory summary — with what works and what doesn't. Please come back to me to remind me, chances are I will lose track of this particular issue.
@tim-o the linked jOOQ issue is wholly unrelated. I have replied separately on the linked thread.
So, upon request by @andy-kimball and @awoods187 I have analyzed the situation further.
Generally, CockroachDB has stopped producing column descriptors with width=0 since 19.1 (thanks to Bob's and my changes).
However, we need to think about mixed-version compatibility. Because a 2.0/2.1 node could still produce Width=0, a 19.1 node has to be able to understand it. In 19.2 this is not necessary any more, so we can probably drop this support. However, this requires a descriptor upgrade function, so that any descriptor created/stored pre-19.2 is upgraded to width=64. This work remains to be done.
As to the behavior in the future: there is a challenge we need to overcome by providing a good solution. The problem to solve is a contradiction:

- Clients that provide a type name in the input syntax expect the same name (`pg_catalog.pg_type.typname` ~ `information_schema.columns.data_type`) to come back out of introspection.
- In PostgreSQL, the name "integer" in the input syntax maps to `int4`, and conveniently the catalog name of `int4` is .... `integer`. This means clients that enter "integer" in CREATE also expect "integer" in introspection.
- Meanwhile, the name "integer" in the input syntax in CockroachDB maps internally to the "udt" `int8`, and its catalog name is .... well ... we haven't decided that yet.

In both v2.0 and v2.1, we kept it to "int"/"integer" by using width=0, so that the introspection would match the input. In v19.1, we have broken the introspection by forcing a width.

(The contradiction is between the last two bullet points.)
The way I see this is the following:

- ~~Make `int` in the input syntax map to some internal type (e.g. `int8`) and have that internal type reflect back as `integer` in introspection. I think this is a moderately easy step.~~ (Edit: see my comment below)
- Change `default_int_size` to also influence introspection somehow. Today, that flag changes the mapping from input syntax to internal type, but by doing so it breaks the input-introspection loop. We need to further change it so that the input-introspection loop is preserved for clients that are dead-set on 32-bit integers. When the flag is set, either the internal type `int4` or `int8` can map back to `int` in introspection (not just `int8`).

@andy-kimball @awoods187 Note: the last column in the table above says "19.2", however I think it may be worth making this a bug fix and releasing it in a 19.1 patch release. The experience of clients will be rather uncomfortable until we fix that.
Erratum: in my explanation above, the "catalog type name" shows up in `information_schema`, not in `pg_catalog`. The remainder of the argument remains the same.
@andy-kimball for your education specifically: PostgreSQL has two separate introspection facilities for type names:

- `pg_catalog.pg_type.typname` shows the "OID name" for the type; this is used throughout `pg_catalog` and is usually what's used by pg-specific applications. In that column, you'll see `float8` when a column was created as `FLOAT`, or `_int2` if a column was created as `INT2[]`.
- `information_schema.columns.data_type` shows the "information schema name" (what I called "catalog name" above) for the type. This follows the SQL standard names for types. In that column, you'll see `double precision` when a column was created as `FLOAT`, or `ARRAY` if a column was created as `INT2[]` (other columns are used to further detail the type).
Check this out.
```
kena=> create table t(a int, b int4, c int8, d int2);
CREATE TABLE

kena=> select column_name,data_type from information_schema.columns where table_name='t';
 column_name | data_type
-------------+-----------
 a           | integer
 b           | integer
 c           | bigint
 d           | smallint
(4 rows)

kena=> select a.attname, t.typname from pg_attribute a, pg_class c, pg_type t where a.attrelid=c.oid and c.relname='t' and a.atttypid=t.oid and a.attname in ('a','b','c','d');
 attname | typname
---------+---------
 a       | int4
 b       | int4
 c       | int8
 d       | int2
(4 rows)
```
```
root@127.134.226.148:35327/defaultdb> create table t(a int, b int4, c int8, d int2);
CREATE TABLE

root@127.134.226.148:35327/defaultdb> select column_name,data_type from information_schema.columns where table_name='t';
  column_name | data_type
+-------------+-----------+
  a           | bigint
  b           | integer
  c           | bigint
  d           | smallint
  rowid       | integer
(5 rows)

root@127.134.226.148:35327/defaultdb> select a.attname, t.typname from pg_attribute a, pg_class c, pg_type t where a.attrelid=c.oid and c.relname='t' and a.atttypid=t.oid and a.attname in ('a','b','c','d');
  attname | typname
+---------+---------+
  a       | int8
  b       | int8
  c       | int8
  d       | int8
(4 rows)
```

(the `pg_catalog` introspection is utterly broken)
(Also, the `rowid` should be hidden in `information_schema.columns`. That's an unrelated bug. I hadn't noticed before.)
```
root@127.0.0.1:39805/defaultdb> create table t(a int, b int4, c int8, d int2);
CREATE TABLE

root@127.0.0.1:39805/defaultdb> select column_name,data_type from information_schema.columns where table_name='t';
  column_name | data_type
+-------------+-----------+
  a           | integer
  b           | integer
  c           | bigint
  d           | smallint
  rowid       | integer
(5 rows)

root@127.0.0.1:39805/defaultdb> select a.attname, t.typname from pg_attribute a, pg_class c, pg_type t where a.attrelid=c.oid and c.relname='t' and a.atttypid=t.oid and a.attname in ('a','b','c','d');
  attname | typname
+---------+---------+
  a       | int8
  b       | int8
  c       | int8
  d       | int8
(4 rows)
```
The desired output will be, with `default_int_size = 8`:
```
root@127.0.0.1:39805/defaultdb> select column_name,data_type from information_schema.columns where table_name='t';
  column_name | data_type
+-------------+-----------+
  a           | bigint
  b           | integer
  c           | bigint
  d           | smallint
  rowid       | bigint
(5 rows)

root@127.0.0.1:39805/defaultdb> select a.attname, t.typname from pg_attribute a, pg_class c, pg_type t where a.attrelid=c.oid and c.relname='t' and a.atttypid=t.oid and a.attname in ('a','b','c','d');
  attname | typname
+---------+---------+
  a       | int8
  b       | int4
  c       | int8
  d       | int2
(4 rows)
```
And when `default_int_size = 4`:
```
root@127.0.0.1:39805/defaultdb> select column_name,data_type from information_schema.columns where table_name='t';
  column_name | data_type
+-------------+-----------+
  a           | integer
  b           | integer
  c           | bigint
  d           | smallint
  rowid       | bigint
(5 rows)

root@127.0.0.1:39805/defaultdb> select a.attname, t.typname from pg_attribute a, pg_class c, pg_type t where a.attrelid=c.oid and c.relname='t' and a.atttypid=t.oid and a.attname in ('a','b','c','d');
  attname | typname
+---------+---------+
  a       | int4
  b       | int4
  c       | int8
  d       | int2
(4 rows)
```
I am just noticing that CockroachDB 19.1 still produces Width=0 for the auto-generated `rowid` column. That's not good. It will prevent removing that logic in 19.2 due to cross-version compatibility, unless we fix it in a 19.1 patch release.
@kena thanks for the detailed report, as this is something I need to care about while maintaining support for the CockroachDB driver in SQLBoiler (https://github.com/glerchundi/sqlboiler-crdb).
Good job! 👏🏻
@andy-kimball @jordanlewis @knz what's left to do here after unifying the type changes?
@awoods187 there are a couple of small catalog changes that need to be made. So far I don't believe them to be particularly important.
The main decision point is whether or not we want to change what `integer` means from `int8` to `int4` by default. I personally believe that we should not change this. I don't think it's a major sticking point, changing it will be a headache, and people or tests who really need this (for some reason) can change the session variable.
I think it is a major sticking point - java and javascript are both sensitive to integer sizes, and session variables are awkward to use in many client drivers. I'm not sure whether it clears the bar for a significantly backwards-incompatible change, but I think maybe yes.
Note that it wouldn't make sense to change `integer` to `int4` without also changing `serial` to `serial4` (and therefore defaulting `experimental_serial_normalization` to `sql_sequence`). That has a big performance impact, and I'm more scared of making this change than just changing the `integer` type.
Yes, they're sensitive to integer sizes, but since we now correctly report the type OIDs for all integer types (we didn't in 19.1), drivers will not have problems when they see an `int8` for an `integer`.
The grand types refactor had at least one nice benefit!
Could we leave the cluster default at `int8` if it's an existing cluster, but default to `int4` for new clusters? I.e., treat it as a cluster setting rather than a session setting.
> Could we leave the cluster default at int8 if it's an existing cluster, but default to int4 for new clusters? i.e. treat it as a cluster setting rather than a session setting

It's actually both a cluster and a session setting (the cluster setting is used as the default for the session setting), so that's exactly what would happen if we made this change. Existing clusters would continue to use `integer=int8` while new ones would use `integer=int4`. So we wouldn't break any existing clusters, but cause a divergence between older prod clusters and newer dev/test clusters.
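A sketch of what that looks like operationally, assuming the cluster setting is `sql.defaults.default_int_size` (the cluster-wide default that seeds each new session's `default_int_size`):

```sql
-- Set the cluster-wide default; existing sessions are unaffected.
SET CLUSTER SETTING sql.defaults.default_int_size = 4;

-- A freshly opened session then starts with:
SHOW default_int_size;    -- 4
CREATE TABLE t (x INT);   -- x is created as INT4

-- Any individual session can still opt back into the old behavior:
SET default_int_size = 8;
```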
Another potentially scary thing is that `integer` tables in a cluster that got upgraded from 19.1 to 19.2 would have a different width than `integer` tables that got created in the same cluster after the upgrade was finalized. I think that it could be rather confusing for applications.

I still think that making this change would be more disruptive than helpful. Note that I'm assuming that drivers work properly because of the OID change I mentioned above. I haven't seen any evidence that refutes that assumption - if there were such evidence I think I'd need to reconsider.
I think we should consider defaulting to int4, as the vast majority of our users are still in front of us. We should have a bias towards compatibility and toward being forward-looking, as long as we have escape hatches for users upgrading.
I still haven't found a compelling, real user problem caused by the mapping of `integer` to `int8` instead of `int4` by default. The real compatibility bugs that were discussed in this thread are solved now, as far as I know, but I'd love to see evidence to the contrary.
Concrete example: the migration tool in `knex` (a javascript query builder). It creates a table with an `INTEGER` column to track its state. Since javascript doesn't have a 64-bit integer type, the `pg` driver converts `INT8` values into decimal strings, but `knex` expects javascript number objects. It is therefore impossible to use the `knex` migration tool out of the box unless you set the cluster setting to make `integer=int4`.
This isn't a clear-cut case - the cluster setting workaround works, and even that could be avoided if the tool were changed to specify a smaller integer type (but this is tricky and DB-specific - `int4` isn't available everywhere). The SQL standard doesn't specify the size of the standard types SMALLINT, INTEGER, and BIGINT, but JDBC gives us a de facto standard: SMALLINT is 16 bits, INTEGER is 32, and BIGINT is 64. Doing anything else will be an ongoing source of friction, since anyone who uses java or JS will need to either change their schemas or set this cluster/session setting. And as Andy says, the majority of our usage is still in front of us.
Okay, that's compelling enough.
> Note that it wouldn't make sense to change integer to int4 without also changing serial to serial4 (and therefore defaulting experimental_serial_normalization to sql_sequence). That has a big performance impact, and I'm more scared of making this change than just changing the integer type.
Why doesn't it make sense to change just `integer`?
This example used an `INTEGER`, but `SERIAL` is also very common and raises exactly the same issues. Changing one but not the other means that we cause the backwards-compatibility headaches for existing CockroachDB users, but we also have to continue to recommend the cluster/session settings for future users. I think if we're going to make a backwards-incompatible change for the sake of better compatibility with other databases, we should go all the way and change both types.
@rafiss @jordanlewis should this be tracked in the @cockroachdb/sql-experience board?
We definitely shouldn't go and change the width of integers on already defined table data. If we retain the current default value for upgraded clusters and bootstrap new clusters with the new default, I think that could go a long way towards achieving the goals without breaking too much backwards compat. I vote we go with `sql_sequence_cached` for the serial normalization. On the whole, I support the movement, and it does make sense for compat, but we should retain our escape hatch.
Something to be aware of with regards to compatibility (both here and elsewhere): we don't just need to be backward compatible on existing clusters, but also for new clusters that are expected to have the same behavior as an existing cluster. Consider a common setup: a production cluster, a staging cluster, and N development clusters running locally on engineers' workstations. The production cluster is likely to have been created a long time ago and upgraded through many major versions. Similar story for the staging cluster, though it may get wiped and recreated periodically if something goes horrifically wrong. The local development clusters will get wiped frequently. And if engineers didn't wipe their development clusters frequently, new engineers would create new local development clusters which have definitely not gone through the same upgrade path as the clusters of other engineers. This isn't a theoretical problem. The difference in the history of clusters has caused migrations on the CC control plane DB to behave differently.
I didn't read the whole discussion but just wanted to say that through JDBC, when asking for the column metadata of an `integer array` column, it will report the column type `_int8` whereas it should report `_int4`. Kind of annoying in Hibernate tests that rely on proper types being reported, but not a road blocker.
@beikov you can influence this by either:

- setting `default_int_size`, or
- using `int4[]` explicitly (instead of `integer[]`).
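For example (a sketch contrasting the two spellings):

```sql
-- Uses the unsized name: with the default settings the element type
-- is int8, so JDBC metadata reports _int8.
CREATE TABLE a (vals integer[]);

-- Spells the width explicitly: the element type is int4, so JDBC
-- metadata should report _int4.
CREATE TABLE b (vals int4[]);
```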
Thanks for the hint. I'll adapt the `CockroachDialect` in Hibernate then as part of https://hibernate.atlassian.net/browse/HHH-15528.
The next issue I have run into is that `serial4` produces a value that is too big to fit into a 4-byte integer. With a table:
```sql
create table TriggerEntity (
    id serial4 not null,
    name varchar(255),
    primary key (id)
)
```
extracting the generated value with `ResultSet#getInt` results in:
```
Caused by: org.postgresql.util.PSQLException: Bad value for type int : 798864890501038081
	at app//org.postgresql.jdbc.PgResultSet.toInt(PgResultSet.java:3218)
	at app//org.postgresql.jdbc.PgResultSet.getInt(PgResultSet.java:2408)
	at app//org.postgresql.jdbc.PgResultSet.getInt(PgResultSet.java:2836)
```
Pseudo-code:
```java
var stmt = connection.prepareStatement(
    "insert into TriggerEntity(name) values ('abc')",
    PreparedStatement.RETURN_GENERATED_KEYS );
stmt.executeUpdate();
var rs = stmt.getGeneratedKeys();
rs.next();
var id = rs.getInt( 1 );
```
If you run the `create table` in an interactive session, you'd see what CockroachDB thinks about `serial4`:
```
NOTICE: upgrading the column x to INT8 to utilize the session serial_normalization setting
HINT: change the serial_normalization to sql_sequence or sql_sequence_cached if you wish to use
a smaller sized serial column at the cost of performance.
See https://www.cockroachlabs.com/docs/v22.1/serial.html
```
If you want `serial4` to behave exactly like in postgres, at the expense of performance, there are some options. I encourage you to peruse the linked documentation page.
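For instance, switching the normalization before creating the table should yield a genuinely 4-byte column (a sketch, using the setting named in the notice above; `sql_sequence_cached` trades strict sequence ordering for speed):

```sql
SET serial_normalization = sql_sequence;

CREATE TABLE TriggerEntity (
    id serial4 NOT NULL,   -- now becomes INT4 DEFAULT nextval(...), not INT8
    name VARCHAR(255),
    PRIMARY KEY (id)
);
-- Generated keys now fit in 4 bytes, so ResultSet#getInt succeeds.
```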
| Type | CockroachDB | PostgreSQL |
|---|---|---|
| `int` | | `int4`, never shows up in vtables |
| `int2` | `int` (thus wrong width), shows up as "int2" in certain vtables | |
| `int4` | `int` (thus wrong width), shows up as "int4" in certain vtables | |
| `int8` | `int`, shows up as "int8" in certain vtables | |
| `bigint` | `int`, shows up as "bigint" in certain vtables | `int8`, never shows up in vtables |
| `smallint` | `int` (thus wrong width), shows up as "smallint" in certain vtables | `int2`, never shows up in vtables |
| `serial` | `int` (thus wrong width) with `default unique_rowid()` | `int4`, create sequence and `default nextval(seqname)` |
| `bigserial` | like `serial` | `int8`, create sequence and `default nextval(seqname)` |
| `smallserial` | like `serial` (thus wrong width) | `int2`, create sequence and `default nextval(seqname)` |
| `bit` | `int`, shows up as `bit` in certain vtables, max 64-bit | |

Problems:
- `int`, `int2`, `int4`, `serial`, `smallserial`, `bit` have data widths that are fully incompatible with postgresql (see the sketch below)
- the `serial` types use sequences in postgresql; CockroachDB does not do this
- `bit` incorrectly has a maximum size (and its underlying implementation is currently incorrect)

Fixing this would comprehensively address #24062 #22607 #26730 #25098 #24686 and possibly others.
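To make the width problem concrete (an illustrative sketch of the divergence):

```sql
CREATE TABLE t (x int);
INSERT INTO t VALUES (5000000000);
-- PostgreSQL: ERROR: integer out of range   (int means int4)
-- CockroachDB today: succeeds               (int means int8)
```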
Informs #26128.
Jira issue: CRDB-4979