First, I think you are using quite an old Rust Driver version. The current version is 0.13, and 0.11, released in December, introduced new serialization traits that perform type checking on the client side. Currently, your first Rust example fails with:
```
Serializing values failed: SerializationError: Failed to serialize query arguments (alloc::vec::Vec<u8>, i32): failed to serialize column b: SerializationError: Failed to type check Rust type alloc::vec::Vec<u8> against CQL type Int: expected one of the CQL types: [Blob]
```
and the second (after slight modifications to make it work with the current version) fails with:
```
Serializing values failed: SerializationError: Failed to serialize query arguments (&testrow::main::{{closure}}::Typ2, i32): failed to serialize column x: SerializationError: Failed to type check Rust type testrow::main::{{closure}}::Typ2 against CQL type UserDefinedType { type_name: "typ", keyspace: "ks", field_types: [("a", Int)] }: the field b is missing in the Rust data but is required by the CQL UDT type
```
which, by the way, looks like a problem with the error message: it says that field `b` is missing in the Rust data, but it's actually missing in the CQL UDT - so it's the other way around from what the message says.
Now, to address the issue: to the best of my knowledge, what you described is how all the drivers work - metadata in prepared statements is constant, and the user needs to take care of updating this metadata by creating the prepared statement again. I see some reasons for that:
We could discuss doing it in a different way if you want, but I suspect it would require significant, possibly backwards-incompatible, changes in the driver.
Regarding your proposed fix: a prepared statement is a shared object that can be used concurrently in multiple queries. It's not hard to imagine a scenario where the driver sends a query to node A, updates the metadata, concurrently sends to node B which has an older schema, and updates the metadata back - not allowing you to send the statement using the new data type.
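A toy illustration of that interleaving (nothing driver-specific; just a shared object, two concurrent writers, and schema-version strings standing in for the real metadata):

```python
# Toy illustration of the hazard: two concurrent executions of the same
# (shared) prepared statement each write back the metadata they saw, and the
# one that talked to the node with the older schema can win, reverting the
# statement to the old metadata.
import threading

class SharedPreparedStatement:
    def __init__(self):
        self.metadata = "schema v1"   # what the driver uses for serialization

stmt = SharedPreparedStatement()

def execute_on_node(schema_seen_on_that_node):
    # pretend we executed the statement and refreshed the shared metadata
    stmt.metadata = schema_seen_on_that_node

a = threading.Thread(target=execute_on_node, args=("schema v2",))  # node A, already migrated
b = threading.Thread(target=execute_on_node, args=("schema v1",))  # node B, not yet migrated
a.start(); b.start(); a.join(); b.join()

print(stmt.metadata)  # may well be "schema v1" again: the v2 update was lost
```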
If you add a column, then you need to modify your queries to make use of it. If you change the type of a column, then you need to change the type of what you use in the queries.
That would mean disruption in availability.
Adding a column: obviously you don't have to change the application; you can keep omitting this column from INSERT or UPDATE until you are ready to deploy the new version of your app. But that takes time, and the schema change can be done ahead of it.
Changing a column type to a compatible one, in particular adding a new field to a user-defined type: same argument.
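For example, something along these lines keeps working across the schema change (a sketch using the Python driver; the address, keyspace, and table names are just for illustration):

```python
# Sketch: a prepared INSERT created before the schema change keeps working
# after the column is added; the new column simply stays null until the
# application is updated to set it.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()
session.execute("CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
                "{'class': 'SimpleStrategy', 'replication_factor': 1}")
session.execute("CREATE TABLE IF NOT EXISTS ks.t (pk int PRIMARY KEY, a int)")

insert = session.prepare("INSERT INTO ks.t (pk, a) VALUES (?, ?)")
session.execute(insert, [0, 0])

# schema change done ahead of time, before the new app version is deployed
session.execute("ALTER TABLE ks.t ADD b int")

# the old prepared statement can still be executed unchanged; b is simply left unset
session.execute(insert, [1, 1])
```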
A schema change is not an atomic event, IIUC, and I don't see how exactly the driver should deal with different nodes temporarily having different schema versions.
The driver has a pool of connections. A given shard has a single schema version at a given time. A given connection acts according to that version.
The following test written for Scylla's test.py framework fails:
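Roughly, the test has the following shape (a sketch: the fixture names, table name, and exact statements are reconstructions; the commented-out lines are the ones discussed below):

```python
# Sketch of the failing test (not the literal test body): prepare a statement,
# change the bound column's type, then execute with a value of the new type.
def test_reprepare_after_column_type_change(cql, test_keyspace):
    table = f"{test_keyspace}.t"
    cql.execute(f"CREATE TABLE {table} (pk int PRIMARY KEY, x int)")

    stmt = cql.prepare(f"UPDATE {table} SET x = ? WHERE pk = ?")
    cql.execute(stmt, [0, 0])

    # change x to a compatible type (int -> blob)
    cql.execute(f"ALTER TABLE {table} ALTER x TYPE blob")

    # stmt = cql.prepare(f"UPDATE {table} SET x = ? WHERE pk = ?")  # explicit re-prepare
    # cql.execute(stmt, [0, 0])  # second-to-last statement

    cql.execute(stmt, [b'', 0])  # last statement, sends [b'', 0]
```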
It fails with a serialization error on the client side, before the request is even sent.
In this case (I investigated), the Python driver does not attempt a statement reprepare at all. This is problem number 1.
If we uncomment the second-to-last statement (the one which binds `[0, 0]`), this triggers a reprepare (which I checked by adding some logs to the driver), because the driver sends the request and gets a "query not prepared" response from Scylla. This statement passes. However, the following statement (the one which binds `[b'', 0]`) still fails, even after the repreparation. This is problem number 2.
If we uncomment the explicit `prepare` (after the `alter`), then the last statement passes (the one which sends `[b'', 0]`), whether or not we uncomment the previous statement (the one which sends `[0, 0]`).

I found an easy way to solve problem number 2: adjust the reprepare code to update `column_metadata` inside the `PreparedStatement` object, following the logic used when the `PreparedStatement` is first created by `prepare` (`column_metadata` is set to the response's `bind_metadata`). This is in the `_execute_after_prepare` function in cassandra/cluster.py. (BTW, the names are weird, yes.)
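Roughly, the change amounts to this (a sketch, not the exact patch; the helper name is made up and the exact attributes depend on the driver version):

```python
# Sketch of the column_metadata fix: after a successful reprepare in
# _execute_after_prepare, refresh the metadata that serialization consults,
# mirroring what prepare() does when the PreparedStatement is first built.
def refresh_prepared_metadata(prepared_statement, reprepare_response):
    # reprepare_response is the PREPARED result returned by the reprepare;
    # its bind_metadata describes the bind parameters under the new schema.
    prepared_statement.column_metadata = reprepare_response.bind_metadata
```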
However, I don't know how to solve problem 1. In problem 1, the driver does not even attempt a reprepare, because it fails at the serialization stage, before even sending the request, due to the outdated `column_metadata` which it apparently uses to do the serialization.

The problems have different symptoms (although the root causes are the same) if, instead of changing the type of a column using `alter ks.t alter x type ...`, we introduce a user-defined type and then add a column to this type (a sketch of this variant of the test is given further below). There are no failures this time. However, the result is:
```
[Row(pk=1, x=typ(a=0, b=None)), Row(pk=0, x=typ(a=0, b=None)), Row(pk=2, x=typ(a=0, b=None))]
```
So `b` is `None` for `pk=1` and `pk=2`, even though we bound `Typ2(0, 0)` when updating those keys.

After my `column_metadata` fix, the result changes to:

```
[Row(pk=1, x=typ(a=0, b=None)), Row(pk=0, x=typ(a=0, b=None)), Row(pk=2, x=typ(a=0, b=0))]
```
So for `pk=1` the result is still `None`, even though this statement causes a repreparation; the driver should in theory resend the query with the updated `column_metadata`, so `b` should get updated for `pk=1` too. But for `pk=2` the result is correct -- apparently this time the updated `column_metadata` helped.

Uncommenting the explicit `prepare` also helps.

So is this a bug?
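For reference, the UDT variant of the test mentioned above has roughly this shape (again a sketch; `Typ1`/`Typ2`, fixtures, and names are reconstructions):

```python
# Sketch of the UDT variant: prepare a statement binding a UDT value,
# add a field to the UDT, then bind values that include the new field.
from collections import namedtuple

Typ1 = namedtuple("Typ1", ["a"])
Typ2 = namedtuple("Typ2", ["a", "b"])

def test_reprepare_after_udt_field_added(cql, test_keyspace):
    cql.execute(f"CREATE TYPE {test_keyspace}.typ (a int)")
    table = f"{test_keyspace}.t"
    cql.execute(f"CREATE TABLE {table} (pk int PRIMARY KEY, x {test_keyspace}.typ)")

    stmt = cql.prepare(f"UPDATE {table} SET x = ? WHERE pk = ?")
    cql.execute(stmt, [Typ1(0), 0])

    # add a field to the user-defined type
    cql.execute(f"ALTER TYPE {test_keyspace}.typ ADD b int")

    # stmt = cql.prepare(f"UPDATE {table} SET x = ? WHERE pk = ?")  # explicit prepare
    cql.execute(stmt, [Typ2(0, 0), 1])  # triggers a reprepare
    cql.execute(stmt, [Typ2(0, 0), 2])

    print(list(cql.execute(f"SELECT * FROM {table}")))
```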
Well, at least the Rust driver provides a much better user experience. The analogous Rust program (executed against a pre-existing single-node cluster) works out of the box and gives the correct result.
Looking at the code, apparently the Rust driver does not have a structure corresponding to the Python driver's `column_metadata`/`bind_metadata` to be used for serialization; instead, it seems to use the types coming with the bound tuple. And the first statement after the ALTER triggers a reprepare, but does not seem to update the `insert` object, which apparently is not necessary for this driver.

Similarly, the experience is better with UDTs: the UDT variant of the Rust program also works out of the box and gives the correct result.
cc @Lorak-mmk