hemidactylus closed this issue 4 months ago.
We probably also need to test for other similar cases, e.g. when the table name is also the same as a CQL data type, like below:
token@cqlsh:correctness> create table int(int int primary key);
token@cqlsh:correctness> desc tables;
int vector
token@cqlsh:correctness> desc table int;
CREATE TABLE correctness.int (
int int PRIMARY KEY
);
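Presumably the analogous case to check is a table literally named vector (one already appears in the desc tables output above). A minimal sketch, assuming the same correctness keyspace and the Cassandra vector type; the column definition and trimmed desc output here are illustrative only:
token@cqlsh:correctness> create table vector(id int primary key, vector vector<float, 3>);
token@cqlsh:correctness> desc table vector;
CREATE TABLE correctness.vector (
    id int PRIMARY KEY,
    vector vector<float, 3>
);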
Agreed @msmygit. My guess is that it's something in the CQL parser code; specifically, we're trying to pull out some kind of non-keyword token and failing miserably.
Any fix to this ticket should involve a test to validate that we've addressed the issue generally.
I have checked with INT and TEXT and ... they work just fine. So it seems it's something with vector specifically.
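For reference, the a1 and a2 tables used in the two runs below would presumably be something like this (an assumption based on the unloaded data; the actual definitions may differ):
create table a1(int int primary key);
create table a2(text text primary key);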
java -jar dsbulk-1.11.0.jar unload -k ... -t a1 -u "token" -p ... -b ...
Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
A cloud secure connect bundle was provided: ignoring all explicit contact points.
Operation directory: ...
int
12
11
10
total | failed | rows/s | p50ms | p99ms | p999ms
3 | 0 | 2 | 119.80 | 121.11 | 121.11
Operation UNLOAD_20230928-155343-102056 completed successfully in 1 second.
Checkpoints for the current operation were written to checkpoint.csv.
To resume the current operation, re-run it with the same settings, and add the following command line flag:
--dsbulk.log.checkpoint.file=...
java -jar dsbulk-1.11.0.jar unload -k ... -t a2 -u "token" -p ... -b ...
Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
A cloud secure connect bundle was provided: ignoring all explicit contact points.
Operation directory: ...
text
t3
t1
t2
total | failed | rows/s | p50ms | p99ms | p999ms
3 | 0 | 2 | 119.63 | 120.59 | 120.59
Operation UNLOAD_20230928-155431-632878 completed successfully in 1 second.
Checkpoints for the current operation were written to checkpoint.csv.
To resume the current operation, re-run it with the same settings, and add the following command line flag:
--dsbulk.log.checkpoint.file=...
In the meantime @phact has brilliantly found a workaround:
-query 'SELECT row_id, attributes_blob, body_blob, metadata_s, \"vector\" FROM poc_data.product_table'
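For context, that -query option takes the place of the -k/-t options in the unload invocation (the query already qualifies the table), roughly like this; this is only a sketch, with the connection flags and the poc_data.product_table names carried over from the commands and query above:
java -jar dsbulk-1.11.0.jar unload -u "token" -p ... -b ... \
  -query 'SELECT row_id, attributes_blob, body_blob, metadata_s, \"vector\" FROM poc_data.product_table'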
I got the same issue as above with the vector column named vector. Some of our libraries name the vector column as such, so that makes it important to address this issue.
Adding details for the insert workaround as well, as it's a bit tricky:
-query 'insert into <ks>.<table> (row_id, attributes_blob, body_blob, metadata_s, \"vector\") values (:row_id, :attributes_blob, :body_blob, :metadata_s, :\"vector\")'
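For loading, the same pattern would presumably slot in like this (a sketch; the -url path, the <ks>.<table> placeholder, and the credentials are illustrative, and each :row_id / :\"vector\" variable must match a header in the input file):
java -jar dsbulk-1.11.0.jar load -u "token" -p ... -b ... -url data.csv \
  -query 'insert into <ks>.<table> (row_id, attributes_blob, body_blob, metadata_s, \"vector\") values (:row_id, :attributes_blob, :body_blob, :metadata_s, :\"vector\")'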
On a schema like this, it happens that dsbulk unload gets confused by the vector column and the unload operation fails with an error (observed with dsbulk v1.11).
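Such a schema would presumably look something like the following (a hypothetical reconstruction; the column types and the vector dimension are guesses, only the column and table names appear in the workaround queries above):
CREATE TABLE poc_data.product_table (
    row_id text PRIMARY KEY,
    attributes_blob text,
    body_blob text,
    metadata_s map<text, text>,
    -- the problematic column: named vector, of type vector
    vector vector<float, 1536>
);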