datastax / dsbulk

DataStax Bulk Loader (DSBulk) is an open-source, Apache-licensed, unified tool for loading into and unloading from Apache Cassandra(R), DataStax Astra, and DataStax Enterprise (DSE).

dsbulk doesn't support toUnixTimestamp? #482

Open kokizzu opened 1 year ago

kokizzu commented 1 year ago

Trying dsbulk with this query:

SELECT COUNT(1), SUM(size) FROM dummy WHERE ck1='%s' AND pk1='%s' AND toUnixTimestamp(ts)<%d ALLOW FILTERING

got 1:116: no viable alternative at input 'toUnixTimestamp('. Caused by: NoViableAltException (no message).
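
For context, toUnixTimestamp() converts a timeuuid (or timestamp) value into milliseconds since the Unix epoch, and NoViableAltException is an ANTLR parse error, so the statement is being rejected while the query text is parsed. One way to narrow down whether the limitation is in DSBulk's query handling or in CQL itself is to run the equivalent statement directly in cqlsh. A sketch with hypothetical placeholder values substituted for %s and %d:

-- hypothetical values for ck1, pk1 and the cutoff; run in cqlsh, not DSBulk
SELECT COUNT(1), SUM(size)
FROM dummy
WHERE ck1 = 'a'
  AND pk1 = 'b'
  AND toUnixTimestamp(ts) < 1700000000000
ALLOW FILTERING;

If cqlsh reports the same parse error, the restriction would be in the CQL grammar itself (a function call on the left-hand side of a WHERE relation) rather than something DSBulk adds.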

msmygit commented 1 year ago

Please post your table schema here.

kokizzu commented 1 year ago

Something like this:

CREATE TABLE dummy (
    ck1 text,
    pk1 text,
    size bigint,
    ts timeuuid,
    PRIMARY KEY ((ck1), pk1)
) WITH CLUSTERING ORDER BY (pk1 DESC);
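
Since ts is a timeuuid, a possible workaround (untested here) is to avoid calling a function on the column inside the WHERE clause and instead compare the column against maxTimeuuid() of the cutoff instant, so the restriction stays a plain comparison. A sketch with hypothetical values:

-- hypothetical values; maxTimeuuid() yields the largest timeuuid for that instant
SELECT COUNT(1), SUM(size)
FROM dummy
WHERE ck1 = 'a'
  AND pk1 = 'b'
  AND ts < maxTimeuuid('2023-09-28 00:00:00+0000')
ALLOW FILTERING;
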
msmygit commented 1 year ago

I tested this out and it works (whether I use the lowercase function name or toUnixTimestamp; also tested when ck1 is part of the clustering key).

My Table Schema

token@cqlsh:correctness> DESC TABLE dummy ;

CREATE TABLE correctness.dummy (
    pk1 text PRIMARY KEY,
    ck1 text,
    size bigint,
    tuuid timeuuid
);

My table's data

token@cqlsh:correctness> SELECT * FROM dummy ;

 pk1 | ck1 | size | tuuid
-----+-----+------+--------------------------------------
   1 |   1 |    1 | a1b5f140-5e0f-11ee-b5a9-e5b440ad607a

(1 rows)
token@cqlsh:correctness> 

The DSBulk 1.11.0 unload query:

 % ./dsbulk unload -b /path/to/secure-connect-vector-correctness.zip -u REDACTED -p REDACTED -query "SELECT COUNT(1), SUM(size), tounixtimestamp(tuuid) FROM correctness.dummy WHERE ck1='1' AND pk1='1'  ALLOW FILTERING" -url /path/to/tounix
Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
A cloud secure connect bundle was provided: ignoring all explicit contact points.
Operation directory: /path/to/dsbulk-1.11.0/bin/logs/UNLOAD_20230928-165928-640482
total | failed | rows/s |  p50ms |  p99ms | p999ms
    1 |      0 |      2 | 120.32 | 120.59 | 120.59
Operation UNLOAD_20230928-165928-640482 completed successfully in less than one second.
Checkpoints for the current operation were written to checkpoint.csv.
To resume the current operation, re-run it with the same settings, and add the following command line flag:
--dsbulk.log.checkpoint.file=/path/to/dsbulk-1.11.0/bin/logs/UNLOAD_20230928-165928-640482/checkpoint.csv

Here's the content of the resulting CSV output:

% cat /path/to/tounix/output-000001.csv 
count,system.sum(size),system.tounixtimestamp(tuuid)
1,1,1695913172564
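
The last column is the timeuuid's timestamp expressed as milliseconds since the Unix epoch. As a possible cross-check (not part of the run above), toTimestamp() on the same column should resolve to the same instant:

-- cross-check sketch in cqlsh; both functions read the timestamp embedded in the timeuuid
SELECT toTimestamp(tuuid), tounixtimestamp(tuuid) FROM correctness.dummy WHERE pk1 = '1';
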
kokizzu commented 1 year ago

OK, so is this a bug then?