scylladb / python-driver

ScyllaDB Python Driver, originally DataStax Python Driver for Apache Cassandra
https://python-driver.docs.scylladb.com
Apache License 2.0

"SELECT * FROM ks1.table1" fails: Failed decoding result column "ck2" of type timestamp: days=2147458792; must have magnitude <= 999999999 #255

Closed vponomaryov closed 5 months ago

vponomaryov commented 1 year ago

Issue description

Running the rebuild_streaming_err nemesis in the gemini CI job using gemini-v1.8.6, where we walk through the test tables and check for data presence, we get the following error:

> Executing CQL '<SimpleStatement query="SELECT * FROM ks1.table1", consistency=Not Set>' ...
> Error decoding response from Cassandra. ver(3); flags(0000); stream(2); op(8); offset(9); len(145621); buffer: b'\x83 ... <huge piece of data> ... \x15'
> Traceback (most recent call last):
>   File "cassandra/obj_parser.pyx", line 72, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/deserializers.pxd", line 41, in cassandra.deserializers.from_binary
>   File "cassandra/deserializers.pyx", line 143, in cassandra.deserializers.DesDateType.deserialize
>   File "cassandra/cython_utils.pyx", line 62, in cassandra.cython_utils.datetime_from_timestamp
>   File ".eggs/Cython-0.29.32-py3.10-linux-x86_64.egg/Cython/Includes/cpython/datetime.pxd", line 125, in cpython.datetime.timedelta_new
> OverflowError: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/row_parser.pyx", line 38, in cassandra.row_parser.make_recv_results_rows.recv_results_rows
>   File "cassandra/obj_parser.pyx", line 31, in cassandra.obj_parser.ListParser.parse_rows
>   File "cassandra/obj_parser.pyx", line 74, in cassandra.obj_parser.TupleRowParser.unpack_row
> cassandra.DriverException: Failed decoding result column "ck2" of type timestamp: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/obj_parser.pyx", line 72, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/deserializers.pxd", line 41, in cassandra.deserializers.from_binary
>   File "cassandra/deserializers.pyx", line 143, in cassandra.deserializers.DesDateType.deserialize
>   File "cassandra/cython_utils.pyx", line 62, in cassandra.cython_utils.datetime_from_timestamp
>   File ".eggs/Cython-0.29.32-py3.10-linux-x86_64.egg/Cython/Includes/cpython/datetime.pxd", line 125, in cpython.datetime.timedelta_new
> OverflowError: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/connection.py", line 1266, in cassandra.connection.Connection.process_msg
>   File "cassandra/protocol.py", line 1196, in cassandra.protocol._ProtocolHandler.decode_message
>   File "cassandra/protocol.py", line 744, in cassandra.protocol.ResultMessage.recv_body
>   File "cassandra/protocol.py", line 730, in cassandra.protocol.ResultMessage.recv
>   File "cassandra/row_parser.pyx", line 46, in cassandra.row_parser.make_recv_results_rows.recv_results_rows
>   File "cassandra/obj_parser.pyx", line 57, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/obj_parser.pyx", line 74, in cassandra.obj_parser.TupleRowParser.unpack_row
> cassandra.DriverException: Failed decoding result column "ck2" of type timestamp: days=2147458792; must have magnitude <= 999999999
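The OverflowError at the bottom of the traceback can be reproduced without a cluster. A minimal sketch, assuming the driver's `datetime_from_timestamp` builds a `timedelta` from the raw millisecond value and adds it to the epoch (the real code is Cython in `cython_utils.pyx`; the function name below is mine):

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

def datetime_from_timestamp_ms(timestamp_ms: int) -> datetime.datetime:
    # Roughly what the driver's datetime_from_timestamp does: build a
    # timedelta from the raw value and add it to the epoch.
    return EPOCH + datetime.timedelta(seconds=timestamp_ms / 1000.0)

# A CQL timestamp is a signed 64-bit millisecond count, so extreme values
# exceed timedelta's hard limit of 999999999 days and raise OverflowError.
try:
    datetime_from_timestamp_ms(2**63 - 1)
except OverflowError as exc:
    print(exc)  # days=...; must have magnitude <= 999999999
```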


How frequently does it reproduce?

Happens in about 50% of cases using gemini-v1.8.6 in the gemini CI job, if no data mismatches get caught.

Installation details

Kernel Version: 5.15.0-1040-aws Scylla version (or git commit hash): 5.2.6-20230730.58acf071bf28 with build-id 17961be569f8503b27ff284a8de1e00a9d83811e

Cluster size: 3 nodes (i3.large)

Scylla Nodes used in this run:

OS / Image: ami-0f1f419420235978b (aws: us-east-1)

Test: vp-gemini-3h-with-nemesis-test Test id: 985fe5dc-f634-47d6-b517-814916df7a5e Test name: scylla-staging/valerii/vp-gemini-3h-with-nemesis-test Test config file(s):

Logs and commands - Restore Monitor Stack command: `$ hydra investigate show-monitor 985fe5dc-f634-47d6-b517-814916df7a5e` - Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=985fe5dc-f634-47d6-b517-814916df7a5e) - Show all stored logs command: `$ hydra investigate show-logs 985fe5dc-f634-47d6-b517-814916df7a5e` ## Logs: - **db-cluster-985fe5dc.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/db-cluster-985fe5dc.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/db-cluster-985fe5dc.tar.gz) - **sct-runner-events-985fe5dc.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/sct-runner-events-985fe5dc.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/sct-runner-events-985fe5dc.tar.gz) - **sct-985fe5dc.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/sct-985fe5dc.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/sct-985fe5dc.log.tar.gz) - **monitor-set-985fe5dc.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/monitor-set-985fe5dc.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/monitor-set-985fe5dc.tar.gz) - **loader-set-985fe5dc.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/loader-set-985fe5dc.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/loader-set-985fe5dc.tar.gz) - **parallel-timelines-report-985fe5dc.tar.gz** - 
[https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/parallel-timelines-report-985fe5dc.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/985fe5dc-f634-47d6-b517-814916df7a5e/20230804_014205/parallel-timelines-report-985fe5dc.tar.gz) [Jenkins job URL](https://jenkins.scylladb.com/job/scylla-staging/job/valerii/job/vp-gemini-3h-with-nemesis-test/15/) [Argus](https://argus.scylladb.com/test/def77ee5-c8b1-41b4-b912-17ce86aca11e/runs?additionalRuns[]=985fe5dc-f634-47d6-b517-814916df7a5e)
vponomaryov commented 1 year ago

Since this is a flaky failure, here is another test run with exactly the same settings that passed, without reproducing the bug:

Installation details

Kernel Version: 5.15.0-1040-aws Scylla version (or git commit hash): 5.2.6-20230730.58acf071bf28 with build-id 17961be569f8503b27ff284a8de1e00a9d83811e

Cluster size: 3 nodes (i3.large)

Scylla Nodes used in this run:

OS / Image: ami-0f1f419420235978b (aws: us-east-1)

Test: vp-gemini-3h-with-nemesis-test Test id: 04811d3b-6563-4cb7-9336-e454663457ce Test name: scylla-staging/valerii/vp-gemini-3h-with-nemesis-test Test config file(s):

Logs and commands - Restore Monitor Stack command: `$ hydra investigate show-monitor 04811d3b-6563-4cb7-9336-e454663457ce` - Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=04811d3b-6563-4cb7-9336-e454663457ce) - Show all stored logs command: `$ hydra investigate show-logs 04811d3b-6563-4cb7-9336-e454663457ce` ## Logs: - **db-cluster-04811d3b.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/db-cluster-04811d3b.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/db-cluster-04811d3b.tar.gz) - **sct-runner-events-04811d3b.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/sct-runner-events-04811d3b.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/sct-runner-events-04811d3b.tar.gz) - **sct-04811d3b.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/sct-04811d3b.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/sct-04811d3b.log.tar.gz) - **monitor-set-04811d3b.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/monitor-set-04811d3b.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/monitor-set-04811d3b.tar.gz) - **loader-set-04811d3b.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/loader-set-04811d3b.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/loader-set-04811d3b.tar.gz) - **parallel-timelines-report-04811d3b.tar.gz** - 
[https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/parallel-timelines-report-04811d3b.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/04811d3b-6563-4cb7-9336-e454663457ce/20230804_111719/parallel-timelines-report-04811d3b.tar.gz) [Jenkins job URL](https://jenkins.scylladb.com/job/scylla-staging/job/valerii/job/vp-gemini-3h-with-nemesis-test/22/) [Argus](https://argus.scylladb.com/test/def77ee5-c8b1-41b4-b912-17ce86aca11e/runs?additionalRuns[]=04811d3b-6563-4cb7-9336-e454663457ce)
vponomaryov commented 1 year ago

The master scylla branch is also affected:

> Executing CQL '<SimpleStatement query="SELECT * FROM ks1.table1", consistency=Not Set>' ...
> Error decoding response from Cassandra. ver(3); flags(0000); stream(2); op(8); offset(9); len(151051); buffer: b'\x83 ... <huge piece of data> ... \x04:J\x96R'
> Traceback (most recent call last):
>   File "cassandra/obj_parser.pyx", line 72, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/deserializers.pxd", line 41, in cassandra.deserializers.from_binary
>   File "cassandra/deserializers.pyx", line 143, in cassandra.deserializers.DesDateType.deserialize
>   File "cassandra/cython_utils.pyx", line 62, in cassandra.cython_utils.datetime_from_timestamp
>   File ".eggs/Cython-0.29.32-py3.10-linux-x86_64.egg/Cython/Includes/cpython/datetime.pxd", line 125, in cpython.datetime.timedelta_new
> OverflowError: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/row_parser.pyx", line 38, in cassandra.row_parser.make_recv_results_rows.recv_results_rows
>   File "cassandra/obj_parser.pyx", line 31, in cassandra.obj_parser.ListParser.parse_rows
>   File "cassandra/obj_parser.pyx", line 74, in cassandra.obj_parser.TupleRowParser.unpack_row
> cassandra.DriverException: Failed decoding result column "ck1" of type timestamp: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/obj_parser.pyx", line 72, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/deserializers.pxd", line 41, in cassandra.deserializers.from_binary
>   File "cassandra/deserializers.pyx", line 143, in cassandra.deserializers.DesDateType.deserialize
>   File "cassandra/cython_utils.pyx", line 62, in cassandra.cython_utils.datetime_from_timestamp
>   File ".eggs/Cython-0.29.32-py3.10-linux-x86_64.egg/Cython/Includes/cpython/datetime.pxd", line 125, in cpython.datetime.timedelta_new
> OverflowError: days=2147458792; must have magnitude <= 999999999
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cassandra/connection.py", line 1266, in cassandra.connection.Connection.process_msg
>   File "cassandra/protocol.py", line 1196, in cassandra.protocol._ProtocolHandler.decode_message
>   File "cassandra/protocol.py", line 744, in cassandra.protocol.ResultMessage.recv_body
>   File "cassandra/protocol.py", line 730, in cassandra.protocol.ResultMessage.recv
>   File "cassandra/row_parser.pyx", line 46, in cassandra.row_parser.make_recv_results_rows.recv_results_rows
>   File "cassandra/obj_parser.pyx", line 57, in cassandra.obj_parser.TupleRowParser.unpack_row
>   File "cassandra/obj_parser.pyx", line 74, in cassandra.obj_parser.TupleRowParser.unpack_row
> cassandra.DriverException: Failed decoding result column "ck1" of type timestamp: days=2147458792; must have magnitude <= 999999999

Installation details

Kernel Version: 5.15.0-1039-aws Scylla version (or git commit hash): 5.4.0~dev-20230702.1ab2bb69b8a6 with build-id 748a09783d18c97d903f62dcf4d96a6d9db4b527

Cluster size: 3 nodes (i3.large)

Scylla Nodes used in this run:

OS / Image: ami-027d7b085be72f13c (aws: us-east-1)

Test: vp-gemini-3h-with-nemesis-test Test id: bf42fd21-7f82-45e9-9424-ec362a75e473 Test name: scylla-staging/valerii/vp-gemini-3h-with-nemesis-test Test config file(s):

Logs and commands - Restore Monitor Stack command: `$ hydra investigate show-monitor bf42fd21-7f82-45e9-9424-ec362a75e473` - Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=bf42fd21-7f82-45e9-9424-ec362a75e473) - Show all stored logs command: `$ hydra investigate show-logs bf42fd21-7f82-45e9-9424-ec362a75e473` ## Logs: - **db-cluster-bf42fd21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/db-cluster-bf42fd21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/db-cluster-bf42fd21.tar.gz) - **sct-runner-events-bf42fd21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/sct-runner-events-bf42fd21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/sct-runner-events-bf42fd21.tar.gz) - **sct-bf42fd21.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/sct-bf42fd21.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/sct-bf42fd21.log.tar.gz) - **monitor-set-bf42fd21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/monitor-set-bf42fd21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/monitor-set-bf42fd21.tar.gz) - **loader-set-bf42fd21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/loader-set-bf42fd21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/loader-set-bf42fd21.tar.gz) - **parallel-timelines-report-bf42fd21.tar.gz** - 
[https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/parallel-timelines-report-bf42fd21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/bf42fd21-7f82-45e9-9424-ec362a75e473/20230804_120455/parallel-timelines-report-bf42fd21.tar.gz) [Jenkins job URL](https://jenkins.scylladb.com/job/scylla-staging/job/valerii/job/vp-gemini-3h-with-nemesis-test/25/) [Argus](https://argus.scylladb.com/test/def77ee5-c8b1-41b4-b912-17ce86aca11e/runs?additionalRuns[]=bf42fd21-7f82-45e9-9424-ec362a75e473)
vponomaryov commented 1 year ago

Looks very similar to the following closed bug: https://github.com/scylladb/scylladb/issues/12239

fruch commented 1 year ago

Looks like a duplicate of https://github.com/scylladb/scylladb/issues/14645

It was fixed in Gemini, but I'm not sure the fix is in version 1.8.6.

nyh commented 1 year ago

It seems this issue is about the "timestamp" type, while the other was about "time".

DoronArazii commented 1 year ago

@eliransin please assign it; I'm adding it as a blocker until we have some diagnosis from your team.

mykaul commented 1 year ago

@vponomaryov - can you try with gemini pre 1.8.4, to see if it happens? I'm quite sure it's a change in Gemini. It may or may not be a bug that we need to fix in Scylla side, but I'd like to ensure it's not a regression, first of all. Alternatively, test with an older release of Scylla, to prove it's not a recent regression.

vponomaryov commented 1 year ago

> @vponomaryov - can you try with gemini pre 1.8.4, to see if it happens? I'm quite sure it's a change in Gemini. It may or may not be a bug that we need to fix in Scylla side, but I'd like to ensure it's not a regression, first of all. Alternatively, test with an older release of Scylla, to prove it's not a recent regression.

Yes, this bug was caught only with gemini v1.8.6. Before that we were using gemini 1.7.8, which became unstable in a different way, probably due to new versions of other tools, including Docker.

Either Scylla or the Python driver has a bug: Scylla allows writing values that the driver then refuses to decode. If such values were invalid, gemini should have failed to write the data, but the writes succeeded.

Regarding trying a pre-1.8.4 gemini version: I had a successful run with 1.7.8 and lots of failures with a different kind of error. I also tried 1.8.2, which failed with yet another error, not this one.

To summarize:

DoronArazii commented 1 year ago

@cvybhu is this in your plans for this Q?

cvybhu commented 1 year ago

> @cvybhu is this in your plans for this Q?

No, I haven't seen this before.

To me this looks like a python driver bug.

The Rust driver allows inserting and selecting a timestamp equal to i64::MAX (2^63 - 1) or i64::MIN.

```rust
use scylla::{Session, SessionBuilder};

#[tokio::main]
async fn main() {
    let session: Session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        .build()
        .await
        .unwrap();

    session.query("CREATE KEYSPACE IF NOT EXISTS ks WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1}", &[]).await.unwrap();

    session
        .query(
            "CREATE TABLE IF NOT EXISTS ks.t (p timestamp PRIMARY KEY)",
            &[],
        )
        .await
        .unwrap();

    session
        .query("INSERT INTO ks.t (p) VALUES (?)", (i64::MAX,))
        .await
        .unwrap();

    session
        .query("INSERT INTO ks.t (p) VALUES (?)", (i64::MIN,))
        .await
        .unwrap();

    let ps: Vec<i64> = session
        .query("SELECT p FROM ks.t", &[])
        .await
        .unwrap()
        .rows_typed::<(i64,)>()
        .unwrap()
        .map(|parse_res| parse_res.unwrap())
        .map(|p: (i64,)| p.0)
        .collect();

    for p in ps {
        println!("p: {}", p)
    }
}
```

Output:

```
p: 9223372036854775807
p: -9223372036854775808
```

OTOH cqlsh (which uses the python driver underneath) reports an overflow:

```
[cqlsh 6.0.9 | Scylla 5.2.3-0.20230608.ea08d409f155 | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> SELECT p FROM ks.t;
/home/jancio/.local/lib/python3.11/site-packages/cqlsh/cqlsh.py:403: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch.

 p
----------------------
  9223372036854775807
 -9223372036854775808

(2 rows)
```

The CQL spec defines timestamp as an 8 byte signed integer, so IMO both of these values should be legal: https://github.com/apache/cassandra/blob/7750a9430bf8cec73c25e302642a3b0b77a78703/doc/native_protocol_v4.spec#L946-L950
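The decoding that spec describes can be sketched in a few lines of Python (this is an illustration of the wire format, not the driver's actual code):

```python
import struct

def decode_cql_timestamp_ms(buf: bytes) -> int:
    # Per the native protocol spec, a timestamp is an 8-byte signed
    # big-endian integer of milliseconds since the Unix epoch.
    return struct.unpack(">q", buf)[0]

# Any i64 value round-trips on the wire, including the extremes:
print(decode_cql_timestamp_ms(struct.pack(">q", 2**63 - 1)))  # 9223372036854775807
print(decode_cql_timestamp_ms(struct.pack(">q", -(2**63))))   # -9223372036854775808
```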

To me it doesn't look like a Scylla bug. Scylla correctly writes and reads timestamp values, it's the driver that complains about "out of range values", so the bug is in the driver.

/cc @avelanarius

fruch commented 1 year ago

> To me it doesn't look like a Scylla bug. Scylla correctly writes and reads timestamp values, it's the driver that complains about "out of range values", so the bug is in the driver.
>
> /cc @avelanarius

It's not the first time we have run into such a warning; since in some cases it surfaced a real Scylla issue (like using the wrong clock), I would argue we shouldn't remove the warning from cqlsh/python-driver.

In this case I do agree it's not a Scylla core fault; it was caused by a change in scylla-bench, and one of those changes was already fixed in scylla-bench.

If a user puts a timestamp 292471209 years in the future, I think they would like to know about it...

nyh commented 1 year ago

The same problem was reported in Cassandra eight years ago - see https://issues.apache.org/jira/browse/CASSANDRA-10625. They fixed cqlsh (see https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-10625) to print the warning that @cvybhu saw above instead of failing, but didn't fix the driver. Maybe there are additional cases (not just cqlsh...) that need to be fixed?

But this issue should probably be reopened in the driver repository, not here. I think Scylla's handling of the "timestamp" column type isn't the problem here.

fruch commented 1 year ago

> The same problem was reported in Cassandra eight years ago - see https://issues.apache.org/jira/browse/CASSANDRA-10625. They fixed cqlsh (see apache/cassandra@trunk...pauloricardomg:trunk-10625) to print the warning that @cvybhu saw above instead of failing, but didn't fix the driver. Maybe there are additional cases (not just cqlsh...) that need to be fixed?
>
> But this issue should probably be reopened in the driver repository, not here. I think Scylla's handling of the "timestamp" column type isn't the problem here.

I agree it should be moved to python-driver (and it might also be a problem in other drivers that convert to date objects). We would probably also need to implement the same workaround as in https://github.com/apache/cassandra/commit/70c8a53de4881a3ccbcf5df7a68f44a57b103f12, in SCT if not at the driver level, since that is where we see the issue.

nyh commented 8 months ago

This is a real bug in the Python driver that has been known for nine years (since https://issues.apache.org/jira/browse/CASSANDRA-10625); unfortunately, instead of fixing it, we ended up hacking cqlsh, and now gemini, to work around it. It just needs to be fixed.

Basically, if `d` is a `date` column and I do

```cql
INSERT INTO {table1} (p, d) VALUES ({p}, {2**30})
SELECT totimestamp(d) FROM {table1} WHERE p={p}
```

The timestamp returned by Scylla for totimestamp(d) is valid, but the Scylla driver doesn't agree to accept it from the server (!), saying:

cassandra.DriverException: Failed decoding result column "system.totimestamp(d)" of type timestamp: days=-1073741824; must have magnitude <= 999999999

Note that the date value 2**30 means 2^30 days before the epoch (since the epoch is taken to be day 2^31), and that is exactly the number -1073741824 that the driver rejects, even though it is a perfectly valid (albeit not very useful) timestamp.
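The arithmetic checks out; a quick worked example, assuming the `date` wire encoding with the epoch placed at day 2^31:

```python
# CQL "date" wire format: an unsigned 32-bit day number, with the Unix
# epoch placed at the center of the range, day 2**31.
DATE_EPOCH = 2**31

raw_date = 2**30                          # the value inserted above
days_from_epoch = raw_date - DATE_EPOCH   # signed day offset from the epoch
print(days_from_epoch)                    # -1073741824, the value in the error
```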

nyh commented 8 months ago

Unfortunately, a timestamp of more than 999999999 days isn't the only problem; even lower-magnitude timestamps cause failures. In the above example I tried setting d to 2**31 - 2**29, i.e., "just" 2**29 days before the epoch: -536870912 days. That magnitude is below 999999999, so we don't reach the previous error, but the driver still fails:

cassandra/cluster.py:2726: in cassandra.cluster.Session.execute
    ???
cassandra/cluster.py:5085: in cassandra.cluster.ResponseFuture.result
    ???
cassandra/connection.py:1267: in cassandra.connection.Connection.process_msg
    ???
cassandra/protocol.py:1212: in cassandra.protocol._ProtocolHandler.decode_message
    ???
cassandra/protocol.py:760: in cassandra.protocol.ResultMessage.recv_body
    ???
cassandra/protocol.py:746: in cassandra.protocol.ResultMessage.recv
    ???
cassandra/row_parser.pyx:46: in cassandra.row_parser.make_recv_results_rows.recv_results_rows
    ???
cassandra/obj_parser.pyx:57: in cassandra.obj_parser.TupleRowParser.unpack_row
    ???
>   ???
E   cassandra.DriverException: Failed decoding result column "system.totimestamp(d)" of type timestamp: date value out of range

from:


>   ???

cassandra/obj_parser.pyx:72: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???

cassandra/deserializers.pxd:41: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???

cassandra/deserializers.pyx:143: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   OverflowError: date value out of range

cassandra/cython_utils.pyx:62: OverflowError
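This second failure mode is distinct from the first: here the `timedelta` itself is fine, but adding it to the epoch lands outside `datetime`'s representable year range (`datetime.MINYEAR`..`datetime.MAXYEAR`). A minimal sketch of that, under the same epoch-plus-timedelta assumption as before:

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

# -536870912 days is comfortably within timedelta's 999999999-day limit...
delta = datetime.timedelta(days=-536870912)

# ...but adding it to the epoch lands roughly 1.47 million years before
# year 1, outside datetime's range, so the addition raises OverflowError.
try:
    EPOCH + delta
except OverflowError as exc:
    print(exc)  # date value out of range
```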

sylwiaszunejko commented 7 months ago

I believe these errors happen because we overflow Python's datetime.[MIN|MAX]. For datetime.date we already have a wrapper that addresses this: https://github.com/scylladb/python-driver/commit/4f3c77c970e10774243e7ad7baed2d41c06c4790. Now we need something similar for datetime.datetime. I am actively working on it.
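For illustration, a hypothetical wrapper in the same spirit (the class name and API below are made up for this sketch, not the driver's actual code): keep the raw 64-bit millisecond value from the wire, and only convert to `datetime.datetime` on demand, so decoding a row never fails:

```python
import datetime

class RawTimestamp:
    """Sketch of a lossless timestamp holder: decoding always succeeds;
    conversion to datetime.datetime is deferred and may overflow."""

    EPOCH = datetime.datetime(1970, 1, 1)

    def __init__(self, milliseconds: int):
        self.milliseconds = milliseconds  # raw i64 from the wire

    def datetime(self) -> datetime.datetime:
        # Raises OverflowError only if the caller asks for a datetime
        # that Python cannot represent.
        return self.EPOCH + datetime.timedelta(milliseconds=self.milliseconds)

ts = RawTimestamp(2**63 - 1)
print(ts.milliseconds)              # 9223372036854775807, always accessible
print(RawTimestamp(1000).datetime())  # 1970-01-01 00:00:01
```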

Lorak-mmk commented 6 months ago

The fix proposed in https://github.com/scylladb/python-driver/pull/310 is a breaking change that may break user code, which we should not do. We could do a major-version bump, but that would make syncing with upstream even more problematic. If we were to do a breaking release at some point, I think we should include more changes there, not only this one. The PR was proposed to upstream; we'll see what they have to say about it. If they merge it, we'll receive it at some point when merging upstream commits.

Given that the issue is not a big problem, is unlikely to affect any client (it is just an unlikely edge case), and fixing it creates other problems, I think we can let upstream deal with it and close the PR and this issue (as wontfix), unless someone has objections.

sylwiaszunejko commented 5 months ago

We are not merging the fix in #310, to avoid breaking backwards compatibility; but if someone needs datetimes outside datetime.[MIN|MAX]YEAR, feel free to cherry-pick it.