AlexPikalov / cdrs

Cassandra DB native client written in Rust language. Find 1.x versions on https://github.com/AlexPikalov/cdrs/tree/v.1.x Looking for an async version? - Check WIP https://github.com/AlexPikalov/cdrs-async
Apache License 2.0
342 stars 58 forks source link

Epoch timestamps are being modified by the driver before being stored in Cassandra DB #236

Closed MaenMax closed 5 years ago

MaenMax commented 5 years ago

For example:

suppose that there is a table in Cassandra called tableA and it has two columns: ID column and timestamp column.

id is INT timestamp is DECIMAL

When executing those lines of code:

let update_struct_cql: String = format!("update tableA SET  timestamp = ? where id = ?;");

ddb.query_with_values(update_struct_cql,query_values!(current_time, id)).expect("[Err]: Could not update table");  

When I print out timestamp BEFORE it is being used in above query, it will print something normal, like 1545870134, however, when I check column value in Cassandra DB, I see something like:

6.1805539111220E-825570344

The question is: What happened to the timestamp ?

AlexPikalov commented 5 years ago

Hi @MaenMax ,

First of all thank you for your question. If your timestamp is decimal the driver converts it into decimal data type defined in Cassandra Native Protocol spec https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec#L863

The decimal format represents an arbitrary-precision number. It contains an [int] "scale" component followed by a varint encoding (see section 6.17) of the unscaled value. The encoded value represents "unscaledE-scale". In other words, "unscaled 10 ^ (-1 scale)".

The string format is more or less in agreement with the spec.

PS Feel free to create an issue if you have any problems with actual values and how it's been converted.

MaenMax commented 5 years ago

Thanks Alex for your reply.

I want to save it as it is in DB. Do you know how to save it so that it appear the same in Cassandra without the conversion? I don't want to change the data type of Cassandra column, I would like to keep it as decimal, but how to save something like "1545870134" into a decimal column type without being converted as above ?

AlexPikalov commented 5 years ago

I can try to modify an algorithm of encoding so that for integer numbers (numbers without decimal point) the value will be without E-XXX, but just 1545870134.

I'll try to prepare the PR for this change as soon as I can. Will keep you updated.

MaenMax commented 5 years ago

Thanks!

AlexPikalov commented 5 years ago

@MaenMax Thanks again for creating this issue. After deep debugging I've figured out that decimals encoding / decoding was mis implemented. Now it's fixed in #237

MaenMax commented 5 years ago

Thanks Alex, I will fetch your changes into my code. I guess I will need to modify the version in Cargo.toml file.

AlexPikalov commented 5 years ago

Yes. You'll need to modify a version of CDRS to have beta.5. As well please refer to this test in order to figure out how to let the driver know that your number should be encoded as a decimal https://github.com/AlexPikalov/cdrs/blob/master/tests/native_types.rs#L191

let values = query_values!(my_float, my_double, Decimal::from(my_decimal));
MaenMax commented 5 years ago

Okay

MaenMax commented 5 years ago

Hi Alex, I just tested with Beta 5 version, and I am still seeing the same result. Please refer to the attachment. last_connect && connected at columns are filled with wrong timestamp format. print-screen

AlexPikalov commented 5 years ago

Was this value inserted with beta.5 version of a driver?

AlexPikalov commented 5 years ago
screen shot 2019-01-02 at 22 19 48

This is what I have after running e2e tests with decimal values.

Also please check CDRS's version in your lock-file to make sure it was actually updated.

MaenMax commented 5 years ago

Hi Alex,

Yes, it is with beta 5 version, and lock file is updated. I saw in your example above that you have not tried to save a timestamp value into "my_decial" column.

My problem is not with smal values like 120.01, the problem is that I would like to save an epoch timestamp value into "my_decimal" column. Epoch timestamp is a 10 digit number, so a timestamp like 1545870134 is recognized by Rust as one billion, five hundred forty-five million, eight hundred seventy thousand, one hundred thirty-four.

When I try to save smaller numbers just like the 120.01 in your example, there will be no any issue. It is just when I try to save big numbers (like in billions) and so on.

AlexPikalov commented 5 years ago

I see. I guess I was able to reproduce the issue. I'm re-opening it

MaenMax commented 5 years ago

Hi Alex, Any updates on this issue ?

Thanks

AlexPikalov commented 5 years ago

I've some solution, but need to make some checks before merging it.

MaenMax commented 5 years ago

Hi Alex, Thanks for your changes. I am testing again.

AlexPikalov commented 5 years ago

I've just found that the proposed approach doesn't work quite well with float numbers. I'll try to leverage https://github.com/paupino/rust-decimal. It's seems to be very close to what we're looking for

MaenMax commented 5 years ago

Thanks Alex for looking deeply into this. It is really important as I am developing a backend system which will support over 50 Million users. I can not change the format of table columns of production environment to something other than Decimal (like INT). If it is really challenging to save timestamps in a column of Decimal type, kindly let me know so that I plan ahead.

AlexPikalov commented 5 years ago

With https://github.com/paupino/rust-decimal it's doable, I just need some time to make it a good citizen of CDRS types system.

Will keep you updated in this thread

MaenMax commented 5 years ago

Sure. Thanks for the update.

AlexPikalov commented 5 years ago

Hi @MaenMax ,

I've provided an implementation that works fine with both integers and float numbers of different ranges. Please check it from your end.

Thanks

MaenMax commented 5 years ago

Hi Alex,

Thanks for your update. What would be the version to use ? May you kindly provide the Cargo.toml dependency which will reflect the change.

AlexPikalov commented 5 years ago

For testing purposes just to check that proposed solution works for your case you can use a reference to the branch:

cdrs = { git = "https://github.com/AlexPikalov/cdrs" branch = "fix/decimal-for-big-numbers" }

After these check a new version will be published and you'll be able to put a permanent version (I'll provide you with it after publishing)

MaenMax commented 5 years ago

I updated Cargo file, and ran: cargo build, but I got this strange error:

' Updating git repository https://github.com/AlexPikalov/cdrs Updating crates.io index
error: failed to select a version for byteorder. ... required by package cdrs v2.0.0-beta.5 (https://github.com/AlexPikalov/cdrs?branch=fix/decimal-for-big-numbers#daeadf2f) ... which is depended on by autopush v1.52.0 (/home/maen/work/dev/rustcep) versions that meet the requirements = 1.2.7 are: 1.2.7

all possible versions conflict with previously selected packages.

previously selected package byteorder v1.2.6 ... which is depended on by base64 v0.9.3 ... which is depended on by autopush v1.52.0 (/home/maen/work/dev/rustcep)

AlexPikalov commented 5 years ago

Very strange. Please try to remove cargo.lock and re-install deps. If it didn't help I'll revert byteorder version

MaenMax commented 5 years ago

I did. Looks like to is compiling. Will update you on test results shortly.

MaenMax commented 5 years ago

Still the same behaviour screen_shot

AlexPikalov commented 5 years ago

@MaenMax Could you please provide me with exact numbers that lead to those values?

As well please make sure the values are wrapped into Decimal before making a query:

let values = query_values!(
// either
        Decimal::new(12001, 2), // float like number with decimal point 120.01
// or
        Decimal::from(my_decimal_b) // integer number
    );

session
        .query_with_values(query, values)
        .expect("insert numbers");

For me the solution works with float numbers as well as with big numbers up to std::i64::MAX

MaenMax commented 5 years ago

I tried but got an error:

I just wanted to use Decimal conversion only on "connected_at" columns

let connected_at: u64 = 1546998816

ddb.query_with_values(update_struct_cql,query_values!(current_month,Decimal::new(connected_at),last_connect,node_id, router_type, user_id)).expect("[Err]: Could not register user");

Error details:

[rustc] the trait bound cdrs::types::value::Bytes: std::convert::From<rust_decimal::Decimal> is not satisfied

AlexPikalov commented 5 years ago

@MaenMax

For values of type u64 the simplest way is to convert a value into i64 and then pass it into Decimal::from:

let connected_at: u64 = 1546998816;

ddb.query_with_values(update_struct_cql,query_values!(current_month,Decimal::from(connected_at as i64),last_connect,node_id, router_type, user_id)).expect("[Err]: Could not register user");

Please take a look on my_decimal_b column

screen shot 2019-01-09 at 08 18 38

it has exactly the same number as the one from your example

MaenMax commented 5 years ago

Did it with the same error:

trait std::convert::From<rust_decimal::Decimal> is not implemented for cdrs::types::value::Bytes

MaenMax commented 5 years ago

screenshot from 2019-01-09 14-21-11

AlexPikalov commented 5 years ago

Hi @MaenMax

I've just updated cdrs_helpers_derive crate in order to add a support for Decimal for derivable traits.

As well I've created a standalone example that shows a usage of CDRS for storing decimals https://github.com/AlexPikalov/cdrs_decimal_example. Please pay attention to dependencies as they include cdrs_helpers_derive = "0.3.0"

Here is what I have after running my example:

screen shot 2019-01-10 at 00 18 12 screen shot 2019-01-10 at 00 18 30

Please let me know it helps so I'll go ahead and merge current solution and then publish a new version of CDRS.

PS In order to obtain f64 from Decimal please use decimal_value.as_plain()

MaenMax commented 5 years ago

Hi Alex, It is so strange that your Cargo project cdrs_decimal_example did not compile on my machine. I just cloned the repo, and ran cargo build.

What Rust version do you use?

error[E0433]: failed to resolve. Use of undeclared type or module Decimal
--> src/main.rs:30:39
|
30 | #[derive(Clone, Debug, IntoCDRSValue, TryFromRow, PartialEq)]
| ^^^^^^^^^^ Use of undeclared type or module Decimal

error[E0433]: failed to resolve. Use of undeclared type or module Decimal
--> src/main.rs:61:18
|
61 | decimal: Decimal::from(1546998816i64),
| ^^^^^^^ Use of undeclared type or module Decimal

error[E0412]: cannot find type Decimal in this scope
--> src/main.rs:33:14
33 decimal: Decimal, ^^^^^^^ not found in this scope
help: possible candidate is found in another module, you can import it into scope
6 use cdrs::types::decimal::Decimal;

error: aborting due to 3 previous errors

Some errors occurred: E0412, E0433.
For more information about an error, try rustc --explain E0412.
error: Could not compile decimal_cdrs.

To learn more, run the command again with --verbose.

MaenMax commented 5 years ago

I use rustc 1.30.0

AlexPikalov commented 5 years ago

I didn't update lock file in the example after some changes on cdrs side. My bad. Could you please git pull the example and re-build? Now it should work.

MaenMax commented 5 years ago

Yaaaaaaaaaaaaaaay !

latest

AlexPikalov commented 5 years ago

Congrats! I'm glad I was able to help you.

Also thank you for helping me identifying the problem with encoding/decoding decimals.

Now, I'm going to merge and publish it so you'll be able to put cdrs dependency in a regular way by just prividing its version

MaenMax commented 5 years ago

Hi Alex, Thank you so much for your support on this. Kindly let me know when you publish. I believe it is going to be in beta 6, if I am not mistaking.

Maen

AlexPikalov commented 5 years ago

Hi @MaenMax,

It's my pleasure to help.

2.0.0-beta.6 has been just published so you can use it in your project.