kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License

Bug: Decimal edge case failure #3949

Open acquamarin opened 1 month ago

acquamarin commented 1 month ago

Kùzu version

master

What operating system are you using?

macOS M1

What happened?

bug1.

kuzu> create node table person (id int64, a decimal (38,0), primary key(id));
kuzu> create (p:person {id: 2, a: 100000000000000000000000000000000000000.000000000000000000000000000000});
(0 tuples)
(0 columns)
Time: 0.21ms (compiling), 0.62ms (executing)
kuzu> match (p:person) return p.*;
┌───────┬────────────────────────────────────────┐
│ p.id  │ p.a                                    │
│ INT64 │ DECIMAL(38, 0)                         │
├───────┼────────────────────────────────────────┤
│ 5     │ 99999999999999999999999999999999999999 │
│ 2     │ 99999999999999997748809823456034029568 │
└───────┴────────────────────────────────────────┘

This should throw an exception, since 100000000000000000000000000000000000000.000000000000000000000000000000 is out of range for DECIMAL(38, 0).

bug2.

kuzu> create (p:person {id: 22, a: -99999999999999999999999999999999999999.999999999999999999999999999999});
(0 tuples)
(0 columns)
Time: 0.34ms (compiling), 0.60ms (executing)
kuzu> match (p:person) return p.*;
┌───────┬─────────────────────────────────────────┐
│ p.id  │ p.a                                     │
│ INT64 │ DECIMAL(38, 0)                          │
├───────┼─────────────────────────────────────────┤
│ 5     │ 99999999999999999999999999999999999999  │
│ 2     │ 99999999999999997748809823456034029568  │
│ 22    │ -99999999999999997748809823456034029568 │
└───────┴─────────────────────────────────────────┘

This should throw an exception as well.

bug3. Incorrect result when adding a decimal to a double.

kuzu:

kuzu> return  cast(1 as decimal (32, 1)) + 9007199254740992.0000;
┌──────────────────────────────────────────────────────────────────┐
│ +(CAST(CAST(1, DECIMAL(32, 1)), DOUBLE),9007199254740992.000000) │
│ DOUBLE                                                           │
├──────────────────────────────────────────────────────────────────┤
│ 9007199254740992.000000                                          │
└──────────────────────────────────────────────────────────────────┘
(1 tuple)
(1 column)
Time: 0.07ms (compiling), 0.12ms (executing)

duckdb:

D select cast(1 as decimal (32, 1)) + 9007199254740992.0000;
┌────────────────────────────────────────────────────┐
│ (CAST(1 AS DECIMAL(32,1)) + 9007199254740992.0000) │
│                   decimal(36,4)                    │
├────────────────────────────────────────────────────┤
│                              9007199254740993.0000 │
└────────────────────────────────────────────────────┘

Are there known steps to reproduce?

No response

mxwli commented 1 month ago

The crux of the problem here is that literals are still interpreted as DOUBLE, whereas in DuckDB, they're interpreted as DECIMAL.

In bugs 1 and 2, the literal is first stored as a DOUBLE, so floating-point imprecision is already baked in by the time the value is cast to DECIMAL.
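The rounding in bugs 1 and 2 can be reproduced outside Kùzu. The following is a minimal Python sketch, assuming the literal is parsed into an IEEE 754 double before any range check runs. It only illustrates the effect and is not Kùzu's actual code path. The literals are built compactly rather than typed out.

```python
from decimal import Decimal

# The literals from bug 1 and bug 2, built compactly:
# 1 followed by 38 zeros, and minus 38 nines, each with 30 fractional digits.
bug1 = "1" + "0" * 38 + "." + "0" * 30
bug2 = "-" + "9" * 38 + "." + "9" * 30

# Parsing as a double rounds to the nearest representable value.
as_double_1 = float(bug1)
as_double_2 = float(bug2)

# Converting the double to an integer exposes the value actually stored.
print(int(as_double_1))  # 99999999999999997748809823456034029568
print(int(as_double_2))  # -99999999999999997748809823456034029568

# The rounded double fits in 38 digits, so a range check performed after
# the conversion cannot see that the original literal was out of range.
max_decimal_38_0 = 10**38 - 1
print(abs(int(as_double_1)) <= max_decimal_38_0)  # True
print(Decimal(bug1) > max_decimal_38_0)           # True
```

Both printed integers match the values stored in the table above, which is consistent with the DOUBLE explanation.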

In bug 3, the only reason DuckDB gives the correct result is that its arithmetic is DECIMAL arithmetic.

select cast(1 as decimal (32, 1)) + 9007199254740992.0000::double;

With the literal cast to DOUBLE, DuckDB gives the same result we give.
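The difference between the two kinds of arithmetic in bug 3 can be shown with a short Python sketch. It assumes IEEE 754 doubles on one side and exact decimal arithmetic (Python's `decimal` module, standing in for a DECIMAL type) on the other. It is an illustration, not either engine's implementation.

```python
from decimal import Decimal

big = "9007199254740992.0000"  # 2**53, the literal from bug 3

# Double arithmetic: 2**53 + 1 is not representable, so the sum rounds
# back to 2**53 and the added 1 is lost.
double_sum = float("1.0") + float(big)
print(double_sum == float(big))  # True

# Decimal arithmetic: the sum is exact.
decimal_sum = Decimal("1.0") + Decimal(big)
print(decimal_sum)  # 9007199254740993.0000
```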

acquamarin commented 1 month ago

> The crux of the problem here is that literals are still interpreted as DOUBLE, whereas in DuckDB, they're interpreted as DECIMAL.

I think we should be smart about how we interpret those numbers.

mxwli commented 1 month ago

Changing the default interpretation to decimal breaks a lot of things. I'm not sure at which point we should interpret the number as a decimal in this case. Perhaps when inserting into something with a known schema, we should interpret literals as strings and then apply casting.
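A rough sketch of the string-then-cast idea, in Python rather than Kùzu's C++. The function name `cast_literal_to_decimal` is hypothetical, and rounding half up to the target scale is an assumption. None of this reflects Kùzu's actual binder or cast code.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

def cast_literal_to_decimal(text: str, precision: int, scale: int) -> Decimal:
    """Hypothetical helper: cast literal text to DECIMAL(precision, scale)."""
    # Constructing a Decimal from a string is exact; no double is involved.
    exact = Decimal(text)
    # Smallest magnitude that no longer fits: 10**(precision - scale).
    limit = Decimal(10 ** (precision - scale))
    if exact.copy_abs() >= limit:
        raise OverflowError(f"{text} is out of range for DECIMAL({precision}, {scale})")
    # Round to the target scale (assumed rounding mode: half up).
    unit = Decimal((0, (1,), -scale))
    ctx = Context(prec=precision + 1, rounding=ROUND_HALF_UP)
    rounded = exact.quantize(unit, context=ctx)
    # Rounding can carry into an extra digit, so check the range again.
    if rounded.copy_abs() >= limit:
        raise OverflowError(f"{text} is out of range for DECIMAL({precision}, {scale})")
    return rounded

# The literals from bug 1 and bug 2, built compactly.
bug1 = "1" + "0" * 38 + "." + "0" * 30
bug2 = "-" + "9" * 38 + "." + "9" * 30

for literal in (bug1, bug2):
    try:
        cast_literal_to_decimal(literal, 38, 0)
    except OverflowError as err:
        print("rejected:", err)

print(cast_literal_to_decimal("9" * 38, 38, 0))  # largest DECIMAL(38, 0) value
```

Under these assumptions both literals from bugs 1 and 2 are rejected, while the largest in-range value is stored exactly. The cost is parsing the literal text at cast time instead of reusing an already parsed number.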

Edit: Xiyang told me this solution is really slow. We may have to delay this fix until after the release.