tensorbase / tensorbase

TensorBase is a new big data warehousing with modern efforts.
https://tensorbase.io/
Apache License 2.0

`select 'abc'` causes the server to get stuck #230

Open frank-king opened 3 years ago

frank-king commented 3 years ago

In the current version, a query like `select 'abc'` returns no result:

TensorBase :) select 'abc'

SELECT 'abc'

Ok.

0 rows in set. Elapsed: 0.008 sec. 

If I comment out these three lines in `engine::datafusions` (https://github.com/tensorbase/tensorbase/blob/14e4802b9c5e9e6e9543f80380213da1fb1c56cf/crates/engine/src/datafusions.rs#L177-L180), the server gets stuck on this query:

TensorBase :) select 'abc'

SELECT 'abc'

^C

However, integer literals work:

TensorBase :) select 1

SELECT 1

┌─Int64(1)─┐
│        1 │
└──────────┘

1 rows in set. Elapsed: 0.005 sec. 

jinmingjian commented 3 years ago

@whjpji Thanks, this seems to be a string literal handling problem. Let's fix it!

frank-king commented 3 years ago

This is caused by https://github.com/tensorbase/tensorbase/commit/fee911c0b97ea1f75b1ca1bc50e58eee93c911b1#diff-7e3045dfedcbf998dd7fe82a9c43989c11678cd970c88b571f82a92284aaac16 (see below). It works if I revert the patch.

@jinmingjian Could you please tell me why this change was undone after it was introduced in https://github.com/tensorbase/tensorbase/commit/8657b531c280d956ff6aa3b037c977848e4efb58#diff-d39e44a629780cf9d813c70da62a65754b9e39c69cba1cb454c6a84a6849faa4?

If it was not a mistake during refactoring, I guess it breaks the original logic for handling strings in A-DF. Should we discuss how to handle strings, given that their encoding differs between CH and A-DF?

diff --git a/arrow-datafusion/datafusion/src/logical_plan/expr.rs b/crates/datafusion/src/logical_plan/expr.rs
rename from arrow-datafusion/datafusion/src/logical_plan/expr.rs
rename to crates/datafusion/src/logical_plan/expr.rs
--- a/arrow-datafusion/datafusion/src/logical_plan/expr.rs  (revision ca023f93d64d13ba36df89301faaa4c79fffeec6)
+++ b/crates/datafusion/src/logical_plan/expr.rs    (revision fee911c0b97ea1f75b1ca1bc50e58eee93c911b1)
@@ -1061,23 +1061,13 @@

 impl Literal for &str {
     fn lit(&self) -> Expr {
-        //FIXME debug_assert!(self.len()<128);
-        let mut s = String::new();
-        debug_assert!(self.len()<128);
-        s.push(self.len() as u8 as char);
-        s.push_str(self);
-        Expr::Literal(ScalarValue::LargeUtf8(Some(s)))
+        Expr::Literal(ScalarValue::LargeUtf8(Some((*self).to_owned())))
     }
 }

 impl Literal for String {
     fn lit(&self) -> Expr {
-        //FIXME debug_assert!(self.len()<128);
-        let mut s = String::new();
-        debug_assert!(self.len()<128);
-        s.push(self.len() as u8 as char);
-        s.push_str(self);
-        Expr::Literal(ScalarValue::LargeUtf8(Some(s)))
+        Expr::Literal(ScalarValue::LargeUtf8(Some((*self).to_owned())))
     }
 }
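
The diff above swaps one string-literal encoding for another. As a minimal standalone sketch of the difference, the code below uses plain `String` values instead of DataFusion's `ScalarValue`. The function names `length_prefixed` and `plain` are hypothetical and do not appear in the codebase. It only illustrates the byte layout each side of the diff produces, not how the rest of the engine consumes it.

```rust
/// Mirrors the removed lines: a one-byte length prefix followed by the text.
/// Hypothetical helper for illustration only.
fn length_prefixed(s: &str) -> String {
    // A length below 128 is encoded as a single UTF-8 byte, which is
    // why the removed code asserted `self.len() < 128`.
    assert!(s.len() < 128);
    let mut out = String::with_capacity(s.len() + 1);
    out.push(s.len() as u8 as char);
    out.push_str(s);
    out
}

/// Mirrors the added lines: the text is copied unchanged.
/// Hypothetical helper for illustration only.
fn plain(s: &str) -> String {
    s.to_owned()
}

fn main() {
    // The prefixed form carries one extra leading byte holding the length.
    println!("{:?}", length_prefixed("abc").as_bytes()); // [3, 97, 98, 99]
    println!("{:?}", plain("abc").as_bytes()); // [97, 98, 99]
}
```

If downstream code still expects the leading length byte, a plain literal would be misread, which would be consistent with the behaviour reported here. That link is an assumption, not something confirmed in this thread.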

jinmingjian commented 3 years ago

@whjpji Sorry, this addition was an initial hack. I removed it later because we should not modify the string in this place.