Open frank-king opened 3 years ago
@whjpji thanks, this seems to be a string literal handling problem. Let's fix it!
This is caused by https://github.com/tensorbase/tensorbase/commit/fee911c0b97ea1f75b1ca1bc50e58eee93c911b1#diff-7e3045dfedcbf998dd7fe82a9c43989c11678cd970c88b571f82a92284aaac16 (see below). It works if I revert the patch.
@jinmingjian Could you please tell me why this change was undone after being introduced by https://github.com/tensorbase/tensorbase/commit/8657b531c280d956ff6aa3b037c977848e4efb58#diff-d39e44a629780cf9d813c70da62a65754b9e39c69cba1cb454c6a84a6849faa4?
If it was not a mistake during refactoring, I guess it breaks the original logic for dealing with strings in A-DF. Should we discuss how to handle strings whose encoding differs between CH and A-DF?
diff --git a/arrow-datafusion/datafusion/src/logical_plan/expr.rs b/crates/datafusion/src/logical_plan/expr.rs
rename from arrow-datafusion/datafusion/src/logical_plan/expr.rs
rename to crates/datafusion/src/logical_plan/expr.rs
--- a/arrow-datafusion/datafusion/src/logical_plan/expr.rs (revision ca023f93d64d13ba36df89301faaa4c79fffeec6)
+++ b/crates/datafusion/src/logical_plan/expr.rs (revision fee911c0b97ea1f75b1ca1bc50e58eee93c911b1)
@@ -1061,23 +1061,13 @@
 impl Literal for &str {
     fn lit(&self) -> Expr {
-        //FIXME debug_assert!(self.len()<128);
-        let mut s = String::new();
-        debug_assert!(self.len()<128);
-        s.push(self.len() as u8 as char);
-        s.push_str(self);
-        Expr::Literal(ScalarValue::LargeUtf8(Some(s)))
+        Expr::Literal(ScalarValue::LargeUtf8(Some((*self).to_owned())))
     }
 }
 impl Literal for String {
     fn lit(&self) -> Expr {
-        //FIXME debug_assert!(self.len()<128);
-        let mut s = String::new();
-        debug_assert!(self.len()<128);
-        s.push(self.len() as u8 as char);
-        s.push_str(self);
-        Expr::Literal(ScalarValue::LargeUtf8(Some(s)))
+        Expr::Literal(ScalarValue::LargeUtf8(Some((*self).to_owned())))
     }
 }
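To make the difference concrete, here is a minimal standalone sketch of the two behaviours in the diff. It is not TensorBase code: the function names `length_prefixed` and `plain` are made up for illustration, and it only assumes what the removed and added lines show, namely that the old version prepended the byte length as a single char while the new version copies the string unchanged.

```rust
/// Mirrors the removed lines: prepend the byte length as one char.
/// Hypothetical helper name, for illustration only.
fn length_prefixed(s: &str) -> String {
    // For lengths >= 128, `u8 as char` yields a code point that takes two
    // bytes in UTF-8, so the prefix would no longer be a single byte.
    debug_assert!(s.len() < 128);
    let mut out = String::with_capacity(s.len() + 1);
    out.push(s.len() as u8 as char);
    out.push_str(s);
    out
}

/// Mirrors the added line: a plain copy of the literal.
/// Hypothetical helper name, for illustration only.
fn plain(s: &str) -> String {
    s.to_owned()
}

fn main() {
    let a = length_prefixed("abc");
    let b = plain("abc");

    // The old form carries one extra leading byte holding the length.
    assert_eq!(a.as_bytes(), &[3u8, b'a', b'b', b'c'][..]);
    assert_eq!(b.as_bytes(), &b"abc"[..]);

    // A consumer expecting one form will not match data in the other.
    assert_ne!(a, b);
}
```

If some code downstream still expects the length-prefixed form while literals are now produced in the plain form (or the other way round), comparisons against string literals would silently fail to match, which may be related to the behaviour reported here.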
@whjpji sorry, this addition was an early hack. I removed it later, because we should not modify the string in this place.
In the current version, a query like
select 'abc'
returns no result.
I also commented out these three lines in engine::datafusions: https://github.com/tensorbase/tensorbase/blob/14e4802b9c5e9e6e9543f80380213da1fb1c56cf/crates/engine/src/datafusions.rs#L177-L180
The server then gets stuck on this query.
However, integer literals work: