Closed js00070 closed 4 years ago
Thanks for your contribution. If your PR get merged, you will be rewarded 1200 points.
Is it ok that I conintue to implement other IN
functions in this PR? Or maybe I should split those functions into different PRs? @winoros
@js00070 You can split them into multiple prs. Since they are mostly independent codes, putting them into one single pr is ok to me though. Seems that the PCP robot cannot deal with multiple pr correctly. You can put them into one pr first.
Also, you can run make test
or make dev
to run some checks in your local env.
Merging #13594 into master will not change coverage. The diff coverage is
n/a
.
@@ Coverage Diff @@
## master #13594 +/- ##
=========================================
Coverage 80.449% 80.449%
=========================================
Files 483 483
Lines 123452 123452
=========================================
Hits 99316 99316
Misses 16355 16355
Partials 7781 7781
the bench result is here:
for inInt
const count = 1 , threshold = 1
BenchmarkVectorizedBuiltinOtherFunc/builtinInIntSig-VecBuiltinFunc-12 111224 10272 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInIntSig-NonVecBuiltinFunc-12 62086 18713 ns/op 0 B/op 0 allocs/op
const count = 1 , threshold = 200
BenchmarkVectorizedBuiltinOtherFunc/builtinInIntSig-VecBuiltinFunc-12 52753 22356 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInIntSig-NonVecBuiltinFunc-12 16131 74576 ns/op 0 B/op 0 allocs/op
for inStr
count=1, threshold=100
BenchmarkVectorizedBuiltinOtherFunc/builtinInStringSig-VecBuiltinFunc-12 15488 76004 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInStringSig-NonVecBuiltinFunc-12 13756 85849 ns/op 0 B/op 0 allocs/op
count=1, threshold=1
BenchmarkVectorizedBuiltinOtherFunc/builtinInStringSig-VecBuiltinFunc-12 69493 16724 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInStringSig-NonVecBuiltinFunc-12 45828 25678 ns/op 0 B/op 0 allocs/op
so the threshold for inInt
and inStr
is 1
.
for inTime
count=1, threshold=100
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-VecBuiltinFunc-12 57450 19978 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-NonVecBuiltinFunc-12 23920 49055 ns/op 0 B/op 0 allocs/op
count=1, threshold=1
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-VecBuiltinFunc-12 44167 26372 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-NonVecBuiltinFunc-12 29161 40562 ns/op 0 B/op 0 allocs/op
count=2, threshold=1
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-VecBuiltinFunc-12 44262 26342 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-NonVecBuiltinFunc-12 29365 40589 ns/op 0 B/op 0 allocs/op
count=2, threshold=100
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-VecBuiltinFunc-12 33315 35780 ns/op 1024 B/op 1 allocs/op
BenchmarkVectorizedBuiltinOtherFunc/builtinInTimeSig-NonVecBuiltinFunc-12 15873 76252 ns/op 0 B/op 0 allocs/op
so the threshold for inTime
is 2
.
/run-all-tests
Seems that there're some integration tests failed. I'll look into it first.
in Common Test / Mysql Test
[2019-11-24T14:28:57.579Z] 2019/11/24 22:28:57.487 [fatal] run test [json_gcol] err: sql:explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");: run "explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");" at line 76 err Error 3140: Invalid JSON text: The document root must not be followed by other values.
[2019-11-24T14:28:57.579Z] + echo 'tidb-server(PID: 148214) stopped'
Seems the sql should be as follow?
select f1 from t1 where json_extract(f1,"$") in ("\"asd\"","\"asasas\"","\"asdf\"");
It seems there is a incompatible issue with mysql about JSON funciton. here is the phenomenon:
in MySQL:8
mysql> select json_extract("\"asd\"","$") in ("asd","abcd");
+-----------------------------------------------+
| json_extract("\"asd\"","$") in ("asd","abcd") |
+-----------------------------------------------+
| 1 |
+-----------------------------------------------+
1 row in set (0.00 sec)
in MySQL:5.7
mysql> select json_extract("\"asd\"","$") in ("asd","abcd");
+----------------------------------------------+
| json_extract("\"asd\"","$") in ("asd","abc") |
+----------------------------------------------+
| 0 |
+----------------------------------------------+
1 row in set, 1 warning (0.00 sec)
in TiDB master branch (v4.0.0-alpha-940-g92e774913, commit hash: 92e774913a2f6166b104d3d8eaed5513733f1830)
MySQL [(none)]> select json_extract("\"asd\"","$") in ("asd","abcd");
ERROR 3140 (22032): Invalid JSON text: The document root must not be followed by other values.
MySQL [(none)]>
I submit an issue here https://github.com/pingcap/tidb/issues/13710
/run-sqllogic-test
@js00070 The following is one failed case in sql logic test.
CREATE TABLE tab0(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab1(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab2(col0 INTEGER, col1 INTEGER, col2 INTEGER);
INSERT INTO tab0 VALUES(97,1,99);
INSERT INTO tab0 VALUES(15,81,47);
INSERT INTO tab0 VALUES(87,21,10);
INSERT INTO tab1 VALUES(51,14,96);
INSERT INTO tab1 VALUES(85,5,59);
INSERT INTO tab1 VALUES(91,47,68);
INSERT INTO tab2 VALUES(64,77,40);
INSERT INTO tab2 VALUES(75,67,58);
INSERT INTO tab2 VALUES(46,51,23);
expected ["-37"], but found [] - sql[ SELECT DISTINCT col1 DIV - - col1 + - 38 AS col1 FROM tab0 AS cor0 WHERE - col1 * col0 DIV - - col2 IN ( - 2, - col1 DIV + col2 ) ]
/run-sqllogic-test
@js00070 The data for the first failed case:
CREATE TABLE tab0(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab1(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab2(col0 INTEGER, col1 INTEGER, col2 INTEGER);
INSERT INTO tab0 VALUES(97,1,99);
INSERT INTO tab0 VALUES(15,81,47);
INSERT INTO tab0 VALUES(87,21,10);
INSERT INTO tab1 VALUES(51,14,96);
INSERT INTO tab1 VALUES(85,5,59);
INSERT INTO tab1 VALUES(91,47,68);
INSERT INTO tab2 VALUES(64,77,40);
INSERT INTO tab2 VALUES(75,67,58);
INSERT INTO tab2 VALUES(46,51,23);
The data for another two failed case:
CREATE TABLE tab0(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab1(col0 INTEGER, col1 INTEGER, col2 INTEGER);
CREATE TABLE tab2(col0 INTEGER, col1 INTEGER, col2 INTEGER);
INSERT INTO tab0 VALUES(97,1,99);
INSERT INTO tab0 VALUES(15,81,47);
INSERT INTO tab0 VALUES(87,21,10);
INSERT INTO tab1 VALUES(51,14,96);
INSERT INTO tab1 VALUES(85,5,59);
INSERT INTO tab1 VALUES(91,47,68);
INSERT INTO tab2 VALUES(64,77,40);
INSERT INTO tab2 VALUES(75,67,58);
INSERT INTO tab2 VALUES(46,51,23);
/run-sqllogic-test
this fail case is very werid.
[error] /git/sqllogictest/test/index/random/10/slt_good_2.test:20581: expected [], but found ["0" "356" "346" "0" "720" "678" "0" "1" "84" "398" "0" "573" "235" "0" "2" "525" "890" "0" "690" "465" "0" "3" "43" "9" "0" "223" "197" "0" "4" "606" "129" "0" "989" "259" "0" "5" "912" "442" "0" "596" "110" "0" "6" "512" "576" "0" "568" "457" "0" "7" "933" "226" "0" "471" "886" "0" "8" "810" "775" "0" "36" "485" "0" "9" "3" "921" "0" "284" "468" "0"] - sql[ SELECT * FROM tab0 AS cor0 WHERE + col1 NOT IN ( - 82, - col1, 96, col1, CAST( - ( + + 30 ) AS SIGNED ) ) ]
where col1 not in (..., col1, ... )
will always be false
. it seems impossible to fail on this case. Could you show me the data of this case? @winoros
@js00070 Sorry for late reply Here's the data
CREATE TABLE tab0(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
INSERT INTO tab0 VALUES(0,356,346.29,'emyjj',720,678.28,'mylwf');
INSERT INTO tab0 VALUES(1,84,398.55,'sbaji',573,235.64,'dgkbk');
INSERT INTO tab0 VALUES(2,525,890.27,'cbpha',690,465.9,'toihb');
INSERT INTO tab0 VALUES(3,43,9.3,'bqwdh',223,197.28,'vrkrw');
INSERT INTO tab0 VALUES(4,606,129.87,'axjxg',989,259.90,'ylzxx');
INSERT INTO tab0 VALUES(5,912,442.38,'uyrso',596,110.71,'gtdhg');
INSERT INTO tab0 VALUES(6,512,576.81,'cyvvl',568,457.76,'erkxs');
INSERT INTO tab0 VALUES(7,933,226.24,'fkgxz',471,886.45,'bqisj');
INSERT INTO tab0 VALUES(8,810,775.69,'sfbbw',36,485.66,'tlesg');
INSERT INTO tab0 VALUES(9,3,921.21,'wsyzc',284,468.90,'hpkbx');
CREATE TABLE tab1(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
CREATE INDEX idx_tab1_0 on tab1 (col0);
CREATE INDEX idx_tab1_1 on tab1 (col1);
CREATE INDEX idx_tab1_3 on tab1 (col3);
CREATE INDEX idx_tab1_4 on tab1 (col4);
INSERT INTO tab1 SELECT * FROM tab0;
CREATE TABLE tab2(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
CREATE INDEX idx_tab2_0 ON tab2 (col1,col0,col4);
CREATE INDEX idx_tab2_1 ON tab2 (col1,col0 DESC,col3 DESC);
CREATE INDEX idx_tab2_2 ON tab2 (col4);
CREATE UNIQUE INDEX idx_tab2_3 ON tab2 (col0);
CREATE INDEX idx_tab2_4 ON tab2 (col3);
INSERT INTO tab2 SELECT * FROM tab0
CREATE TABLE tab3(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
CREATE UNIQUE INDEX idx_tab3_0 ON tab3 (col1);
CREATE INDEX idx_tab3_2 ON tab3 (col4);
CREATE INDEX idx_tab3_5 ON tab3 (col0 DESC);
INSERT INTO tab3 SELECT * FROM tab0;
CREATE TABLE tab4(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
CREATE UNIQUE INDEX idx_tab4_0 ON tab4 (col1,col4,col3 DESC);
CREATE INDEX idx_tab4_1 ON tab4 (col1,col0);
CREATE UNIQUE INDEX idx_tab4_5 ON tab4 (col4);
INSERT INTO tab4 SELECT * FROM tab0;
/run-sqllogic-test
/run-all-tests
this CI failure is related to https://github.com/pingcap/tidb/issues/13710
[fatal] run test [json_gcol] err: sql:explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");: run "explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");" at line 76 err Error 3140: Invalid JSON text: The document root must not be followed by other values.
I will make another PR to solve it.
Great! I'll begin to review this pr tonight.
/run-all-tests
Friendly ping @js00070
Friendly ping @js00070
I'm a little busy this week. I will update the code this weekend.
/run-all-tests
/run-all-tests
/run-all-tests
Hi, please help take a look at the integration test failure.
[2019-12-15T10:33:26.909Z] 2019/12/15 18:33:26.881 [fatal] run test [json_gcol] err: sql:explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");: run "explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");" at line 76 err Error 3140: Invalid JSON text: The document root must not be followed by other values.
Hi, please help take a look at the integration test failure.
[2019-12-15T10:33:26.909Z] 2019/12/15 18:33:26.881 [fatal] run test [json_gcol] err: sql:explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");: run "explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");" at line 76 err Error 3140: Invalid JSON text: The document root must not be followed by other values.
this CI failure is because of this issue https://github.com/pingcap/tidb/issues/13710 I will fix it in another PR
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests
/run-all-tests
PTAL @SunRunAway the workaround for https://github.com/pingcap/tidb/issues/13710 is conflict to the test case......
[2019-12-18T14:12:23.484Z] "explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");"
[2019-12-18T14:12:23.484Z] around line 76,
[2019-12-18T14:12:23.484Z] we need(354):
[2019-12-18T14:12:23.484Z] explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");
[2019-12-18T14:12:23.484Z] id count task operator info
[2019-12-18T14:12:23.484Z] Selection_5 8000.00 root in(json_extract(json_gcol.t1.f1, "$"), cast("asd"), cast("asasas"), cast("asdf"))
[2019-12-18T14:12:23.484Z] └─TableReader_7 10000.00 root data:TableScan_6
[2019-12-18T14:12:23.484Z] └─TableScan_6 10000.00 cop[tikv] table:t1, range:[-inf,+inf], keep order:false, s
[2019-12-18T14:12:23.484Z] but got(354):
[2019-12-18T14:12:23.484Z] explain select f1 from t1 where json_extract(f1,"$") in ("asd","asasas","asdf");
[2019-12-18T14:12:23.484Z] id count task operator info
[2019-12-18T14:12:23.484Z] Selection_5 8000.00 root in(cast(json_extract(json_gcol.t1.f1, "$")), "asd", "asasas", "asdf")
[2019-12-18T14:12:23.484Z] └─TableReader_7 10000.00 root data:TableScan_6
[2019-12-18T14:12:23.484Z] └─TableScan_6 10000.00 cop[tikv] table:t1, range:[-inf,+inf], keep order:false, stats:pseudo
@SunRunAway It seems that the current version of tidb ignore the eps, and this PR keep the bug. I think that we can merge this PR, and file a new issue to track about real comparison compatibility with MySQL.
@SunRunAway It seems that the current version of tidb ignore the eps, and this PR keep the bug. I think that we can merge this PR, and file a new issue to track about real comparison compatibility with MySQL.
Agree with you. @js00070 Could you modify you PR and remove the JSON part, and help create a new issue to track real comparison compatibility with Hash Join executor and Hash In function?
Let's make this PR be merged.
@SunRunAway It seems that the current version of tidb ignore the eps, and this PR keep the bug. I think that we can merge this PR, and file a new issue to track about real comparison compatibility with MySQL.
Agree with you. @js00070 Could you modify you PR and remove the JSON part, and help create a new issue to track real comparison compatibility with Hash Join executor and Hash In function?
Let's make this PR be merged.
ok. And How do we handle the CI failure that we discussed last week?
@SunRunAway It seems that the current version of tidb ignore the eps, and this PR keep the bug. I think that we can merge this PR, and file a new issue to track about real comparison compatibility with MySQL.
Agree with you. @js00070 Could you modify you PR and remove the JSON part, and help create a new issue to track real comparison compatibility with Hash Join executor and Hash In function? Let's make this PR be merged.
ok. And How do we handle the CI failure that we discussed last week?
We can remove the JSON implementation, so that the CI will pass. Then we handle the JSON compatibility issue in #13710.
/run-all-tests
/run-all-tests
@winoros, @SunRunAway, @TennyZhuang, @breeswish, @qw4990, @XuHuaiyu, @wshwsh12, PTAL.
@winoros, @SunRunAway, @TennyZhuang, @breeswish, @qw4990, @XuHuaiyu, @wshwsh12, PTAL.
@js00070, please update your pull request.
@winoros, @SunRunAway, @TennyZhuang, @breeswish, @qw4990, @XuHuaiyu, @wshwsh12, PTAL.
@winoros, @SunRunAway, @TennyZhuang, @breeswish, @XuHuaiyu, @qw4990, @wshwsh12, PTAL.
@winoros, @SunRunAway, @TennyZhuang, @breeswish, @XuHuaiyu, @qw4990, @wshwsh12, PTAL.
/run-all-tests
/run-all-tests
PCP #12970
What problem does this PR solve?
optimize function
IN
using hashmap according to #12970What is changed and how it works?
have implemented
InInt
currently, will continue to implement otherIN
functions.Check List
Tests
Side effects