4paradigm / OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.
https://openmldb.ai
Apache License 2.0
1.59k stars 321 forks

feat(udf): array_combine & array_join #3945

Closed aceforeverd closed 4 months ago

aceforeverd commented 5 months ago

This PR adds two UDFs:

array_combine

array_combine(delimiter, array1, ...)

Returns an array of strings built from the Cartesian product of the input arrays array1, array2, .... The elements of each combination are joined into one string with {delimiter}. If {delimiter} is null, the empty string is used instead.

select array_combine("-", ["1", "2"], ["3", "4"]);  -- ["1-3", "1-4", "2-3", "2-4"]
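
The semantics described above can be modeled outside the database. The following is a minimal Python sketch, not the HybridSE implementation: the function name mirrors the UDF, but the use of `itertools.product` and the handling of inputs the description does not cover (null elements, zero input arrays) are assumptions made only for illustration.

```python
from itertools import product
from typing import List, Optional


def array_combine(delimiter: Optional[str], *arrays: List[str]) -> List[str]:
    """Reference model of array_combine as described in this PR.

    Assumption: array elements are non-null strings; the PR description
    does not say how null elements or zero input arrays are handled,
    so this sketch does not model those cases.
    """
    # A null delimiter falls back to the empty string, per the description.
    sep = "" if delimiter is None else delimiter
    # product() yields combinations with the leftmost array varying slowest,
    # which matches the ordering shown in the PR's SQL example.
    return [sep.join(combo) for combo in product(*arrays)]


print(array_combine("-", ["1", "2"], ["3", "4"]))  # ['1-3', '1-4', '2-3', '2-4']
print(array_combine(None, ["1", "2"], ["3", "4"]))  # ['13', '14', '23', '24']
```

The first call reproduces the SQL example from the description; the second shows the null-delimiter fallback.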

array_join

array_join(arr, delimiter)

Concatenates the elements of the given array, separated by the delimiter. Null values are filtered out.

select array_join(["1", "2"], "-");  -- "1-2"
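
As a rough illustration of the null filtering, here is a small Python model of the described behavior. It is a sketch under assumptions rather than the actual codegen: null is represented by `None`, and the delimiter is assumed to be a non-null string, since the description does not state what a null delimiter does for array_join.

```python
from typing import List, Optional


def array_join(arr: List[Optional[str]], delimiter: str) -> str:
    """Reference model of array_join as described in this PR.

    Assumption: `delimiter` is a non-null string; null array elements
    (None here) are dropped before joining.
    """
    return delimiter.join(x for x in arr if x is not None)


print(array_join(["1", "2"], "-"))        # 1-2
print(array_join(["1", None, "2"], "-"))  # 1-2
```
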
github-actions[bot] commented 5 months ago

SDK Test Report

102 files  ±0  102 suites  ±0   2m 17s :stopwatch: -14s
359 tests ±0  345 :white_check_mark: ±0  14 :zzz: ±0  0 :x: ±0
487 runs  ±0  473 :white_check_mark: ±0  14 :zzz: ±0  0 :x: ±0

Results for commit a55d65ca. ± Comparison against base commit 59d79f6d.

This pull request removes 30 and adds 9 tests. Note that renamed tests count towards both.

```
PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1
PARTITION BY t1.col2 ORDER BY t1.col1
ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
) limit 10;](1)
) limit 10;](2)
) limit 10;](3)
FROM db1.t1
FROM t1
WINDOW w1 AS (
last join db2.t2 order by db2.t2.col1
…
```

```
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[, SELECT sum(db1.t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1 FROM db1.t1 last join db2.t2 order by db2.t2.col1 on db1.t1.col1 = db2.t2.col1 and db1.t1.col2 = db2.t2.col0 WINDOW w1 AS ( PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1 ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW ) limit 10;](2)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[db1, SELECT sum(t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0 WINDOW w1 AS ( PARTITION BY t1.col2 ORDER BY t1.col1 ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW ) limit 10;](1)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[null, SELECT sum(db1.t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1 FROM db1.t1 last join db2.t2 order by db2.t2.col1 on db1.t1.col1 = db2.t2.col1 and db1.t1.col2 = db2.t2.col0 WINDOW w1 AS ( PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1 ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW ) limit 10;](3)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[, SELECT db2.t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0; , SQL parse error: Fail to transform data provider op: table t1 not exists in database []](4)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT db1.t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0; , SQL parse error: Column Not found: db1.t2.str1](2)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT db2.t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = t2.col1 and t1.col2 = db2.t2.col0; , SQL parse error: Column Not found: .t2.col1](3)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0; , SQL parse error: Column Not found: .t2.str1](1)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[null, SELECT db2.t2.str1 as t2_str1 FROM t1 last join db2.t2 order by db2.t2.col1 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0; , SQL parse error: Fail to transform data provider op: table t1 not exists in database []](5)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlWindowLastJoin[ SELECT sum(t1.col1) over w1 as sum_t1_col1, t2.str1 as t2_str1 FROM t1 last join t2 order by t2.col1 on t1.col1 = t2.col1 and t1.col2 = t2.col0 WINDOW w1 AS ( PARTITION BY t1.col2 ORDER BY t1.col1 ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW ) limit 10;](1)
```

:recycle: This comment has been updated with latest results.

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 82.83262% with 40 lines in your changes missing coverage. Please review.

Project coverage is 75.24%. Comparing base (59d79f6) to head (a55d65c). Report is 1 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| hybridse/src/codegen/array_ir_builder.cc | 68.42% | 18 Missing :warning: |
| hybridse/src/codegen/struct_ir_builder.cc | 67.85% | 18 Missing :warning: |
| hybridse/src/codegen/ir_base_builder.cc | 84.61% | 2 Missing :warning: |
| hybridse/src/base/cartesian_product.cc | 95.83% | 1 Missing :warning: |
| hybridse/src/codegen/string_ir_builder.cc | 88.88% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               main    #3945      +/-   ##
============================================
+ Coverage     75.22%   75.24%   +0.02%
  Complexity      711      711
============================================
  Files           754      755       +1
  Lines        135575   135809     +234
  Branches       2072     2073       +1
============================================
+ Hits         101991   102195     +204
- Misses        33281    33311      +30
  Partials        303      303
```


github-actions[bot] commented 5 months ago

HybridSE Linux Test Report

20 423 tests  +48   20 421 :white_check_mark: +48   6m 18s :stopwatch: -5s
262 suites ±0   2 :zzz: ±0
69 files ±0   0 :x: ±0

Results for commit a55d65ca. ± Comparison against base commit 59d79f6d.


github-actions[bot] commented 5 months ago

HybridSE Mac Test Report

0 tests  ±0   0 :white_check_mark: ±0   0s :stopwatch: ±0s
0 suites ±0   0 :zzz: ±0
0 files ±0   0 :x: ±0

Results for commit a55d65ca. ± Comparison against base commit 59d79f6d.


github-actions[bot] commented 5 months ago

Linux Test Report

59 files  ±0  252 suites  ±0   1h 40m 24s :stopwatch: +8s
13 586 tests +66  13 579 :white_check_mark: +66  7 :zzz: ±0  0 :x: ±0
19 293 runs  +96  19 286 :white_check_mark: +96  7 :zzz: ±0  0 :x: ±0

Results for commit a55d65ca. ± Comparison against base commit 59d79f6d.
