Closed shamb0 closed 1 week ago
Hi @b41sh,
I wanted to let you know that I've completed the first draft of the map_cat
implementation. When you have a moment, I would greatly appreciate it if you could review the code and provide your feedback. Please feel free to highlight any areas that need improvement or suggest alternative approaches.
Thank you for taking the time to assist me with this task.
Hi @b41sh,
I wanted to let you know that I've completed the first draft of the
map_cat
implementation. When you have a moment, I would greatly appreciate it if you could review the code and provide your feedback. Please feel free to highlight any areas that need improvement or suggest alternative approaches.Thank you for taking the time to assist me with this task.
Hi, @shamb0
Thank you very much for your contribution, but I'm sorry, there are some problem with this implementation. map_cat
is a function of type map, not variant. It's true that an object of type JSON can also be considered as a map, but this is different. I'm sorry that I didn't make it clear and caused you problems.
You can start by getting an introduction to the map type, and referring to the implementations of the map_keys
and map_values
functions in this PR #15291
Hi @b41sh, I wanted to let you know that I've completed the first draft of the
map_cat
implementation. When you have a moment, I would greatly appreciate it if you could review the code and provide your feedback. Please feel free to highlight any areas that need improvement or suggest alternative approaches. Thank you for taking the time to assist me with this task.Hi, @shamb0 Thank you very much for your contribution, but I'm sorry, there are some problem with this implementation.
map_cat
is a function of type map, not variant. It's true that an object of type JSON can also be considered as a map, but this is different. I'm sorry that I didn't make it clear and caused you problems.You can start by getting an introduction to the map type, and referring to the implementations of the
map_keys
andmap_values
functions in this PR #15291
Hi @b41sh,
Thank you for the suggestion. I appreciate the guidance, and I will proceed with migrating to the map_type
approach as recommended.
Hi, @shamb0 I found an article that describes how to write scalar functions, and documentation that describes the map type, which you can refer to for implementation.
https://docs.databend.com/guides/community/contributor/how-to-write-scalar-functions https://docs.databend.com/sql/sql-reference/data-types/data-type-map
The parameter of the map_cat
function is a KvColumn, we can use KvColumnBuilder to construct the result column after concat.
We have implemented a lot of Array functions, the structure is similar to the Map function, you can refer to the implementations.
Hi @b41sh,
I appreciate your detailed pointers; they're really helping me to make the progress!
I need your assistance in reviewing the current map_cat
implementation found in src/query/functions/src/scalars/map_ops.rs
. I've observed some issues with concatenation results and believe it would be beneficial to discuss this with you before diving into debugging.
Reproducing the issue is straightforward with a basic unit test using test_map_cat_ops_basic()
, which is implemented in src/query/functions/tests/it/scalars/map_ops.rs
.
RUST_LOG=trace \
UPDATE_GOLDENFILES=1 \
cargo test \
--test it scalars::map_ops::test_map_ops
Please find the complete trace here.
Please let me know your thoughts when you have a moment.
Thank you for your ongoing support!
Hi, @shamb0 Sorry for the late reply, it looks like a lot of progress has been made, continuing to fix a few minor issues and it will be ready.
Hi @b41sh,
No worries, Thank you for the thorough review and feedback. I appreciate the level of detail you've provided. I'll incorporate your feedback and get the update back to you ASAP.
Hi @b41sh,
I wanted to give you a quick update on the review reworks I'm working on. They're progressing slowly.
I did want to bring up a challenge I encountered with the ArrayColumnBuilder::append_column()
function and get your thoughts on the best approach. In the src/query/functions/src/scalars/map.rs
file, I've implemented two versions of the map_cat
function:
map_cat
without using ArrayType
map_cat_impl2
using ArrayType
Both implementations are currently working, but Implementation-02 required a quick patch on the end_offset
calculation within ArrayColumnBuilder::append_column()
. Please consider this patch as a bit of a temporary fix, and I'm actively working on finding a more elegant solution.
Here's where I could use your insight: since output_map
is of type ArrayColumnBuilder<KvColumnBuilder<>>
, I'm wondering if we could insert the datasets from the source lhs
and rhs
maps directly into output_map
without using ArrayType
, similar to what I did in Implementation-01. From what I've observed, this approach seems to have a lower cost during the concatenation process compared to Implementation-02.
I'd really appreciate your thoughts on the optimal implementation for this context.
Please let me know if you need any further clarification or have additional suggestions.
Thanks for your time and support, @b41sh. I really appreciate it.
Hi @b41sh,
I wanted to give you a quick update on the review reworks I'm working on. They're progressing slowly.
I did want to bring up a challenge I encountered with the
ArrayColumnBuilder::append_column()
function and get your thoughts on the best approach. In thesrc/query/functions/src/scalars/map.rs
file, I've implemented two versions of themap_cat
function:
- Implementation-01:
map_cat
without usingArrayType
- Implementation-02:
map_cat_impl2
usingArrayType
Both implementations are currently working, but Implementation-02 required a quick patch on the
end_offset
calculation withinArrayColumnBuilder::append_column()
. Please consider this patch as a bit of a temporary fix, and I'm actively working on finding a more elegant solution.Here's where I could use your insight: since
output_map
is of typeArrayColumnBuilder<KvColumnBuilder<>>
, I'm wondering if we could insert the datasets from the sourcelhs
andrhs
maps directly intooutput_map
without usingArrayType
, similar to what I did in Implementation-01. From what I've observed, this approach seems to have a lower cost during the concatenation process compared to Implementation-02.I'd really appreciate your thoughts on the optimal implementation for this context.
Please let me know if you need any further clarification or have additional suggestions.
Thanks for your time and support, @b41sh. I really appreciate it.
we can use put_item
function in ArrayColumnBuilder
, and the value is a KvPair
struct.
for example
let mut builder = ArrayType::create_builder(capacity, ctx.generics);
for (k1, v1) in input_map1.iter() {
builder.put_item((k1, v1));
}
Hi @b41sh,
I've incorporated updates to meet the review feedback requirements. Could you kindly review them and advise if there are any outstanding areas that require attention?
Regarding the sqllogictest
and unit tests
, I've added basic sanity tests to address the essentials. If you believe additional test coverage is necessary, I'm happy to work on that. Just let me know your thoughts.
Hi @b41sh,
- I've incorporated updates to meet the review feedback requirements. Could you kindly review them and advise if there are any outstanding areas that require attention?
- Regarding the
sqllogictest
andunit tests
, I've added basic sanity tests to address the essentials. If you believe additional test coverage is necessary, I'm happy to work on that. Just let me know your thoughts.
Hi @shamb0 Thank you very much for the changes, the implementation of the function looks fine to me. But it looks like some codes are temporary debugging code, could you please remove it to keep the code tidy and complete. Also the ci tests are failing, you'll need to make some changes to the code to ensure it passes all those tests.
Hi @b41sh,
- I've incorporated updates to meet the review feedback requirements. Could you kindly review them and advise if there are any outstanding areas that require attention?
- Regarding the
sqllogictest
andunit tests
, I've added basic sanity tests to address the essentials. If you believe additional test coverage is necessary, I'm happy to work on that. Just let me know your thoughts.Hi @shamb0 Thank you very much for the changes, the implementation of the function looks fine to me. But it looks like some codes are temporary debugging code, could you please remove it to keep the code tidy and complete. Also the ci tests are failing, you'll need to make some changes to the code to ensure it passes all those tests.
Hi @b41sh,
I wanted to check if we could incorporate some changes in the tests/sqllogictests/
directory. The main updates are aimed at making the run_file
CLI argument fully operational. This would enable us to complete the sqllogictests
more efficiently at the test suite file level.
Here's an example command to illustrate how the run_file
argument works:
clear && \
cargo run \
-p databend-sqllogictests \
--bin databend-sqllogictests \
-- \
--handlers "http" \
--run_dir functions \
--run_file 02_0074_function_map.test \
--debug \
--parallel 2
Please let me know your thought.
Hi @b41sh,
I wanted to check if we could incorporate some changes in the
tests/sqllogictests/
directory. The main updates are aimed at making therun_file
CLI argument fully operational. This would enable us to complete thesqllogictests
more efficiently at the test suite file level.Here's an example command to illustrate how the
run_file
argument works:clear && \ cargo run \ -p databend-sqllogictests \ --bin databend-sqllogictests \ -- \ --handlers "http" \ --run_dir functions \ --run_file 02_0074_function_map.test \ --debug \ --parallel 2
Please let me know your thought.
unit-test
fails, you need to modify the file src/query/functions/tests/it/scalars/testdata/function_list.txt
to add map_cat
to the function list, and then run make unit-test
to check whether unit-test
is passed.
The sql logic test can be run like this
First start the databend meta and query, either with the make run-debug
command or manually with the command
./target/debug/databend-meta --single --log-level=ERROR
./target/debug/databend-query -c scripts/ci/deploy/config/databend-query-node-1.toml
Then run sqllogictests with the following command
./target/debug/databend-sqllogictests --handlers mysql --run_dir query --skip_dir spill,join,window_function,issues,case_sensitivity
Hi @shamb0 This PR is almost OK, please resolve the code conflicts with the main branch, and fix the ci unit test.
Hi @b41sh, I wanted to check if we could incorporate some changes in the
tests/sqllogictests/
directory. The main updates are aimed at making therun_file
CLI argument fully operational. This would enable us to complete thesqllogictests
more efficiently at the test suite file level. Here's an example command to illustrate how therun_file
argument works:clear && \ cargo run \ -p databend-sqllogictests \ --bin databend-sqllogictests \ -- \ --handlers "http" \ --run_dir functions \ --run_file 02_0074_function_map.test \ --debug \ --parallel 2
Please let me know your thought.
unit-test
fails, you need to modify the filesrc/query/functions/tests/it/scalars/testdata/function_list.txt
to addmap_cat
to the function list, and then runmake unit-test
to check whetherunit-test
is passed.The sql logic test can be run like this First start the databend meta and query, either with the
make run-debug
command or manually with the command./target/debug/databend-meta --single --log-level=ERROR ./target/debug/databend-query -c scripts/ci/deploy/config/databend-query-node-1.toml
Then run sqllogictests with the following command
./target/debug/databend-sqllogictests --handlers mysql --run_dir query --skip_dir spill,join,window_function,issues,case_sensitivity
Hi @b41sh,
I created the file src/query/functions/tests/it/scalars/testdata/function_list.txt
and ran the make unit-test
command as requested. However, I encountered a failure related to databend-common-openai
.
Can you please advise if it's okay to ignore this failure and move forward, or if there are any additional steps I should take to resolve it?
PASS [18.870s] databend-common-meta-raft-store::it state_machine::schema_api_impl::test_meta_embedded_single
------------
Summary [25.154s] 1328/1860 tests run: 1326 passed, 2 failed, 4 skipped
FAIL [0.599s] databend-common-openai::it openai::test_openai_sql_completion
FAIL [0.586s] databend-common-openai::it openai::test_openai_text_completion
error: test run failed
make: *** [Makefile:70: unit-test] Error 100
Hi @shamb0 This PR is almost OK, please resolve the code conflicts with the main branch, and fix the ci unit test.
Thank you @b41sh. I am working on it. Around 50% done so far. I'll have the rest wrapped up within the next hour.
Hi @b41sh,
I created the file
src/query/functions/tests/it/scalars/testdata/function_list.txt
and ran themake unit-test
command as requested. However, I encountered a failure related todatabend-common-openai
.Can you please advise if it's okay to ignore this failure and move forward, or if there are any additional steps I should take to resolve it?
PASS [18.870s] databend-common-meta-raft-store::it state_machine::schema_api_impl::test_meta_embedded_single ------------ Summary [25.154s] 1328/1860 tests run: 1326 passed, 2 failed, 4 skipped FAIL [0.599s] databend-common-openai::it openai::test_openai_sql_completion FAIL [0.586s] databend-common-openai::it openai::test_openai_text_completion error: test run failed make: *** [Makefile:70: unit-test] Error 100
src/query/functions/tests/it/scalars/testdata/function_list.txt
has exists, don't need to add a new file. The result diff is like following, just add the map_cat
function like this is OK, other fails may related with environment and can be ignored.
Differences (-left|+right):
0 map(Array(Nothing), Array(Nothing)) :: Map(Nothing)
1 map(Array(Nothing) NULL, Array(Nothing) NULL) :: Map(Nothing) NULL
2 map(Array(T0), Array(T1)) :: Map(T0, T1)
3 map(Array(T0) NULL, Array(T1) NULL) :: Map(T0, T1) NULL
+0 map_cat(Map(Nothing), Map(Nothing)) :: Map(Nothing)
+1 map_cat(Map(Nothing) NULL, Map(Nothing) NULL) :: Map(Nothing) NULL
+2 map_cat(Map(T0, T1), Map(T0, T1)) :: Map(T0, T1)
+3 map_cat(Map(T0, T1) NULL, Map(T0, T1) NULL) :: Map(T0, T1) NULL
0 map_keys(Map(Nothing)) :: Array(Nothing)
1 map_keys(Map(T0, T1)) :: Array(T0)
2 map_keys(Map(T0, T1) NULL) :: Array(T0) NULL
0 map_values(Map(Nothing)) :: Array(Nothing)
Hi @b41sh, I created the file
src/query/functions/tests/it/scalars/testdata/function_list.txt
and ran themake unit-test
command as requested. However, I encountered a failure related todatabend-common-openai
. Can you please advise if it's okay to ignore this failure and move forward, or if there are any additional steps I should take to resolve it?PASS [18.870s] databend-common-meta-raft-store::it state_machine::schema_api_impl::test_meta_embedded_single ------------ Summary [25.154s] 1328/1860 tests run: 1326 passed, 2 failed, 4 skipped FAIL [0.599s] databend-common-openai::it openai::test_openai_sql_completion FAIL [0.586s] databend-common-openai::it openai::test_openai_text_completion error: test run failed make: *** [Makefile:70: unit-test] Error 100
src/query/functions/tests/it/scalars/testdata/function_list.txt
has exists, don't need to add a new file. The result diff is like following, just add themap_cat
function like this is OK, other fails may related with environment and can be ignored.Differences (-left|+right): 0 map(Array(Nothing), Array(Nothing)) :: Map(Nothing) 1 map(Array(Nothing) NULL, Array(Nothing) NULL) :: Map(Nothing) NULL 2 map(Array(T0), Array(T1)) :: Map(T0, T1) 3 map(Array(T0) NULL, Array(T1) NULL) :: Map(T0, T1) NULL +0 map_cat(Map(Nothing), Map(Nothing)) :: Map(Nothing) +1 map_cat(Map(Nothing) NULL, Map(Nothing) NULL) :: Map(Nothing) NULL +2 map_cat(Map(T0, T1), Map(T0, T1)) :: Map(T0, T1) +3 map_cat(Map(T0, T1) NULL, Map(T0, T1) NULL) :: Map(T0, T1) NULL 0 map_keys(Map(Nothing)) :: Array(Nothing) 1 map_keys(Map(T0, T1)) :: Array(T0) 2 map_keys(Map(T0, T1) NULL) :: Array(T0) NULL 0 map_values(Map(Nothing)) :: Array(Nothing)
I've made some changes to src/query/functions/tests/it/scalars/testdata/function_list.txt
. Could you spare a few minutes to review and ensure they align with the requirements? Your quick feedback would be appreciated.
I've made some changes to
src/query/functions/tests/it/scalars/testdata/function_list.txt
. Could you spare a few minutes to review and ensure they align with the requirements? Your quick feedback would be appreciated.
unit test is fine, then remove the temporary modification code and we can merge this PR.
mysql
I'm running into some issues while trying to complete the sqllogictest
.
Please find the full trace in the log.
Commands executed:
./target/debug/databend-meta \
--single \
--log-level=INFO
./target/debug/databend-query \
-c scripts/ci/deploy/config/databend-query-node-1.toml \
--internal-enable-sandbox-tenant \
--log-level=INFO
cargo run \
-p databend-sqllogictests \
--bin databend-sqllogictests \
-- \
--handlers "mysql" \
--run_dir query \
--skip_dir spill,join,window_function,issues,case_sensitivity \
--debug \
--parallel 2
Are these known issues? Can I ignore them for this PR?
Thanks for your help!
mysql
I'm running into some issues while trying to complete the
sqllogictest
.
- Please find the full trace in the log.
- Commands executed:
./target/debug/databend-meta \ --single \ --log-level=INFO ./target/debug/databend-query \ -c scripts/ci/deploy/config/databend-query-node-1.toml \ --internal-enable-sandbox-tenant \ --log-level=INFO cargo run \ -p databend-sqllogictests \ --bin databend-sqllogictests \ -- \ --handlers "mysql" \ --run_dir query \ --skip_dir spill,join,window_function,issues,case_sensitivity \ --debug \ --parallel 2
- Are these known issues? Can I ignore them for this PR?
- Thanks for your help!
this is because you add test after drop database map_func_test
please make sure this SQL at the end of tests/sqllogictests/suites/query/functions/02_0074_function_map.test
file
statement ok
DROP DATABASE map_func_test
LGTM @shamb0 Finally it works, thank you for your constant modifications and retries in the last few days, feel free to keep contributing other code!
Thank you @b41sh,
Thanks for your guidance throughout this PR journey. I've learned a lot from working with you on this stack.
I'm eager to continue contributing. If you have any priority issues in mind, please feel free to assign them to me.
Looking forward to the next challenge!
@shamb0 I'm glad I could help you, we still have a few more map functions to implement, so if you're interested and have the time, you can pick one and develop it, I'm sure with this experience, the subsequent development will be easy.
@shamb0 I'm glad I could help you, we still have a few more map functions to implement, so if you're interested and have the time, you can pick one and develop it, I'm sure with this experience, the subsequent development will be easy.
Thank you @b41sh, I can take this on.
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Implement
map_cat
functionCloses #15337
Addresses part of 15295
Tests
Type of change
This change isโ