facebookincubator / velox

A composable and fully extensible C++ execution engine library for data management systems.
https://velox-lib.io/
Apache License 2.0

Fuzzer test fail in Spark function concat_ws #6590

Open unigof opened 1 year ago

unigof commented 1 year ago

Description

I added support for the Spark function concat_ws, but the fuzzer test fails. The failure appears to be caused by an input argument to concat_ws that contains an illegal ")" character.
Input arg: S/xlzG$+8*8M:M)-/m$h#
Function call: concat_ws(S/xlzG$+8*8M:M)-/m$h#,upper("c0"))

https://app.circleci.com/pipelines/github/facebookincubator/velox/33585/workflows/ec92409f-3df2-408d-ba56-d47cef4db6d9/jobs/217808

Curiously, when I changed VLOG(1) to VLOG(0) so that the log is printed, the fuzzer test succeeded:
https://app.circleci.com/pipelines/github/facebookincubator/velox/33580/workflows/e4d98232-2f5e-4226-863b-0aa14385cc69/jobs/217764
https://app.circleci.com/pipelines/github/facebookincubator/velox/33482/workflows/1deb1a8d-dc6e-4267-b5ab-c69a76d421d3/jobs/216938

Error Reproduction

Retrigger the pipeline for this PR: https://github.com/facebookincubator/velox/pull/6292

Relevant logs

I20230915 06:12:30.982172 188968 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 272
I20230915 06:12:30.982214 188968 ExpressionFuzzer.cpp:1313] ==============================> Started iteration 273 (seed: 3454297971)
I20230915 06:12:30.982656 188968 ExpressionVerifier.cpp:86] Executing expression 0 : remainder(shiftleft(ascii(concat_ws(S/xlzG$+8*8M:M)-/m$h#,upper("c0"))),ascii(concat_ws(S/xlzG$+8*8M:M)-/m$h#,upper("c0")))),1672119067)
I20230915 06:12:30.982684 188968 ExpressionVerifier.cpp:86] Executing expression 1 : floor("c1")
I20230915 06:12:30.983709 188968 ExpressionVerifier.cpp:74] All results match.
I20230915 06:12:30.984624 188968 ExpressionVerifier.cpp:74] All results match.
E20230915 06:12:30.984719 188968 Exceptions.h:69] Line: ../../velox/expression/tests/ExpressionVerifier.cpp:66, Function:operator(), Expression: left->equalValueAt(right.get(), row, row) Different results at idx '13': '54525952' vs. '0', Source: RUNTIME, ErrorCode: INVALID_STATE
I20230915 06:12:30.985608 188968 ExpressionVerifier.cpp:346] Persisted input: --fuzzer_repro_path /tmp/spark_fuzzer_repro/velox_expressionVerifier_QCWVyl --input_path /tmp/spark_fuzzer_repro/velox_expressionVerifier_QCWVyl/input_vector --result_path /tmp/spark_fuzzer_repro/velox_expressionVerifier_QCWVyl/result_vector --sql_path /tmp/spark_fuzzer_repro/velox_expressionVerifier_QCWVyl/sql
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Different results at idx '13': '54525952' vs. '0'
Retriable: False
Expression: left->equalValueAt(right.get(), row, row)
Function: operator()
File: ../../velox/expression/tests/ExpressionVerifier.cpp
Line: 66
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxException5State4makeIZNS1_C4EPKcmS5_St17basic_string_viewIcSt11char_traitsIcEES9_S9_S9_bNS1_4TypeES9_EUlRT_E_EESt10shared_ptrIKS2_ESA_SB_
# 2  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 3  _ZN8facebook5velox17VeloxRuntimeErrorC2EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bS7_
# 4  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 5  _ZZN8facebook5velox4test12_GLOBAL__N_114compareVectorsERKSt10shared_ptrINS0_10BaseVectorEES7_RKNS0_17SelectivityVectorEENKUliE0_clEi
# 6  _ZNK8facebook5velox17SelectivityVector15applyToSelectedIZNS0_4test12_GLOBAL__N_114compareVectorsERKSt10shared_ptrINS0_10BaseVectorEES9_RKS1_EUliE0_EEvT_
# 7  _ZN8facebook5velox4test12_GLOBAL__N_114compareVectorsERKSt10shared_ptrINS0_10BaseVectorEES7_RKNS0_17SelectivityVectorE
# 8  _ZN8facebook5velox4test18ExpressionVerifier6verifyERKSt6vectorISt10shared_ptrIKNS0_4core10ITypedExprEESaIS8_EERKS4_INS0_9RowVectorEEOS4_INS0_10BaseVectorEEbS3_IiSaIiEE
# 9  _ZN8facebook5velox4test16ExpressionFuzzer2goEv
# 10 _ZN8facebook5velox4test16expressionFuzzerESt13unordered_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorIPKNS0_4exec17FunctionSignatureESaISD_EESt4hashIS8_ESt8equal_toIS8_ESaISt4pairIKS8_SF_EEEm
# 11 _ZN12FuzzerRunner12runFromGtestERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmRKSt13unordered_setIS5_St4hashIS5_ESt8equal_toIS5_ESaIS5_EES7_
# 12 _ZN12FuzzerRunner3runERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmRKSt13unordered_setIS5_St4hashIS5_ESt8equal_toIS5_ESaIS5_EES7_
# 13 main
# 14 __libc_start_main
# 15 _start

*** Aborted at 1694758351 (Unix time, try 'date -d @1694758351') ***
*** Signal 6 (SIGABRT) (0x2e228) received by PID 188968 (pthread TID 0x7f9301d02500) (linux TID 188968) (maybe from PID 188968, UID 0) (code: -6), stack trace: ***
(error retrieving stack trace)
/bin/bash: line 14: 188968 Aborted                 (core dumped) _build/debug/velox/expression/tests/spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 60 --enable_variadic_signatures --lazy_vector_generation_ratio 0.2 --velox_fuzzer_enable_column_reuse --velox_fuzzer_enable_expression_reuse --max_expression_trees_per_step 2 --retry_with_try --logtostderr=1 --minloglevel=0 --repro_persist_path=/tmp/spark_fuzzer_repro

unigof commented 1 year ago

@kgpai @laithsakka @duanmeng @zacw7 Hi, could you help me figure out what the problem is? Thank you very much.

mbasmanova commented 11 months ago

CC: @rui-mo

rui-mo commented 11 months ago

@unigof I ran the fuzzer test with the command below.

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 60 --enable_variadic_signatures --lazy_vector_generation_ratio 0.2 --velox_fuzzer_enable_column_reuse --velox_fuzzer_enable_expression_reuse --max_expression_trees_per_step 2 --retry_with_try --logtostderr=1 --minloglevel=0 --only concat_ws

The error I got was:

Executing expression 0 : concat_ws(<qk9S00QNAB-m5)qjy6UtZ&#XK\{F4v[3R8EuEL^Alx|G9ND9+:3N\J$j,concat_ws(null,"c0"),_N%09>00&7XEB0-^v~S^>YK9Ltq1F=wF]MOlra8d/EPxs#R,"c0")
I1127 16:45:55.438975 1580608 ExpressionVerifier.cpp:74] All results match.
E1127 16:45:55.438993 1580608 Exceptions.h:69] Line: ../../velox/expression/tests/ExpressionVerifier.cpp:66, Function:operator(), Expression: left->equalValueAt(right.get(), row, row) Different results at idx '14': '_N%09>00&7XEB0-^v~S^>YK9Ltq1F=wF]MOlra8d/EPxs#R<qk9S00QNAB-m5)qjy6UtZ&#XK\{F4v[3R8EuEL^Alx|G9ND9+:3N\J$j?xHK#GToc\bK$?Nx>A4J' vs. '_N%09>00&7XEB0-^v~S^>YK9Ltq1F=wF]MOlra8d/EPxs#R', Source: RUNTIME, ErrorCode: INVALID_STATE
I1127 16:45:55.439056 1580608 ExpressionVerifier.cpp:258] Skipping persistence because repro path is empty.
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Different results at idx '14': '_N%09>00&7XEB0-^v~S^>YK9Ltq1F=wF]MOlra8d/EPxs#R<qk9S00QNAB-m5)qjy6UtZ&#XK\{F4v[3R8EuEL^Alx|G9ND9+:3N\J$j?xHK#GToc\bK$?Nx>A4J' vs. '_N%09>00&7XEB0-^v~S^>YK9Ltq1F=wF]MOlra8d/EPxs#R'
Retriable: False
Expression: left->equalValueAt(right.get(), row, row)
Function: operator()
File: ../../velox/expression/tests/ExpressionVerifier.cpp
Line: 66

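It looks like part of the string is missing from the second result: it ends after the first literal, without the separator and the value that follows it in the first result.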
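
To decide which of the two results is the expected one, it may help to evaluate the failing expression by hand against a small reference model. The sketch below is not Velox code: `concatWsModel` and `modelIssueExpression` are hypothetical helpers, and the model assumes the Spark semantics that a null separator yields null and that null arguments are skipped (array arguments are ignored here).

```cpp
#include <optional>
#include <string>
#include <vector>

using OptStr = std::optional<std::string>;

// Hypothetical reference model of concat_ws, not the Velox implementation.
// Assumed semantics: a null separator gives a null result, and null
// arguments are skipped rather than contributing an empty string.
OptStr concatWsModel(const OptStr& separator, const std::vector<OptStr>& args) {
  if (!separator.has_value()) {
    return std::nullopt;
  }
  std::string out;
  bool first = true;
  for (const auto& arg : args) {
    if (!arg.has_value()) {
      continue;  // Skip null arguments entirely.
    }
    if (!first) {
      out += *separator;  // Separator goes only between non-null values.
    }
    out += *arg;
    first = false;
  }
  return out;
}

// Mirrors the shape of the expression in the log above:
//   concat_ws(<sep>, concat_ws(null, "c0"), <literal>, "c0")
// The inner call has a null separator, so under the assumed semantics it
// is null and gets skipped by the outer call.
OptStr modelIssueExpression(
    const std::string& sep,
    const std::string& literal,
    const OptStr& c0) {
  const OptStr inner = concatWsModel(std::nullopt, {c0});
  return concatWsModel(sep, {inner, literal, c0});
}
```

Under this model, a row where "c0" is non-null gives literal + separator + "c0", which has the shape of the first result in the log, while a row where "c0" is null gives only the literal, which has the shape of the second. Checking the value of "c0" at idx 14 in the input vector would therefore show which evaluation path produced the wrong answer.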