facebookincubator / velox

A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
https://velox-lib.io/
Apache License 2.0
3.47k stars 1.14k forks

Get Spark expression fuzzer in parity with Presto expression fuzzer. #5967

Open kgpai opened 1 year ago

kgpai commented 1 year ago

Description

Currently, the Spark expression fuzzer has far fewer fuzzer flags enabled compared to Presto.

We need to bring these to parity and thereby further harden the Spark stack. This means enabling support for flags such as:

--velox_fuzzer_enable_complex_types
--velox_fuzzer_enable_column_reuse
--velox_fuzzer_enable_expression_reuse
--max_expression_trees_per_step 2
--retry_with_try
--enable_dereference

How to enable these flags

Since the Spark expression fuzzer uses the same underlying engine as the Presto fuzzer, the usual approach is to enable one flag, say velox_fuzzer_enable_complex_types, in the Spark expression fuzzer and run it for some time. This flag enables the use of UDFs that take complex types, so the fuzzer creates expressions that call these UDFs and fuzzes their inputs. Running the fuzzer for a while should surface issues in how Spark UDFs handle complex types. Typically, if the fuzzer can run with a flag enabled for an hour or so without failing, that gives us great confidence that most of the underlying issues have been found. If it cannot, each fuzzer failure needs to be identified and fixed, and the fuzzer run again.

This will have to be done for each flag. Please reach out to any one of @kgpai, @kagamiori, @bikramSingh91 or @laithsakka if you have any questions.
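The per-flag procedure described above could be scripted roughly as follows. This is only a sketch: the helper names `build_command` and `run_all` are hypothetical, and the binary path and common options are copied from the commands used later in this thread. It assumes the binary is in the current directory and that a non-zero exit code means the run failed.

```python
import random
import subprocess

# Flags listed in this issue; each entry is the flag plus its value, if any.
FLAGS = [
    ["--velox_fuzzer_enable_complex_types"],
    ["--velox_fuzzer_enable_column_reuse"],
    ["--velox_fuzzer_enable_expression_reuse"],
    ["--max_expression_trees_per_step", "2"],
    ["--retry_with_try"],
    ["--enable_dereference"],
]


def build_command(flag, seed, duration_sec=3600,
                  binary="./spark_expression_fuzzer_test"):
    """Return the argv list for one fuzzer run with a single extra flag."""
    return [
        binary,
        "--seed", str(seed),
        "--duration_sec", str(duration_sec),
        "--logtostderr=1",
        "--minloglevel=0",
        *flag,
    ]


def run_all(flags=FLAGS, duration_sec=3600):
    """Run the fuzzer once per flag and return {flag name: exit code}."""
    results = {}
    for flag in flags:
        seed = random.randint(0, 32767)  # same range as bash's ${RANDOM}
        cmd = build_command(flag, seed, duration_sec)
        # Assumption: any non-zero return code (including death by signal,
        # which subprocess reports as a negative value) counts as a failure.
        results[flag[0]] = subprocess.run(cmd).returncode
    return results
```

Calling `run_all()` would run each flag for an hour in turn; a flag whose entry in the result is `0` survived the full run, and any other value points at a failure to investigate before re-running.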

mbasmanova commented 1 year ago

@rui-mo Rui, would it be possible to have someone from Gluten team work on this?

rui-mo commented 1 year ago

> @rui-mo Rui, would it be possible to have someone from Gluten team work on this?

@mbasmanova Yes, we can do that. I will first try these flags one by one locally over the next two days and keep you updated.

mbasmanova commented 1 year ago

Thank you, Rui.

rui-mo commented 1 year ago

I0807 09:32:24.277905 2450749 ExpressionVerifier.cpp:79] All results match.
I0807 09:32:24.277916 2450749 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 259509

Tested --enable_variadic_signatures with the command below; the other tests are in progress.

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --enable_variadic_signatures

rui-mo commented 1 year ago

These tests succeeded.

I0807 12:53:52.200788 2465881 ExpressionVerifier.cpp:79] All results match.
I0807 12:53:52.200796 2465881 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 8058015
I0807 12:53:52.200809 2465881 ExpressionFuzzer.cpp:1313] ==============================> Started iteration 8058016 (seed: 322683688)

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --velox_fuzzer_enable_column_reuse

I0807 14:10:56.230175 2473810 ExpressionVerifier.cpp:79] All results match.
I0807 14:10:56.230181 2473810 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 10066785

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --velox_fuzzer_enable_expression_reuse

I0807 15:39:59.606643 2506027 ExpressionVerifier.cpp:79] All results match.
I0807 15:39:59.606655 2506027 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 5306377

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --max_expression_trees_per_step 2

I0807 21:02:11.896662 2537718 ExpressionVerifier.cpp:79] All results match.
I0807 21:02:11.896667 2537718 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 7273875

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --retry_with_try

rui-mo commented 1 year ago

These two failed with the errors below:

I0807 21:03:57.384441 2543826 ExpressionFuzzer.cpp:1313] ==============================> Started iteration 2 (seed: 158801280)
I0807 21:03:57.384603 2543826 ExpressionVerifier.cpp:91] Executing expression 0 : notequalto("c0",CONCAT(0.46464866399765015)["row_field0"])
I0807 21:03:57.384618 2543826 ExpressionVerifier.cpp:31] 1 vectors as input:
I0807 21:03:57.384622 2543826 ExpressionVerifier.cpp:33]    [DICTIONARY REAL: 100 elements, 8 nulls], [DICTIONARY REAL: 100 elements, 12 nulls], [FLAT REAL: 100 elements, 5 nulls]
*** Aborted at 1691442237 (Unix time, try 'date -d @1691442237') ***
*** Signal 11 (SIGSEGV) (0x50) received by PID 2543826 (pthread TID 0x7f4277b7ec40) (linux TID 2543826) (code: address not mapped to object), stack trace: ***
(error retrieving stack trace)
Segmentation fault (core dumped)

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --enable_dereference

terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Cannot use null as map key!
Retriable: False
Expression: !decoded->isNullAt(row)
Context: map(notequalto(add(element_at(<empty>:MAP<REAL,REAL>, unaryminus(subtract(c0, null:REAL))), 1.538642406463623:REAL), 0:REAL), c1, greaterthanorequal(element_at(c2, 23:TINYINT), O?gXkxQ-v;WsG};\_o)*h:cQS?s{Oo0-*~@)<q4pp43$G</`(D(:VARBINARY), Q4@QU3W6~iF:VARCHAR)
Top-Level Context: endswith(element_at(map(notequalto(add(element_at(<empty>:MAP<REAL,REAL>, unaryminus(subtract(c0, null:REAL))), 1.538642406463623:REAL), 0:REAL), c1, greaterthanorequal(element_at(c2, 23:TINYINT), O?gXkxQ-v;WsG};\_o)*h:cQS?s{Oo0-*~@)<q4pp43$G</`(D(:VARBINARY), Q4@QU3W6~iF:VARCHAR), lessthanorequal(c3, lessthanorequal(c4, multiply(76:TINYINT, bitwise_and(remainder(bit_get(shiftleft(to_unix_timestamp(null:VARCHAR, c5), null:INTEGER), 1736524976:INTEGER), abs(unaryminus(c6))), add(c7, pmod(80:TINYINT, c8))))))), lpad(sha2(null:VARBINARY, c9), 673723844:INTEGER))
Function: operator()
File: ../../velox/functions/sparksql/Map.cpp

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --velox_fuzzer_enable_complex_types
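When a run aborts as in the two logs above, the last `Started iteration` line records which iteration was running and the seed it used. A small helper for pulling that out of a captured log might look like the sketch below. The function name `last_started_iteration` is hypothetical and the pattern is based only on the log lines shown in this thread. Whether passing the extracted seed back via `--seed` replays exactly the same expression is an assumption to verify, not something these logs confirm.

```python
import re

# Matches lines such as:
#   ... ExpressionFuzzer.cpp:1313] ====> Started iteration 2 (seed: 158801280)
STARTED = re.compile(r"Started iteration (\d+) \(seed: (\d+)\)")


def last_started_iteration(log_text):
    """Return (iteration, seed) from the last 'Started iteration' line, or None."""
    matches = STARTED.findall(log_text)
    if not matches:
        return None
    iteration, seed = matches[-1]
    return int(iteration), int(seed)
```

Applied to the SIGSEGV log above, this would pick out iteration 2 and seed 158801280 as the starting point for investigating the crash.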

rui-mo commented 1 year ago

@kgpai I just opened https://github.com/facebookincubator/velox/pull/6029 to enable the flags that passed first; --lazy_vector_generation_ratio 0.2 is still being tested. I may need more time to look into the two flags that failed.

rui-mo commented 1 year ago

The test with --lazy_vector_generation_ratio 0.2 also succeeded, so I added it to https://github.com/facebookincubator/velox/pull/6029 as well.

I0808 08:40:32.684978 2551413 ExpressionVerifier.cpp:79] All results match.
I0808 08:40:32.684988 2551413 ExpressionFuzzer.cpp:1358] ==============================> Done with iteration 7213334

./spark_expression_fuzzer_test --seed ${RANDOM} --duration_sec 3600 --logtostderr=1 --minloglevel=0 --lazy_vector_generation_ratio 0.2

rui-mo commented 10 months ago

With https://github.com/facebookincubator/velox/pull/7875, I tested the --enable_dereference flag for an hour.

I1206 09:54:35.748194 3320966 ExpressionVerifier.cpp:85] All results match.
I1206 09:54:35.748207 3320966 ExpressionFuzzerVerifier.cpp:506] ==============================> Done with iteration 6366781