lakehq / sail

LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads.
https://lakesail.com
Apache License 2.0
525 stars 13 forks source link

Improve actor error handling #263

Closed linhr closed 1 month ago

linhr commented 1 month ago

This PR changes the signature of a few actor methods to enforce error handling inside actor message handlers and actor-spawned tasks.

github-actions[bot] commented 1 month ago

Spark Test Report

Commit Information

Commit Revision Branch
After 0663696 refs/pull/263/merge
Before c7e4bd8 refs/heads/main

Test Summary

Suite Commit Failed Passed Skipped Warnings Time (s)
doctest-column After 4 29 3 5.66
Before 4 29 3 6.22
doctest-dataframe After 50 56 1 4 7.19
Before 49 57 1 4 7.20
doctest-functions After 190 213 6 8 11.16
Before 190 213 6 8 11.71
test-connect After 407 631 126 250 109.19
Before 407 631 126 250 111.03

Test Details

Error Counts ```text (+1) 651 Total 313 Total Unique -------- ---- ---------------------------------------------------------------------------------------------------------- (+1) 33 DocTestFailure 27 UnsupportedOperationException: map partitions 23 UnsupportedOperationException: co-group map 23 UnsupportedOperationException: group map 15 UnsupportedOperationException: streaming query manager command 14 UnsupportedOperationException: lambda function 10 UnsupportedOperationException: unsupported data source format: Some("text") 10 handle add artifacts 8 AssertionError: AnalysisException not raised 8 PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal: 8 PythonException: 8 PythonException: KeyError: 0 8 UnsupportedOperationException: hint 7 AssertionError: False is not true 6 IllegalArgumentException: invalid argument: UDF function type must be Python UDF 6 UnsupportedOperationException: function in table factor 6 UnsupportedOperationException: write stream operation start 5 AnalysisException: Cannot cast to Decimal128(14, 7). Overflowing on NaN 5 IllegalArgumentException: invalid argument: expecting either join condition or using columns 5 UnsupportedOperationException: corr 5 UnsupportedOperationException: function: monotonically_increasing_id 4 AnalysisException: Error during planning: view not found: v 4 AnalysisException: Execution error: 'Utf8("INTERVAL '0 00:00:00.000123' DAY TO SECOND") = CAST(#1 AS... 4 PySparkNotImplementedError: [NOT_IMPLEMENTED] rdd() is not implemented. 4 UnsupportedOperationException: cov 4 UnsupportedOperationException: function: window 4 UnsupportedOperationException: interval unit: day-time 4 UnsupportedOperationException: replace 4 UnsupportedOperationException: sample 4 UnsupportedOperationException: sample by 4 UnsupportedOperationException: unknown aggregate function: hll_sketch_agg 4 UnsupportedOperationException: unpivot 3 AnalysisException: Error during planning: Table function 'range' not found 3 AnalysisException: Error during planning: The expression to get an indexed field is only valid for `... 3 AnalysisException: Execution error: Date part 'D' not supported 3 AssertionError: "[('a', [('b', 'c')])]" != "{'a': {'b': 'c'}}" 3 AssertionError: "{'col1': 1, 'struct(John, 30, struct(valu[87 chars]0}}}" != "Row(col1=1, col2=Row(c... 3 AssertionError: Lists differ: [Row(a=1, a=1, b='x')] != [Row(a=1, b='x')] 3 AssertionError: PythonException not raised 3 IllegalArgumentException: expected value at line 1 column 1 3 SparkRuntimeException: Internal error: UDF returned a different number of rows than expected. Expect... 3 UnsupportedOperationException: crosstab 3 UnsupportedOperationException: function: input_file_name 3 UnsupportedOperationException: function: pmod 3 UnsupportedOperationException: function: ~ 3 UnsupportedOperationException: handle analyze input files 3 UnsupportedOperationException: to schema 3 ValueError: Converting to Python dictionary is not supported when duplicate field names are present 2 AnalysisException: Cannot cast from struct to other types except struct 2 AnalysisException: Cannot cast list to non-list data types 2 AnalysisException: Error during planning: The expression to get an indexed field is only valid for `... 2 AnalysisException: Error during planning: cannot resolve attribute: ObjectName([Identifier("a"), Ide... 2 AnalysisException: Error during planning: two values expected: [Alias(Alias { expr: Column(Column { ... 2 AnalysisException: Execution error: map requires all value types to be the same 2 AnalysisException: Invalid or Unsupported Configuration: Could not find config namespace "spark" 2 AssertionError 2 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "Error during planning: cannot resolve attr... 2 AssertionError: Lists differ: [Row([22 chars](key=1, value='1'), Row(key=10, value='10'), R[2402 cha... 2 AssertionError: Lists differ: [Row(a=1, a=1, b=2)] != [Row(a=1, b=2)] 2 AssertionError: Lists differ: [Row(key='input', value=1.0)] != [Row(key='avg', value=1.0), Row(key='... 2 AssertionError: True is not false 2 IllegalArgumentException: invalid argument: empty data source paths 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: id at Line: 1, Col... 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: id at Line: 1, Col... 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: id at Line: 3, Col... 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: id at Line: 5, Col... 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: t at Line: 3, Colu... 2 IllegalArgumentException: invalid argument: sql parser error: Expected: ), found: v at Line: 1, Colu... 2 IllegalArgumentException: must either specify a row count or at least one column 2 SparkRuntimeException: Internal error: start_from index out of bounds. 2 UnsupportedOperationException: Only literal expr are supported in Python UDTFs for now, got expr: ma... 2 UnsupportedOperationException: Only literal expr are supported in Python UDTFs for now, got expr: na... 2 UnsupportedOperationException: Only literal expr are supported in Python UDTFs for now, got expr: ra... 2 UnsupportedOperationException: Only literal expr are supported in Python UDTFs for now, got expr: sp... 2 UnsupportedOperationException: approx quantile 2 UnsupportedOperationException: collect metrics 2 UnsupportedOperationException: decimal literal with precision or scale 2 UnsupportedOperationException: describe 2 UnsupportedOperationException: freq items 2 UnsupportedOperationException: function: bitmap_bit_position 2 UnsupportedOperationException: function: crc32 2 UnsupportedOperationException: function: dayofweek 2 UnsupportedOperationException: function: encode 2 UnsupportedOperationException: function: format_number 2 UnsupportedOperationException: function: from_csv 2 UnsupportedOperationException: function: from_json 2 UnsupportedOperationException: function: inline 2 UnsupportedOperationException: function: map_from_arrays 2 UnsupportedOperationException: function: sec 2 UnsupportedOperationException: function: shiftrightunsigned 2 UnsupportedOperationException: function: timestamp_seconds 2 UnsupportedOperationException: handle analyze same semantics 2 UnsupportedOperationException: pivot 2 UnsupportedOperationException: position with 3 arguments is not supported yet 2 UnsupportedOperationException: rebalance partitioning by expression 2 UnsupportedOperationException: tail 2 UnsupportedOperationException: unknown aggregate function: collect_set 2 UnsupportedOperationException: unresolved regex 2 UnsupportedOperationException: unsupported data source format: Some("orc") 2 UnsupportedOperationException: user defined data type should only exist in a field 2 handle artifact statuses 2 received metadata size exceeds hard limit (19620 vs. 16384); :status:42B content-type:60B grpc-stat... 1 AnalysisException: Cannot cast string 'abc' to value of Float64 type 1 AnalysisException: Cannot cast value 'abc' to value of Boolean type 1 AnalysisException: Error during planning: Execution error: User-defined coercion failed with Interna... 1 AnalysisException: Error during planning: Inconsistent data type across values list at row 1 column ... 1 AnalysisException: Error during planning: No function matches the given name and argument types 'NTH... 1 AnalysisException: Error during planning: The array_concat function can only accept list as the args... 1 AnalysisException: Error during planning: cannot resolve attribute: ObjectName([Identifier("a"), Ide... 1 AnalysisException: Error during planning: cannot resolve attribute: ObjectName([Identifier("id")]) 1 AnalysisException: Error during planning: three values expected: [Alias(Alias { expr: Column(Column ... 1 AnalysisException: Error during planning: three values expected: [Literal(Int32(1)), Literal(Int32(3... 1 AnalysisException: Error during planning: two values expected: [Alias(Alias { expr: Column(Column { ... 1 AnalysisException: Error during planning: view not found: tab2 1 AnalysisException: Execution error: 'Utf8("1970-01-01 00:00:00") = CAST(#1 AS Utf8) AS dt AS (1970-0... 1 AnalysisException: Execution error: 'Utf8("2012-02-02 02:02:02") = CAST(?table?.#0 AS Utf8) AS a AS ... 1 AnalysisException: Execution error: 'Utf8("INTERVAL '0 00:00:00.000123' DAY TO SECOND") = CAST(#1 AS... 1 AnalysisException: Execution error: Error parsing timestamp from '2023-01-01' using format '%d-%m-%Y... 1 AnalysisException: Execution error: The UPPER function can only accept strings, but got Int64. 1 AnalysisException: Execution error: Unable to find factory for TEXT 1 AnalysisException: Execution error: array_reverse does not support type 'Utf8'. 1 AnalysisException: Invalid or Unsupported Configuration: could not find config namespace for key "ig... 1 AnalysisException: Invalid or Unsupported Configuration: could not find config namespace for key "li... 1 AnalysisException: No field named tbl."#2". Valid fields are tbl."#3". 1 AnalysisException: Schema contains duplicate unqualified field name "#0" 1 AssertionError: "2000000" does not match "Internal error: raise_error expects a single UTF-8 string ... (+1) 1 AssertionError: "Database 'memory:13d52b96-a8d6-4830-a76d-4b86534f46c5' dropped." does not match "in... (+1) 1 AssertionError: "Database 'memory:d0539c3d-adee-4482-9201-bd4ba567a12d' dropped." does not match "in... 1 AssertionError: "Exception thrown when converting pandas.Series" does not match " 1 AssertionError: "Exception thrown when converting pandas.Series" does not match "expected value at l... 1 AssertionError: "PickleException" does not match " 1 AssertionError: "UDTF_ARROW_TYPE_CAST_ERROR" does not match " 1 AssertionError: "[('a', 'b')]" != "{'a': 'b'}" 1 AssertionError: "foobar" does not match "Internal error: raise_error expects a single UTF-8 string a... 1 AssertionError: "timestamp values are not equal (timestamp='1969-01-01 09:01:01+08:00': data[0][1]='... 1 AssertionError: '+---[17 chars]-----+\n| x|\n+--------[132 chars]-+\n' != '+-... 1 AssertionError: 2 != 3 1 AssertionError: AnalysisException not raised by 1 AssertionError: BinaryType() != NullType() 1 AssertionError: Exception not raised 1 AssertionError: Lists differ: [Row([14 chars] _c1=25, _c2='I am Hyukjin\n\nI love Spark!'),[86 chars... 1 AssertionError: Lists differ: [Row([22 chars]e(2018, 12, 31, 16, 0), aware=datetime.datetim[16 chars... 1 AssertionError: Lists differ: [Row([49 chars] 1), internal_value=-31532339000000000), Row(i[225 char... 1 AssertionError: Lists differ: [Row(key='0'), Row(key='1'), Row(key='10'), Ro[1439 chars]99')] != [Ro... 1 AssertionError: Lists differ: [Row(ln(id)=-inf, ln(id)=-inf, struct(id, name)=Row(i[1232 chars]9'))]... 1 AssertionError: Lists differ: [Row(name='Andy', age=30), Row(name='Justin', [34 chars]one)] != [Row(... 1 AssertionError: Row(point='[1.0, 2.0]', pypoint='[3.0, 4.0]') != Row(point='(1.0, 2.0)', pypoint='[3... 1 AssertionError: Row(res="[('personal', [('name', 'John'), ('city', 'New York')])]") != Row(res="{'pe... 1 AssertionError: StorageLevel(False, True, True, False, 1) != StorageLevel(False, False, False, False... 1 AssertionError: Struc[31 chars]stampNTZType(), True), StructField('val', Inte[13 chars]ue)]) != Stru... 1 AssertionError: Struc[32 chars]e(), False), StructField('b', DoubleType(), Fa[158 chars]ue)]) != Str... 1 AssertionError: Struc[40 chars]ue), StructField('val', ArrayType(DoubleType(), False), True)]) != St... 1 AssertionError: YearMonthIntervalType(0, 1) != YearMonthIntervalType(0, 0) 1 AssertionError: [1.0, 2.0] != ExamplePoint(1.0,2.0) 1 AssertionError: {} != {'max_age': 5} 1 AttributeError: 'DataFrame' object has no attribute '_ipython_key_completions_' 1 AttributeError: 'DataFrame' object has no attribute '_joinAsOf' (+1) 1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp3x2ehrh9' (+1) 1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpyxyr9vy8' 1 IllegalArgumentException: 83140 is too large to store in a Decimal128 of precision 4. Max is 9999 1 IllegalArgumentException: invalid argument: from_unixtime with format is not supported yet 1 IllegalArgumentException: invalid argument: sql parser error: Expected: (, found: AS at Line: 1, Col... 1 IllegalArgumentException: invalid argument: sql parser error: Expected: (, found: EOF 1 IllegalArgumentException: number of columns(0) must match number of fields(1) in schema 1 KeyError: 'max' 1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreach() is not implemented. 1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreachPartition() is not implemented. 1 PySparkNotImplementedError: [NOT_IMPLEMENTED] localCheckpoint() is not implemented. 1 PySparkNotImplementedError: [NOT_IMPLEMENTED] sparkContext() is not implemented. 1 PySparkNotImplementedError: [NOT_IMPLEMENTED] toJSON() is not implemented. 1 PythonException: ArrowTypeError: ("Expected dict key of type str or bytes, got 'int'", 'Conversion ... 1 PythonException: AttributeError: 'NoneType' object has no attribute 'partitionId' 1 PythonException: AttributeError: 'list' object has no attribute 'x' 1 PythonException: AttributeError: 'list' object has no attribute 'y' 1 PythonException: TypeError: 'NoneType' object is not subscriptable 1 QueryExecutionException: Compute error: concat requires input of at least one array 1 QueryExecutionException: Json error: Not valid JSON: EOF while parsing a list at line 1 column 1 1 QueryExecutionException: Json error: Not valid JSON: expected value at line 1 column 2 1 SparkRuntimeException: External error: Arrow error: Invalid argument error: column types must match ... 1 SparkRuntimeException: External error: This feature is not implemented: physical plan is not yet imp... 1 SparkRuntimeException: Internal error: Failed to decode from base64: Invalid padding. 1 SparkRuntimeException: Internal error: UDF returned a different number of rows than expected. Expect... 1 SparkRuntimeException: Optimizer rule 'optimize_projections' failed 1 SparkRuntimeException: expand_wildcard_rule (+1) 1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (+1) 1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (+1) 1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (+1) 1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... 1 UnsupportedOperationException: Insert into not implemented for this table (+1) 1 UnsupportedOperationException: Physical plan does not support logical expression AggregateFunction(A... 1 UnsupportedOperationException: PlanNode::IsCached 1 UnsupportedOperationException: SQL show functions 1 UnsupportedOperationException: Unsupported statement: SHOW DATABASES 1 UnsupportedOperationException: bucketing 1 UnsupportedOperationException: call function 1 UnsupportedOperationException: deduplicate within watermark 1 UnsupportedOperationException: function exists 1 UnsupportedOperationException: function: aes_decrypt 1 UnsupportedOperationException: function: aes_encrypt 1 UnsupportedOperationException: function: array_insert 1 UnsupportedOperationException: function: array_sort 1 UnsupportedOperationException: function: arrays_zip 1 UnsupportedOperationException: function: bin 1 UnsupportedOperationException: function: bit_count 1 UnsupportedOperationException: function: bit_get 1 UnsupportedOperationException: function: bitmap_bucket_number 1 UnsupportedOperationException: function: bitmap_count 1 UnsupportedOperationException: function: bround 1 UnsupportedOperationException: function: conv 1 UnsupportedOperationException: function: convert_timezone 1 UnsupportedOperationException: function: csc 1 UnsupportedOperationException: function: date_from_unix_date 1 UnsupportedOperationException: function: dayofmonth 1 UnsupportedOperationException: function: dayofyear 1 UnsupportedOperationException: function: decode 1 UnsupportedOperationException: function: e 1 UnsupportedOperationException: function: elt 1 UnsupportedOperationException: function: format_string 1 UnsupportedOperationException: function: from_utc_timestamp 1 UnsupportedOperationException: function: getbit 1 UnsupportedOperationException: function: hour 1 UnsupportedOperationException: function: inline_outer 1 UnsupportedOperationException: function: java_method 1 UnsupportedOperationException: function: json_object_keys 1 UnsupportedOperationException: function: json_tuple 1 UnsupportedOperationException: function: last_day 1 UnsupportedOperationException: function: localtimestamp 1 UnsupportedOperationException: function: make_dt_interval 1 UnsupportedOperationException: function: make_interval 1 UnsupportedOperationException: function: make_timestamp 1 UnsupportedOperationException: function: make_timestamp_ltz 1 UnsupportedOperationException: function: make_timestamp_ntz 1 UnsupportedOperationException: function: make_ym_interval 1 UnsupportedOperationException: function: map_concat 1 UnsupportedOperationException: function: map_from_entries 1 UnsupportedOperationException: function: mask 1 UnsupportedOperationException: function: minute 1 UnsupportedOperationException: function: months_between 1 UnsupportedOperationException: function: next_day 1 UnsupportedOperationException: function: parse_url 1 UnsupportedOperationException: function: printf 1 UnsupportedOperationException: function: quarter 1 UnsupportedOperationException: function: reflect 1 UnsupportedOperationException: function: regexp_count 1 UnsupportedOperationException: function: regexp_extract 1 UnsupportedOperationException: function: regexp_extract_all 1 UnsupportedOperationException: function: regexp_instr 1 UnsupportedOperationException: function: regexp_substr 1 UnsupportedOperationException: function: schema_of_csv 1 UnsupportedOperationException: function: schema_of_json 1 UnsupportedOperationException: function: second 1 UnsupportedOperationException: function: sentences 1 UnsupportedOperationException: function: session_window 1 UnsupportedOperationException: function: sha 1 UnsupportedOperationException: function: sha1 1 UnsupportedOperationException: function: soundex 1 UnsupportedOperationException: function: spark_partition_id 1 UnsupportedOperationException: function: split 1 UnsupportedOperationException: function: stack 1 UnsupportedOperationException: function: str_to_map 1 UnsupportedOperationException: function: timestamp_micros 1 UnsupportedOperationException: function: timestamp_millis 1 UnsupportedOperationException: function: to_char 1 UnsupportedOperationException: function: to_csv 1 UnsupportedOperationException: function: to_json 1 UnsupportedOperationException: function: to_number 1 UnsupportedOperationException: function: to_unix_timestamp 1 UnsupportedOperationException: function: to_utc_timestamp 1 UnsupportedOperationException: function: to_varchar 1 UnsupportedOperationException: function: try_add 1 UnsupportedOperationException: function: try_aes_decrypt 1 UnsupportedOperationException: function: try_divide 1 UnsupportedOperationException: function: try_element_at 1 UnsupportedOperationException: function: try_multiply 1 UnsupportedOperationException: function: try_subtract 1 UnsupportedOperationException: function: try_to_binary 1 UnsupportedOperationException: function: try_to_number 1 UnsupportedOperationException: function: try_to_timestamp 1 UnsupportedOperationException: function: typeof 1 UnsupportedOperationException: function: unix_date 1 UnsupportedOperationException: function: unix_micros 1 UnsupportedOperationException: function: unix_millis 1 UnsupportedOperationException: function: unix_seconds 1 UnsupportedOperationException: function: url_decode 1 UnsupportedOperationException: function: url_encode 1 UnsupportedOperationException: function: weekday 1 UnsupportedOperationException: function: weekofyear 1 UnsupportedOperationException: function: width_bucket 1 UnsupportedOperationException: function: xpath 1 UnsupportedOperationException: function: xpath_boolean 1 UnsupportedOperationException: function: xpath_double 1 UnsupportedOperationException: function: xpath_float 1 UnsupportedOperationException: function: xpath_int 1 UnsupportedOperationException: function: xpath_long 1 UnsupportedOperationException: function: xpath_number 1 UnsupportedOperationException: function: xpath_short 1 UnsupportedOperationException: function: xpath_string 1 UnsupportedOperationException: handle analyze is local 1 UnsupportedOperationException: handle analyze semantic hash 1 UnsupportedOperationException: list functions 1 UnsupportedOperationException: summary 1 UnsupportedOperationException: unknown aggregate function: bitmap_or_agg 1 UnsupportedOperationException: unknown aggregate function: count_if 1 UnsupportedOperationException: unknown aggregate function: count_min_sketch 1 UnsupportedOperationException: unknown aggregate function: grouping_id 1 UnsupportedOperationException: unknown aggregate function: histogram_numeric 1 UnsupportedOperationException: unknown aggregate function: percentile 1 UnsupportedOperationException: unknown aggregate function: try_avg 1 UnsupportedOperationException: unknown aggregate function: try_sum 1 UnsupportedOperationException: unknown function: distributed_sequence_id 1 UnsupportedOperationException: unknown function: product 1 ValueError: The column label 'struct' is not unique. 1 internal error: unknown attribute in plan 237: a (-1) 0 AssertionError: "Database 'memory:722d1766-7cc5-453b-ac9f-ffe3d3dbdd00' dropped." does not match "in... (-1) 0 AssertionError: "Database 'memory:fa16e383-7d5c-4185-87cb-80e4b23926ed' dropped." does not match "in... (-1) 0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp6xtblmto' (-1) 0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpns2w629x' (-1) 0 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (-1) 0 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (-1) 0 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (-1) 0 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b... (-1) 0 UnsupportedOperationException: Physical plan does not support logical expression AggregateFunction(A... ```
Passed Tests Diff ```diff --- before 2024-10-25 03:53:59.300671898 +0000 +++ after 2024-10-25 03:53:59.432672310 +0000 @@ -81 +80,0 @@ -.venvs/test/lib/python3.11/site-packages/pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.withColumns ```