apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++] Executing Substrait plan containing round function causes segfault or error #34310

Closed thisisnic closed 1 year ago

thisisnic commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

I'm writing bindings for the R Substrait producer, and when I try to run a plan which uses the Substrait round() function, I get errors and segfaults.

If I specify the output type in the binding, I get a segfault:

# remotes::install_github("voltrondata/substrait-r")
library(substrait, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)

compiler <- tibble::tibble(x = c(1, 2.3, 3.4, 4.5)) %>%
  arrow_substrait_compiler()

compiler$.fns[["round"]] <- function(x, digits = 0) {
  substrait_call(
    "rounding.round",
    x,
    digits,
    .output_type = substrait_fp64(),
    .options = list(
      substrait$FunctionOption$create(name = "rounding", preference = "TIE_TO_EVEN")
    )
  )
}

compiler2 <- compiler %>%
  substrait_project(y = round(x)) 

compiler2$plan() |> 
  as.raw() |> 
  arrow::buffer() |> 
  arrow:::substrait__internal__SubstraitToJSON() |> 
  jsonlite::prettify(indent = 2)
#> {
#>   "extensionUris": [
#>     {
#>       "extensionUriAnchor": 1,
#>       "uri": "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
#>     },
#>     {
#>       "extensionUriAnchor": 2,
#>       "uri": "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
#>     }
#>   ],
#>   "extensions": [
#>     {
#>       "extensionFunction": {
#>         "functionAnchor": 2,
#>         "name": "round"
#>       }
#>     }
#>   ],
#>   "relations": [
#>     {
#>       "root": {
#>         "input": {
#>           "project": {
#>             "common": {
#> 
#>             },
#>             "input": {
#>               "read": {
#>                 "baseSchema": {
#>                   "names": [
#>                     "x"
#>                   ],
#>                   "struct": {
#>                     "types": [
#>                       {
#>                         "fp64": {
#>                           "nullability": "NULLABILITY_NULLABLE"
#>                         }
#>                       }
#>                     ]
#>                   }
#>                 },
#>                 "namedTable": {
#>                   "names": [
#>                     "named_table_1"
#>                   ]
#>                 }
#>               }
#>             },
#>             "expressions": [
#>               {
#>                 "scalarFunction": {
#>                   "functionReference": 2,
#>                   "outputType": {
#>                     "fp64": {
#>                       "nullability": "NULLABILITY_NULLABLE"
#>                     }
#>                   },
#>                   "arguments": [
#>                     {
#>                       "value": {
#>                         "selection": {
#>                           "directReference": {
#>                             "structField": {
#> 
#>                             }
#>                           },
#>                           "rootReference": {
#> 
#>                           }
#>                         }
#>                       }
#>                     },
#>                     {
#>                       "value": {
#>                         "literal": {
#>                           "fp64": 0
#>                         }
#>                       }
#>                     }
#>                   ],
#>                   "options": [
#>                     {
#>                       "name": "rounding",
#>                       "preference": [
#>                         "TIE_TO_EVEN"
#>                       ]
#>                     }
#>                   ]
#>                 }
#>               }
#>             ]
#>           }
#>         },
#>         "names": [
#>           "x",
#>           "y"
#>         ]
#>       }
#>     }
#>   ]
#> }
#> 

Here's the GDB output when I run compiler2 %>% collect() (i.e., when the plan is actually executed):

> compiler2 %>% collect()
[New Thread 0x7fffe9ffe640 (LWP 108889)]
/home/nic2/arrow/cpp/src/arrow/result.cc:28: ValueOrDie called on an error: NotImplemented: Function 'round_binary' has no kernel matching input types (double, float)
/home/nic2/arrow/cpp/src/arrow/compute/exec/expression.cc:548  call.function->DispatchBest(&types)
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow4util7CerrLog14PrintBackTraceEv+0x39)[0x7fffed68aa87]
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow4util7CerrLogD1Ev+0x5f)[0x7fffed68a9f7]
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow4util7CerrLogD0Ev+0x1c)[0x7fffed68aa1c]
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow4util8ArrowLogD1Ev+0x4b)[0x7fffed68a81f]
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow8internal14DieWithMessageERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x5c)[0x7fffed3eff2a]
/home/nic2/arrow_installed_version/lib/libarrow.so.1200(_ZN5arrow8internal17InvalidValueOrDieERKNS_6StatusE+0x8b)[0x7fffed3efff0]
/home/nic2/arrow_installed_version/lib/libarrow_substrait.so.1200(_ZNR5arrow6ResultINS_7compute10ExpressionEE10ValueOrDieEv+0x38)[0x7ffff1b44f38]
/home/nic2/arrow_installed_version/lib/libarrow_substrait.so.1200(_ZN5arrow6ResultINS_7compute10ExpressionEEptEv+0x1c)[0x7ffff1b3bbd0]
/home/nic2/arrow_installed_version/lib/libarrow_substrait.so.1200(_ZN5arrow6engine9FromProtoERKN9substrait3RelERKNS0_12ExtensionSetERKNS0_17ConversionOptionsE+0x29f0)[0x7ffff1b2d15c]
/home/nic2/arrow_installed_version/lib/libarrow_substrait.so.1200(+0x35d02c)[0x7ffff1b5d02c]
/home/nic2/arrow_installed_version/lib/libarrow_substrait.so.1200(_ZN5arrow6engine16DeserializePlansERKNS_6BufferERKSt8functionIFSt10shared_ptrINS_7compute16SinkNodeConsumerEEvEEPKNS0_19ExtensionIdRegistryEPNS0_12ExtensionSetERKNS0_17ConversionOptionsE+0x6d)[0x7ffff1b5d642]
/home/nic2/R/x86_64-pc-linux-gnu-library/4.1/arrow/libs/arrow.so(_Z22ExecPlan_run_substraitRKSt10shared_ptrIN5arrow7compute8ExecPlanEERKS_INS0_6BufferEE+0x158)[0x7ffff23ba0f8]
/home/nic2/R/x86_64-pc-linux-gnu-library/4.1/arrow/libs/arrow.so(_arrow_ExecPlan_run_substrait+0x85)[0x7ffff2383ac5]
/lib/libR.so(+0xf851a)[0x7ffff7af851a]
/lib/libR.so(+0x13c167)[0x7ffff7b3c167]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(+0x13d5ee)[0x7ffff7b3d5ee]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(+0x13d5ee)[0x7ffff7b3d5ee]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(Rf_eval+0x2af)[0x7ffff7b4ff5f]
/lib/libR.so(+0x153863)[0x7ffff7b53863]
/lib/libR.so(Rf_eval+0x57b)[0x7ffff7b5022b]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(+0x13d5ee)[0x7ffff7b3d5ee]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x1507bc)[0x7ffff7b507bc]
/lib/libR.so(Rf_eval+0x5f8)[0x7ffff7b502a8]
/lib/libR.so(+0x1984f8)[0x7ffff7b984f8]
/lib/libR.so(+0x134c07)[0x7ffff7b34c07]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(+0x13d5ee)[0x7ffff7b3d5ee]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(+0x197aa4)[0x7ffff7b97aa4]
/lib/libR.so(+0x197d9f)[0x7ffff7b97d9f]
/lib/libR.so(+0x1982a7)[0x7ffff7b982a7]
/lib/libR.so(+0x134c07)[0x7ffff7b34c07]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(Rf_eval+0x2af)[0x7ffff7b4ff5f]
/home/nic2/R/x86_64-pc-linux-gnu-library/4.1/magrittr/libs/magrittr.so(magrittr_pipe+0x542)[0x7ffff7316fe2]
/lib/libR.so(+0xf6a50)[0x7ffff7af6a50]
/lib/libR.so(+0x134e19)[0x7ffff7b34e19]
/lib/libR.so(Rf_eval+0x88)[0x7ffff7b4fd38]
/lib/libR.so(+0x151bef)[0x7ffff7b51bef]
/lib/libR.so(Rf_applyClosure+0x1a2)[0x7ffff7b52ae2]
/lib/libR.so(Rf_eval+0x2af)[0x7ffff7b4ff5f]
/lib/libR.so(Rf_ReplIteration+0x202)[0x7ffff7b84ce2]
/lib/libR.so(+0x185080)[0x7ffff7b85080]
/lib/libR.so(run_Rmainloop+0x50)[0x7ffff7b85140]
/usr/lib/R/bin/exec/R(main+0x1f)[0x55555555509f]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ffff7629d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ffff7629e40]
/usr/lib/R/bin/exec/R(_start+0x2e)[0x5555555550de]

Thread 1 "R" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737340704704) at ./nptl/pthread_kill.c:44
44  ./nptl/pthread_kill.c: No such file or directory.

If I remove the line which specifies the output type, I instead get the error:

> compiler2 %>% collect()
Error: NotImplemented: conversion to arrow::DataType from Substrait type 
/home/nic2/arrow/cpp/src/arrow/engine/substrait/expression_internal.cc:119  FromProto(scalar_fn.output_type(), ext_set, conversion_options)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/expression_internal.cc:317  DecodeScalarFunction(function_id, scalar_fn, ext_set, conversion_options)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:548  FromProto(expr, ext_set, conversion_options)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:157  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)

Component(s)

C++

westonpace commented 1 year ago

> I get errors and segfaults.

This should not segfault, even if given an invalid plan. Are you possibly calling ValueOrDie in R when running a plan? If not, this bears investigation on its own.

> If I specify the output type in the binding, I get a segfault:

It is not obvious, but the only valid kernels for round are:

round(i8, i32)
round(i16, i32)
round(i32, i32)
round(i64, i32)
round(fp32, i32)
round(fp64, i32)

In other words, the s argument (the number of digits) MUST be int32. It appears that you are passing in a float.
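To make the constraint concrete, here is a minimal standalone sketch of the signature list above as a lookup table. It does not call Arrow or Substrait; `RoundSignature`, `kRoundSignatures`, and `HasRoundSignature` are hypothetical names made up for illustration, and the sketch only models exact matches (whether the real dispatch attempts any implicit casts is not modelled here).

```cpp
#include <string>

// Hypothetical stand-in, not Arrow code: one entry per signature in the
// list above, written with Substrait type names.
struct RoundSignature {
  const char* value_type;   // type of the value being rounded
  const char* digits_type;  // type of the `s` (digits) argument
};

static const RoundSignature kRoundSignatures[] = {
    {"i8", "i32"},  {"i16", "i32"},  {"i32", "i32"},
    {"i64", "i32"}, {"fp32", "i32"}, {"fp64", "i32"},
};

// Returns true only for an exact match; no implicit casts are modelled.
bool HasRoundSignature(const std::string& value_type,
                       const std::string& digits_type) {
  for (const RoundSignature& sig : kRoundSignatures) {
    if (value_type == sig.value_type && digits_type == sig.digits_type) {
      return true;
    }
  }
  return false;
}
```

Under this model, `HasRoundSignature("fp64", "i32")` is true, while a floating-point digits argument such as `HasRoundSignature("fp64", "fp64")` finds no match, which is the situation the "no kernel matching input types" message describes.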

> If I remove the line which specifies the output type, I instead get the error:

The output type must always be specified. An error is correct behavior here.

thisisnic commented 1 year ago

Thanks for that @westonpace - I have now updated my function binding to cast s to an integer (in R), and it now works as expected. I'm still getting segfaults when using the previous version - will have another look to check what's going on here.

thisisnic commented 1 year ago

Hmm, looks like these lines from R are responsible:

https://github.com/apache/arrow/blob/6a4bcb36c091fea07c03c57b2e31dd29f9846ac2/r/src/arrow_types.h#L124-L128

Will look into how to make this fail more gracefully - thanks for pointing me in the right direction!
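
For readers following along, the difference between "die on error" and "fail gracefully" can be sketched with a simplified stand-in for a result type. This is not `arrow::Result` and not the code at the link above; `MiniResult` and `DescribeGracefully` are hypothetical names, and the sketch only assumes the behaviour visible in the GDB log ("ValueOrDie called on an error" followed by SIGABRT).

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a Result-like type; this is not
// arrow::Result, only a model of "either a value or an error message".
template <typename T>
class MiniResult {
 public:
  static MiniResult Ok(T value) {
    return MiniResult(true, std::move(value), "");
  }
  static MiniResult Error(std::string message) {
    return MiniResult(false, T(), std::move(message));
  }

  bool ok() const { return ok_; }
  const std::string& error() const { return error_; }

  // Aborts the whole process when called on an error.
  const T& ValueOrDie() const {
    if (!ok_) {
      std::cerr << "ValueOrDie called on an error: " << error_ << std::endl;
      std::abort();
    }
    return value_;
  }

 private:
  MiniResult(bool ok, T value, std::string error)
      : ok_(ok), value_(std::move(value)), error_(std::move(error)) {}

  bool ok_;
  T value_;
  std::string error_;
};

// Checks ok() before touching the value, so an error is handed back to the
// caller as a message instead of terminating the process.
std::string DescribeGracefully(const MiniResult<int>& result) {
  if (!result.ok()) return "Error: " + result.error();
  return "Value: " + std::to_string(result.ValueOrDie());
}
```

With this sketch, `DescribeGracefully(MiniResult<int>::Error("NotImplemented"))` returns an error string the caller can report, whereas calling `ValueOrDie()` directly on the same object would abort.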