risingwavelabs / arrow-udf

A User-Defined Function Framework for Apache Arrow.
Apache License 2.0

support user-defined types #2

Closed wangrunji0408 closed 3 months ago

wangrunji0408 commented 4 months ago

Sometimes users want to return a complex struct from their functions. Currently, the user has to encode the struct type in the function signature and return the struct fields as a tuple. For instance:

#[function("key_value(varchar) -> struct<key:varchar,value:varchar>")]
fn key_value(kv: &str) -> Option<(&str, &str)> {
    kv.split_once('=')
}

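This inline approach only works well for simple struct types. It quickly becomes unwieldy as the struct grows more complex, because the whole type has to be spelled out in the signature string and the fields are returned as an unnamed tuple.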

Therefore, we should allow users to define their types separately as a Rust struct. We will introduce a derive macro that inspects the struct definition and generates code for retrieving values from and storing them into Arrow arrays.

The proposed user interface would look like this:

#[derive(StructType)]
struct KeyValue<'a> {
    key: &'a str,
    value: &'a str,
}

#[function("key_value(varchar) -> KeyValue")]
fn key_value(kv: &str) -> Option<KeyValue<'_>> {
    let (key, value) = kv.split_once('=')?;
    Some(KeyValue { key, value })
}
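
To make the "retrieving from and storing into arrow arrays" part more concrete, here is a minimal hand-written sketch of the kind of code such a derive macro might generate for KeyValue. It uses only the standard library: `KeyValueColumns` is a hypothetical stand-in for a pair of Arrow string columns, and the `append` and `get` method names are assumptions made for illustration, not the actual arrow-udf API.

```rust
// Hypothetical stand-in for two nullable Arrow string columns. A real
// implementation would target Arrow's struct array and builder types.
#[derive(Default, Debug)]
struct KeyValueColumns {
    key: Vec<Option<String>>,
    value: Vec<Option<String>>,
}

#[derive(Debug, PartialEq)]
struct KeyValue<'a> {
    key: &'a str,
    value: &'a str,
}

// Hand-written version of what a derive macro could generate (assumed shape).
impl<'a> KeyValue<'a> {
    /// Store one optional row field by field; `None` becomes a null row.
    fn append(row: Option<KeyValue<'_>>, cols: &mut KeyValueColumns) {
        match row {
            Some(kv) => {
                cols.key.push(Some(kv.key.to_owned()));
                cols.value.push(Some(kv.value.to_owned()));
            }
            None => {
                cols.key.push(None);
                cols.value.push(None);
            }
        }
    }

    /// Retrieve row `i`, borrowing the strings from the columns.
    /// Returns `None` for a null row or an out-of-range index.
    fn get(cols: &'a KeyValueColumns, i: usize) -> Option<KeyValue<'a>> {
        Some(KeyValue {
            key: cols.key.get(i)?.as_deref()?,
            value: cols.value.get(i)?.as_deref()?,
        })
    }
}

// Same body as the proposed user function, minus the #[function] attribute.
fn key_value(kv: &str) -> Option<KeyValue<'_>> {
    let (key, value) = kv.split_once('=')?;
    Some(KeyValue { key, value })
}

fn main() {
    let mut cols = KeyValueColumns::default();
    for input in ["a=1", "no separator"] {
        KeyValue::append(key_value(input), &mut cols);
    }
    println!("{:?}", KeyValue::get(&cols, 0));
    println!("{:?}", KeyValue::get(&cols, 1));
}
```

The point of the sketch is that each struct field maps to one child column, so the per-field push and read logic is mechanical and can be derived from the struct definition alone.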

cc @TennyZhuang

wangrunji0408 commented 3 months ago

Implemented in arrow-udf v0.2.