ArroyoSystems / arroyo

Distributed stream processing engine in Rust
https://arroyo.dev
Apache License 2.0

Async UDFs on subqueries produce a runtime error #779

Closed: mwylde closed this issue 2 weeks ago

mwylde commented 2 weeks ago

A query like this:

select new_udf(out.seconds) from (
    select timestamp_struct as out
    from (
        select
            header.ts as timestamp_struct
        from nested2
    )
);

with any async UDF, for example:

use arroyo_udf_plugin::udf;

#[udf]
async fn new_udf(x: i64) -> i64 {
    x
}

Produces a runtime error:

2024-11-01T23:04:54.758632Z  INFO arroyo_controller::states: state transition job_id="job_q5WOEkQvWV" from="Scheduling" to="Running" duration_ms=167
2024-11-01T23:04:56.711565Z  INFO arroyo_api::jobs: Subscribed to output
2024-11-01T23:05:00.698013Z ERROR arroyo_server_common: panicked at crates/arroyo-worker/src/arrow/async_udf.rs:299:22:
called `Result::unwrap()` on an `Err` value: Execution("get indexed field seconds not found in struct") panic.file="crates/arroyo-worker/src/arrow/async_udf.rs" panic.line=299 panic.column=22
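
To make the shape of that panic message easier to read, here is a minimal, self-contained sketch of how unwrapping a failed field lookup yields text like the log above. Everything in it is hypothetical and written only for illustration: the `Error` enum, the `StructValue` alias, the `get_field` helper, and the `nanos` field are stand-ins, not Arroyo's or DataFusion's actual types, and this is not the code at `async_udf.rs:299`.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type that mirrors the `Execution("...")` shape in the log.
#[derive(Debug, PartialEq)]
enum Error {
    Execution(String),
}

/// Hypothetical stand-in for a struct value: field name -> integer value.
type StructValue = BTreeMap<String, i64>;

/// Look up a field by name, returning an error instead of panicking
/// when the struct has no field with that name.
fn get_field(value: &StructValue, name: &str) -> Result<i64, Error> {
    value.get(name).copied().ok_or_else(|| {
        Error::Execution(format!("get indexed field {name} not found in struct"))
    })
}

fn main() {
    // Assumption for illustration: the struct the expression is evaluated
    // against does not expose a `seconds` field.
    let mut row = StructValue::new();
    row.insert("nanos".to_string(), 0);

    // Handling the Result reports the problem without bringing down the process.
    match get_field(&row, "seconds") {
        Ok(v) => println!("seconds = {v}"),
        Err(e) => println!("lookup failed: {e:?}"),
    }

    // Calling `.unwrap()` on the same `Err` would instead panic with a message
    // beginning "called `Result::unwrap()` on an `Err` value:", followed by the
    // Debug form of the error, which is the pattern seen in the worker log.
}
```

The sketch only shows why the error surfaces as a panic rather than as a query failure: the lookup itself returns an `Err`, and the panic comes from unwrapping it.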