Open mwylde opened 10 months ago
Hi @mwylde, I'd also like to give this one a shot. Could you point me to how to get started? In particular, where is the SQL planning done? IIUC, planning is delegated to DataFusion?
@mwylde I'm not able to reproduce this specific panic (RelativeUrlWithoutBase) exactly, but I did notice a few related issues when trying to reproduce it using ghcr.io/arroyosystems/arroyo-single:0.10-dev:
1) Pipelines/previews succeed even if the path for a filesystem source created via SQL does not exist. I'd expect some sort of failure when the path does not exist.
2) Path "file:///" for filesystem source created with SQL panics during query execution with ERROR arroyo_server_common: panicked at crates/arroyo-connectors/src/filesystem/source.rs:69:17: could not get next path: Generic LocalFileSystem error: Unable to walk dir: File system loop found: /sys/class/vtconsole/vtcon0/subsystem points to an ancestor /sys/class/vtconsole panic.file="crates/arroyo-connectors/src/filesystem/source.rs" panic.line=69 panic.column=17
panicked at crates/arroyo-connectors/src/filesystem/source.rs:69:17: could not get next path: Generic s3 error: Couldn't find AWS credentials in environment, credentials file, or IAM role. panic.file="crates/arroyo-connectors/src/filesystem/source.rs" panic.line=69 panic.colum
panicked at crates/arroyo-worker/src/lib.rs:622:14: called `Result::unwrap()` on an `Err` value: SendError { .. } panic.file="crates/arroyo-worker/src/lib.rs" panic.line=622 panic.column=14
What do you think about also running the connection test() logic, which already runs when connectors are created in the UI, when sources are planned during the scheduling phase? If each connector properly implements test(), that should solve all of the problems above.
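To make the idea concrete, here is a rough sketch in Rust of the kind of check a filesystem connector could run at planning time, returning an error instead of panicking later on the worker. Everything in it is hypothetical: `validate_source_path` and `SourceValidationError` are names made up for illustration and are not Arroyo's actual API. It uses only the standard library, assumes Unix-style paths, and only inspects local `file://` paths; remote schemes such as `s3://` would still need a credentials/listing check in a real test().

```rust
use std::path::Path;

/// Hypothetical error type for planning-time source validation
/// (illustrative only, not part of Arroyo).
#[derive(Debug, PartialEq)]
pub enum SourceValidationError {
    /// The path has no `scheme://` prefix, e.g. a bare relative path.
    MissingScheme(String),
    /// The path is the filesystem root, which would walk all of `/`.
    RootPath,
    /// The local path does not exist.
    NotFound(String),
}

/// Hypothetical planning-time check for a filesystem source path.
/// Returns `Err` instead of panicking so the planner can surface the problem.
pub fn validate_source_path(raw: &str) -> Result<(), SourceValidationError> {
    // Require an explicit, non-empty scheme such as `file://` or `s3://`.
    let (scheme, rest) = match raw.split_once("://") {
        Some((scheme, rest)) if !scheme.is_empty() => (scheme, rest),
        _ => return Err(SourceValidationError::MissingScheme(raw.to_string())),
    };

    if scheme == "file" {
        // For `file:///tmp/data`, `rest` is the absolute path `/tmp/data`.
        let local = Path::new(rest);
        if local == Path::new("/") {
            return Err(SourceValidationError::RootPath);
        }
        if !local.exists() {
            return Err(SourceValidationError::NotFound(rest.to_string()));
        }
    }

    // Other schemes are accepted here; checking them needs network access
    // and credentials, which this sketch does not attempt.
    Ok(())
}
```

With a check like this, `validate_source_path("file:///")` and a path to a missing directory would both be rejected during planning, which covers cases 1 and 2 above for local paths.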
For filesystem sources created via SQL, we do not validate them as part of the SQL planning process. This causes panics at runtime when the source is instantiated on the worker: