apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

[Ballista] Fix regression in `roundtrip_logical_plan_custom_ctx` test #481

Open andygrove opened 2 years ago

andygrove commented 2 years ago

Describe the bug: PR https://github.com/apache/arrow-datafusion/pull/2537 fixed an issue where the CSV scan methods serialized the full URI instead of the path, which was inconsistent with how the other scans (Parquet, Avro, JSON) work. Making this consistent caused a regression in `roundtrip_logical_plan_custom_ctx`, so the test is ignored for now.

We should re-enable this test.

To Reproduce: Run the `roundtrip_logical_plan_custom_ctx` test.

Expected behavior: Functionality should be consistent between file types.

Additional context: None

andygrove commented 2 years ago

@thinkharderdev @mingmwang I could use some help with this if you have time to look.

thinkharderdev commented 2 years ago

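Yeah, the issue here is that the serde logic uses the scheme of the file URI to resolve the ObjectStore, so we were relying on the test scheme to resolve the TestObjectStore. Once only the path is serialized, the scheme is gone and that lookup no longer finds the test store. Why would we not serialize the entire file URI?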
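To illustrate the failure mode, here is a minimal sketch of scheme-based store resolution. It is not the DataFusion or Ballista code: `StoreKind`, `StoreRegistry`, `path_only`, the `test` scheme name, the example URIs, and the fallback to the local filesystem for bare paths are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an object store implementation
/// (not the real DataFusion `ObjectStore` trait).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StoreKind {
    LocalFileSystem,
    Test,
}

/// Hypothetical registry that maps a URI scheme to a store.
pub struct StoreRegistry {
    stores: HashMap<String, StoreKind>,
}

impl StoreRegistry {
    pub fn new() -> Self {
        let mut stores = HashMap::new();
        stores.insert("file".to_string(), StoreKind::LocalFileSystem);
        // Assumed scheme name for the test store.
        stores.insert("test".to_string(), StoreKind::Test);
        Self { stores }
    }

    /// Resolve a store from a URI by looking at its scheme.
    /// Assumption: a bare path with no scheme falls back to the local filesystem.
    pub fn resolve(&self, uri: &str) -> Option<StoreKind> {
        match uri.split_once("://") {
            Some((scheme, _path)) => self.stores.get(scheme).copied(),
            None => Some(StoreKind::LocalFileSystem),
        }
    }
}

/// Drop the scheme, mimicking a serializer that writes only the path.
pub fn path_only(uri: &str) -> &str {
    uri.split_once("://").map(|(_, path)| path).unwrap_or(uri)
}
```

With the full URI, `resolve("test:///data/example.csv")` finds the test store. After a round trip through `path_only`, the same scan resolves to the local filesystem instead, which is the kind of mismatch that would make a roundtrip test with a custom context fail.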