apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

ListingTableUrl should allow direct construction #12581

Open rtyler opened 2 months ago

rtyler commented 2 months ago

Is your feature request related to a problem or challenge?

While investigating delta-io/delta-rs#2834, I discovered that the crux of the problem is that our (delta-rs) code has URLs coming from object_store, which are then passed back into DataFusion via read_parquet. read_parquet calls ListingTableUrl::parse on those strings, so the URLs are turned into strings and parsed again, "corrupting" them in the process.

Describe the solution you'd like

I think a ListingTableUrl constructor, or some other means of construction without a parse, would suffice for our use case. On the delta-rs side we have effectively all the information needed to construct a ListingTableUrl (URL, scheme, object store, etc.), but we are not able to build one ourselves and therefore have to jump through a couple of hoops to get ListingTableUrl::parse to produce the right thing.

Describe alternatives you've considered

Right now I'm passing str::replace(meta.location.as_ref(), "%", "%25") to read_parquet and it feels yucky :laughing: :naus
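For readers following along, here is a minimal std-only sketch of why escaping `%` as `%25` works. It assumes the parse step percent-decodes its input once. The `percent_decode` function below is a toy stand-in for that step, not DataFusion's actual code, and `escape_percent` is a hypothetical helper name for the workaround.

```rust
/// Convert one ASCII hex digit to its numeric value.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Toy percent-decoder standing in for the decode a URL parser performs.
/// Assumption: this only approximates what happens inside the parse step.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape is '%' followed by two hex digits.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        // Anything else is copied through unchanged.
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Hypothetical helper mirroring the workaround: escape every literal '%'
/// so that a single later decode yields the original string again.
fn escape_percent(location: &str) -> String {
    location.replace('%', "%25")
}
```

With a location such as `part=a%20b/file.parquet`, decoding the raw string changes the key, while decoding the escaped string gives back the original. A constructor that skips the parse would make the escape unnecessary.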

Additional context

No response

alamb commented 2 months ago

I agree -- being able to take a direct ListingTableUrl seems like a good idea to me

OussamaSaoudi commented 2 months ago

take