Closed goldmedal closed 2 weeks ago
take
Thank you @goldmedal
Hi @alamb,
I created a draft PR for this issue in #11035. After some experiments, I think passing only an ObjectStore isn't enough to create a TableProvider at runtime. We need to build the schema from a full SessionState.
Although there are many issues that need to be fixed, could you take a look at this PR to check if this idea makes sense when you're available?
Thanks.
I have finished the PR, but I think two follow-up issues need to be filed:
Is your feature request related to a problem or challenge?
I had some discussions with @alamb about supporting a dynamic file data source (select ... from 'data.parquet', like #4805) in the core, as mentioned in https://github.com/apache/datafusion/issues/4850#issuecomment-2142190951. However, after #10745 we found that it's not a good idea to move so many dependencies (e.g., S3-related ones) into the core crate.
Describe the solution you'd like
As @alamb proposed in https://github.com/apache/datafusion/pull/10745#issuecomment-2175817937, we can focus first on the logic that interprets table names as potential object store locations. Implement a struct DynamicTableProvider and a trait called UrlLookup to get an ObjectStore at runtime.
By default, DynamicTableProvider only supports querying local file paths (for example, file:///...). The implementation of dynamic file queries in datafusion-cli might also be based on DynamicTableProvider but will load the common object storage dependencies by default.
Describe alternatives you've considered
No response
Additional context
No response
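To make the proposal under "Describe the solution you'd like" more concrete, here is a rough, standalone sketch of how the lookup step might be split between the core and datafusion-cli. It does not use the DataFusion or object_store crates: StoreHandle is a stand-in for an ObjectStore, and the lookup method, LocalFileLookup, RegistryLookup, and resolve are hypothetical names chosen only for illustration. It covers only resolving a table name to a store, not schema inference or building a full TableProvider.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for an object store handle (assumption: the real design
/// would hand back something implementing `ObjectStore`).
#[derive(Debug, PartialEq)]
struct StoreHandle {
    scheme: String,
}

/// Hypothetical trait: resolve a table name that looks like a URL
/// to a store at runtime.
trait UrlLookup {
    fn lookup(&self, url: &str) -> Option<Arc<StoreHandle>>;
}

/// Hypothetical core default: only `file://` URLs are accepted,
/// so no cloud dependencies are needed.
struct LocalFileLookup;

impl UrlLookup for LocalFileLookup {
    fn lookup(&self, url: &str) -> Option<Arc<StoreHandle>> {
        url.starts_with("file://").then(|| {
            Arc::new(StoreHandle {
                scheme: "file".to_string(),
            })
        })
    }
}

/// Hypothetical CLI-side lookup: stores are registered per URL scheme
/// (for example "s3"), so the heavier dependencies live outside the core.
struct RegistryLookup {
    stores: HashMap<String, Arc<StoreHandle>>,
}

impl UrlLookup for RegistryLookup {
    fn lookup(&self, url: &str) -> Option<Arc<StoreHandle>> {
        // Split "scheme://rest" and look the scheme up in the registry.
        let (scheme, _rest) = url.split_once("://")?;
        self.stores.get(scheme).cloned()
    }
}

/// Sketch of the provider: it only delegates to whichever lookup it was given.
struct DynamicTableProvider {
    lookup: Box<dyn UrlLookup>,
}

impl DynamicTableProvider {
    /// Returns a store if the table name can be interpreted as a location.
    fn resolve(&self, table_name: &str) -> Option<Arc<StoreHandle>> {
        self.lookup.lookup(table_name)
    }
}
```

With this split, the core would construct DynamicTableProvider with LocalFileLookup, while datafusion-cli would pass a lookup that has the common object stores registered.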