Open matthewmturner opened 2 years ago
@seddonm1 @yjshen @houqp FYI - in case you have thoughts on this.
Actually, I'm not sure how well those parameters in register_object_store will generalize to other ObjectStore implementations besides S3, so now I'm not sure a general function like that could be used. Maybe my objective could be achieved with some command-line options instead. For example:
Default credentials
$ datafusion-cli --object-store s3
Minio
$ datafusion-cli --object-store s3 --access-key KEY --secret-key ABC --provider PROVIDER --endpoint ENDPOINT
@houqp @yjshen @seddonm1 do you have a view on whether ObjectStore registration can be done via SQL or if this should be part of datafusion-cli?
I think it can be done through both, because the secret key credentials and endpoint can be provided through environment variables as well. In that case, the user will only need to provide the S3 path in the SQL query.
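To make the "both" idea concrete, here is a minimal sketch of how explicit command-line values could take precedence over environment variables. It is only an illustration, not anything that exists in datafusion-cli: `S3Settings` and `resolve_s3_settings` are hypothetical names, the option keys mirror the flags proposed earlier in this thread, and using `AWS_ENDPOINT` as the endpoint variable is an assumption (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard AWS names).

```rust
use std::collections::HashMap;

/// Hypothetical settings for an S3-compatible store; not a DataFusion type.
#[derive(Debug, PartialEq)]
struct S3Settings {
    access_key: Option<String>,
    secret_key: Option<String>,
    endpoint: Option<String>,
}

/// Prefer an explicit CLI value; otherwise fall back to the environment.
/// The flag names follow the options proposed in this thread (assumed).
fn resolve_s3_settings(
    cli: &HashMap<String, String>,
    env: &HashMap<String, String>,
) -> S3Settings {
    let pick = |flag: &str, var: &str| cli.get(flag).or_else(|| env.get(var)).cloned();
    S3Settings {
        access_key: pick("access-key", "AWS_ACCESS_KEY_ID"),
        secret_key: pick("secret-key", "AWS_SECRET_ACCESS_KEY"),
        // AWS_ENDPOINT is an assumed variable name for a custom endpoint.
        endpoint: pick("endpoint", "AWS_ENDPOINT"),
    }
}

fn main() {
    // No CLI options given: everything comes from the process environment.
    let env: HashMap<String, String> = std::env::vars().collect();
    let cli: HashMap<String, String> = HashMap::new();
    let settings = resolve_s3_settings(&cli, &env);
    // Report only whether credentials were found, never their values.
    println!("access key set: {}", settings.access_key.is_some());
    println!("secret key set: {}", settings.secret_key.is_some());
    println!("endpoint: {:?}", settings.endpoint);
}
```

With this kind of fallback, a Minio user would pass the endpoint explicitly, while a user with default AWS credentials in the environment would only need the S3 path in the query.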
@matthewmturner any progress on this one? If you are not working on it still, I would like to take a stab at it
I think this repo is largely deprecated in favour of https://github.com/apache/arrow-rs/tree/master/object_store
@turbo1912 Haven't been able to work on this, go for it!
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I am working towards making datafusion-cli a powerful tool to use locally for doing ad-hoc data analysis. The first step for that was #1875, which enables defining a local "database" that runs on startup with a .datafusionrc file. As a second step, I would like to be able to connect to object stores, such as S3, just from SQL. That will of course require adding s3 as a feature to datafusion-cli, but that feature is useless unless ObjectStores can be registered. Below is the current behaviour:
Describe the solution you'd like
I would like to be able to register an ObjectStore just from SQL. Given that ObjectStore is a DataFusion concept, I was thinking that we can add a function such as register_object_store, rather than having a SQL statement. So it would look something like:
Default credentials
Minio
Describe alternatives you've considered
Additional context