datafusion-contrib / datafusion-objectstore-s3

S3 as an ObjectStore for DataFusion
Apache License 2.0
59 stars 13 forks source link

Doc-test for `register_object_store` not working #39

Closed matthewmturner closed 2 years ago

matthewmturner commented 2 years ago

I cant get doc tests to pass using the object_store returned from ctx.object_store method (i get file not found error). Does anything seem off to you @seddonm1?

_Originally posted by @matthewmturner in https://github.com/datafusion-contrib/datafusion-objectstore-s3/pull/38#discussion_r800304204_

seddonm1 commented 2 years ago

it's not particularly beautiful but it does demonstrate how it works

seddonm1 commented 2 years ago

alternatively we move this logic in to the actual S3FileSystem and make sure all input URIs are s3::// prefixed

matthewmturner commented 2 years ago

alternatively we move this logic in to the actual S3FileSystem and make sure all input URIs are s3::// prefixed

That sounds reasonable. Just one question.

Does that mean users would need to input file paths with a full "s3://" + "BUCKET" prefix before the actual table path?

seddonm1 commented 2 years ago

yes so s3://data/alltypes_plain.snappy.parquet indicates scheme s3://, bucket data and file alltypes_plain.snappy.parquet. The reason your previous code was failing as ctx.object_store(uri)? matches on the scheme s3:// (including ://).

matthewmturner commented 2 years ago

Noted - thank you for explanation.

matthewmturner commented 2 years ago

@seddonm1 actually since only the ObjectStoreRegistry is using the uri - do we need to add anything in S3FileSystem? For example, we dont currently have "s3://" in any paths that we use for tests and have no issues. I think I just misunderstood how to use the ctx.object_store API.