apache / iceberg-go

Apache Iceberg - Go
https://iceberg.apache.org/
Apache License 2.0

Implement Other Filesystems Using Go CDK #92

Open srilman opened 5 months ago

srilman commented 5 months ago

Feature Request / Improvement

Can we add the Go CDK as a File IO option, particularly for remote write support?

For context, the Go CDK (https://gocloud.dev/) is a semi-official interface for interacting with various cloud services through common APIs. For the purposes of this library, the blob module (https://pkg.go.dev/gocloud.dev@v0.37.0/blob#pkg-overview) provides the following interfaces for object stores:

It supports the following storage backends:

I find this preferable to other options like Acero because it is actively maintained and has more frequent releases. It also seems to be more closely tied to the Go team.

srilman commented 5 months ago

In addition, this library is one of the few available that supports AWS SDK v2 with write support. The library we are using right now, S3IOFs, doesn't, for example, and the other libraries I've looked at (like VFS) only support v1.

srilman commented 5 months ago

@zeroshade I think I have something working on my end. Once I get the green light, I'm happy to open a PR.

zeroshade commented 5 months ago

@srilman I'd be happy to review a PR for this, particularly if it simplifies the file io stuff while getting us more storage back ends. I'm currently traveling for a conference, but I'll be able to review the PR next week or the week after. Thanks!