delta-incubator / delta-kernel-rs

A native Delta implementation for integration with any query engine
Apache License 2.0

Support HDFS via hdfs-native package #266

Closed santosh-d3vpl3x closed 2 months ago

santosh-d3vpl3x commented 3 months ago

delta-rs recently got initial support for hdfs via this PR.

It would be great if we could do the same for delta-kernel-rs.

DuckDB recently introduced support for Delta via the kernel implementation, but it can't be used with HDFS because this integration is missing.

Tagging @kimahriman to see if they can help out here!

nicklan commented 3 months ago

Yeah, seems like we could add this since https://github.com/datafusion-contrib/hdfs-native-object-store just makes hdfs look like an object_store, thanks!

Not sure when I will have time to work on this, but if someone wants to make a PR I'd be open to it, and I will find time at some point.

SchutteJan commented 3 months ago

@nicklan I would be interested in contributing to this, but I am completely new to this project.

Let me know if my understanding of the problem is correct. As I see it, there are two ways to go about it:

santosh-d3vpl3x commented 3 months ago

The 2nd way would be the most straightforward and preferable, I believe.

Kimahriman commented 2 months ago

Sorry, GitHub randomly decided to stop sending me email notifications. delta-rs just uses a dyn ObjectStore, so it was fairly easy to integrate. I haven't looked much at this repo yet to see how it handles object stores, but hopefully it is straightforward!

Kimahriman commented 2 months ago

Ah, it looks like object_store::parse_url_opts is just used directly, so it might be a little more work to integrate. delta-rs already has custom handling of schemes, so it was a little more straightforward there. Here we'd have to do some upfront parsing of the scheme before forwarding to parse_url_opts.
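A rough sketch of that upfront scheme check, using only the standard library. The names StoreBackend, scheme_of, and select_backend are hypothetical and are not part of delta-kernel-rs, object_store, or hdfs-native-object-store. Treating viewfs the same as hdfs is also an assumption. The real store construction is only indicated in comments.

```rust
/// Hypothetical marker for which store builder a table URL is routed to.
#[derive(Debug, PartialEq)]
enum StoreBackend {
    /// Would be built by the hdfs-native-object-store crate (not called here).
    HdfsNative,
    /// Would be forwarded unchanged to object_store::parse_url_opts (not called here).
    ObjectStoreDefault,
}

/// Return the lowercased scheme of a URL-like string, i.e. the part before "://".
/// Returns None when there is no "://" separator or the scheme is empty.
fn scheme_of(url: &str) -> Option<String> {
    let (scheme, _rest) = url.split_once("://")?;
    if scheme.is_empty() {
        return None;
    }
    Some(scheme.to_ascii_lowercase())
}

/// Decide the backend before any store is constructed.
/// Assumption: "hdfs" and "viewfs" both go to the HDFS store; every other
/// scheme falls through to the default object_store handling.
fn select_backend(url: &str) -> Option<StoreBackend> {
    match scheme_of(url)?.as_str() {
        "hdfs" | "viewfs" => Some(StoreBackend::HdfsNative),
        _ => Some(StoreBackend::ObjectStoreDefault),
    }
}
```

With this sketch, select_backend("hdfs://namenode:8020/tables/t") yields Some(StoreBackend::HdfsNative), while an s3:// URL yields Some(StoreBackend::ObjectStoreDefault) and would be passed on to parse_url_opts as before.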

SchutteJan commented 2 months ago

@nicklan I've marked my PR as ready, can you take a look?