gaul / s3proxy

Access other storage backends via the S3 API
Apache License 2.0

HDFS storage backend #267

Open gaul opened 6 years ago

gaul commented 6 years ago

Allow S3 applications to use HDFS. jclouds has a long-bitrotted example of this:

https://github.com/jclouds/jclouds-examples/tree/master/blobstore-hdfs

There are a couple of ways to do this, including using the Java bindings:

https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html

or the REST API:

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
https://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/
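
For the REST route, here is a minimal sketch of how S3-style requests might map onto WebHDFS request lines. The `/webhdfs/v1/<path>?op=...` URL form and the operation names (OPEN, CREATE, DELETE, GETFILESTATUS, LISTSTATUS) come from the WebHDFS documentation. The class name `WebHdfsRequests`, the `<root>/<bucket>/<key>` layout, the host name, and the port are assumptions made for illustration, not anything S3Proxy defines.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch only: maps S3-style operations to WebHDFS request lines.
final class WebHdfsRequests {
    // S3 operation -> (HTTP method, WebHDFS "op" query parameter).
    enum Op {
        GET_OBJECT("GET", "OPEN"),
        PUT_OBJECT("PUT", "CREATE"),
        DELETE_OBJECT("DELETE", "DELETE"),
        HEAD_OBJECT("GET", "GETFILESTATUS"),
        LIST_OBJECTS("GET", "LISTSTATUS");

        final String httpMethod;
        final String webHdfsOp;

        Op(String httpMethod, String webHdfsOp) {
            this.httpMethod = httpMethod;
            this.webHdfsOp = webHdfsOp;
        }
    }

    private final String host;
    private final int port;
    private final String root; // hypothetical HDFS directory that holds all buckets

    WebHdfsRequests(String host, int port, String root) {
        this.host = host;
        this.port = port;
        this.root = root;
    }

    // Builds the URI for one operation; pass an empty key to address the bucket itself.
    URI uriFor(Op op, String bucket, String key) {
        String path = "/webhdfs/v1" + root + "/" + bucket + (key.isEmpty() ? "" : "/" + key);
        try {
            // The multi-argument constructor percent-encodes illegal path characters.
            return new URI("http", null, host, port, path, "op=" + op.webHdfsOp, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build WebHDFS URI", e);
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders; use the NameNode HTTP address of your cluster.
        WebHdfsRequests requests = new WebHdfsRequests("namenode.example", 9870, "/s3proxy");
        for (Op op : Op.values()) {
            String key = op == Op.LIST_OBJECTS ? "" : "2024/cat.jpg";
            System.out.println(op.httpMethod + " " + requests.uriFor(op, "photos", key));
        }
    }
}
```

This covers only request construction. A real backend would also have to follow the redirect that WebHDFS issues for CREATE and OPEN, and LISTSTATUS lists a single directory, so an S3 prefix listing would likely need a directory walk on top of it.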

gkalele commented 1 year ago

@gaul I see the jclouds HDFS example link no longer works at HEAD but is still visible at 1.8.1. This would be a very useful backend for an S3 server to support. Any chance you are going to pick this up again anytime soon?

gaul commented 1 year ago

I think it would be better to abandon the jclouds HDFS code and write a small BlobStore implementation that calls the Hadoop Java bindings. I don't have the time to work on this myself, but I think you could hack up a minimal implementation in a few days. Integrating this into S3Proxy is probably best done without proper jclouds support, which raises questions about how provider registration would work, but that should be solvable.
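
One possible shape for this, as a rough sketch rather than a design: a small storage interface plus a registry that resolves a provider name before any jclouds lookup happens. Everything here is hypothetical, including `MiniBlobStore` (which is not the jclouds `BlobStore` interface), `BackendRegistry`, and the `provider` and `hdfs.root` property names. The backend is an in-memory stand-in so the example runs without Hadoop on the classpath; the comments mark where the Hadoop `FileSystem` calls would go.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical minimal storage interface; not the jclouds BlobStore API.
interface MiniBlobStore {
    void put(String container, String key, byte[] data);
    byte[] get(String container, String key); // returns null when the blob is absent
    boolean delete(String container, String key);
}

// Stand-in backend keyed by the path an HDFS-backed store might use.
final class InMemoryHdfsStandIn implements MiniBlobStore {
    private final String root;
    private final Map<String, byte[]> files = new TreeMap<>();

    InMemoryHdfsStandIn(String root) {
        this.root = root;
    }

    // Assumed layout: <root>/<container>/<key>.
    String pathFor(String container, String key) {
        return root + "/" + container + "/" + key;
    }

    @Override
    public void put(String container, String key, byte[] data) {
        // An HDFS version would write through FileSystem.create(path).
        files.put(pathFor(container, key), data.clone());
    }

    @Override
    public byte[] get(String container, String key) {
        // An HDFS version would read through FileSystem.open(path).
        byte[] data = files.get(pathFor(container, key));
        return data == null ? null : data.clone();
    }

    @Override
    public boolean delete(String container, String key) {
        // An HDFS version would call FileSystem.delete(path, false).
        return files.remove(pathFor(container, key)) != null;
    }
}

// Hypothetical registry: maps a provider name to a factory for its backend.
final class BackendRegistry {
    private final Map<String, Function<Properties, MiniBlobStore>> factories = new HashMap<>();

    void register(String provider, Function<Properties, MiniBlobStore> factory) {
        factories.put(provider, factory);
    }

    MiniBlobStore create(Properties config) {
        String provider = config.getProperty("provider");
        Function<Properties, MiniBlobStore> factory = factories.get(provider);
        if (factory == null) {
            // A real integration could fall back to the jclouds provider lookup here.
            throw new IllegalArgumentException("unknown provider: " + provider);
        }
        return factory.apply(config);
    }
}
```

Usage would look like `registry.register("hdfs", p -> new InMemoryHdfsStandIn(p.getProperty("hdfs.root", "/s3proxy")))` followed by `registry.create(config)`. Listing, metadata, and multipart upload are left out on purpose.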