dask / hdfs3

A wrapper for libhdfs3 to interact with HDFS from Python
http://hdfs3.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Add support for Namenode Federation and mountables #165

Closed: remysaissy closed this issue 6 years ago

remysaissy commented 6 years ago

Hi, in our Hadoop clusters, we use Federated namenodes with 3 namespaces (root, datasets and yarn).

In this setup, fs.defaultFS typically points to viewfs://root, and a set of fs.viewfs.mounttable.* entries maps paths to the individual namespaces.

It would be nice to benefit from viewfs:// support in hdfs3.
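
For readers unfamiliar with viewfs, here is a minimal sketch of how a client-side mount table of this kind could resolve a path to the NameNode that owns it. It is not hdfs3 or Hadoop code. The NameNode hostnames, ports, the /user link, and the helper names `load_mount_table` and `resolve` are all assumptions made for illustration; only the `fs.defaultFS` value and the `fs.viewfs.mounttable.` key prefix come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml fragment. Hostnames, ports and the /user
# link are made up; only the key naming follows the ViewFs convention.
CORE_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>viewfs://root</value></property>
  <property><name>fs.viewfs.mounttable.root.link./datasets</name>
            <value>hdfs://nn-datasets:8020/datasets</value></property>
  <property><name>fs.viewfs.mounttable.root.link./yarn</name>
            <value>hdfs://nn-yarn:8020/yarn</value></property>
  <property><name>fs.viewfs.mounttable.root.link./user</name>
            <value>hdfs://nn-root:8020/user</value></property>
</configuration>"""


def load_mount_table(xml_text, table="root"):
    """Collect {mount point: target URI} for one named mount table."""
    prefix = f"fs.viewfs.mounttable.{table}.link."
    links = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(prefix):
            links[name[len(prefix):]] = (prop.findtext("value") or "").strip()
    return links


def resolve(path, links):
    """Map a viewfs path to a concrete hdfs:// URI (longest mount point wins)."""
    for mount in sorted(links, key=len, reverse=True):
        if path == mount or path.startswith(mount.rstrip("/") + "/"):
            return links[mount] + path[len(mount):]
    raise LookupError(f"no mount point covers {path!r}")


links = load_mount_table(CORE_SITE)
print(resolve("/datasets/2018/part-0.parquet", links))
```

A client without viewfs support would have to do this lookup itself and then connect to the resolved NameNode directly, which is the gap this issue describes.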

martindurant commented 6 years ago

I seem to recall this coming up in an earlier issue. I don't know anything about viewfs and have no test system for it, so if passing the parameters corresponding to your setup doesn't do the right thing, I unfortunately don't think I can help. Arrow's native HDFS interface typically copes better with a wide range of configuration options, with the restriction that the Hadoop/Java libraries must be available on the node from which you intend to access the data.

remysaissy commented 6 years ago

Ok, I checked with pyarrow and it handles it properly. Thanks!
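
The thread does not show the pyarrow calls that were used. As a rough sketch only, this is one way a viewfs path might be listed with pyarrow's newer `pyarrow.fs` interface. This is an assumption: it may well differ from the API available when the issue was written. The helper names `viewfs_uri` and `list_viewfs` and the example path are hypothetical, and the listing only works on a node where the Hadoop/Java libraries and the cluster configuration are available, as noted above.

```python
def viewfs_uri(mount_table, path):
    """Build a viewfs:// URI for a path under a named mount table."""
    return f"viewfs://{mount_table}/{path.lstrip('/')}"


def list_viewfs(mount_table, path):
    """List a directory through viewfs using pyarrow (assumed setup).

    pyarrow is imported lazily so that this module can still be loaded
    on machines without pyarrow or the Hadoop client libraries.
    """
    from pyarrow import fs

    # from_uri returns a (filesystem, path) pair; resolving the viewfs
    # mount table is left to the Hadoop client configuration.
    filesystem, resolved = fs.FileSystem.from_uri(viewfs_uri(mount_table, path))
    return filesystem.get_file_info(fs.FileSelector(resolved))


# Hypothetical usage on a configured cluster node:
# for info in list_viewfs("root", "/datasets"):
#     print(info.path, info.size)
```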