kzk / webhdfs

Ruby client for Hadoop WebHDFS

HA Namenode support? #30

Open sherzberg opened 8 years ago

sherzberg commented 8 years ago

We have two namenodes for high availability and get a StandbyException when the client talks to the non-active (standby) namenode, which makes sense.

WebHDFS::IOError: {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby"}}
    from /usr/local/lib/ruby/gems/2.3.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:401:in `request'
    from /usr/local/lib/ruby/gems/2.3.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:275:in `operate_requests'
    from /usr/local/lib/ruby/gems/2.3.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:138:in `list'
    from (irb):5
    from /usr/local/bin/irb:11:in `<main>'

However, is it up to the client to figure out which namenode is active and use that one as the host in this library? Is there a way to specify multiple host addresses for this situation?

sherzberg commented 8 years ago

Just for reference, this is what we are using webhdfs for, along with a PR that works around this issue: https://github.com/logstash-plugins/logstash-output-webhdfs/pull/18

tagomoris commented 8 years ago

There's no way to specify 2 or more host addresses right now. Pull requests are welcome :)
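
Until the library supports multiple hosts, the failover can be done on the caller's side. The following is a minimal sketch, not part of the webhdfs gem: `with_active_namenode` and the `build_client` factory are hypothetical names introduced here, and the host names in the usage comment are placeholders. It assumes the standby node's error carries the text `StandbyException` in the exception message, as in the trace above.

```ruby
# Hypothetical helper, not part of the webhdfs gem: try each namenode in turn
# and move on to the next one when the current node answers as standby.
#
# hosts        - array of namenode host names, in the order to try them
# build_client - callable that turns a host name into a client object
# The block receives the client and performs the actual operation.
def with_active_namenode(hosts, build_client)
  last_error = nil
  hosts.each do |host|
    begin
      return yield(build_client.call(host))
    rescue StandardError => e
      # The remote error arrives as JSON text in the exception message
      # (see the trace above), so match on the exception name in that text.
      # Anything that is not a standby response is re-raised unchanged.
      raise unless e.message.include?('StandbyException')
      last_error = e
    end
  end
  # Every host reported standby, or the host list was empty.
  raise(last_error || ArgumentError.new('no namenode hosts given'))
end

# Usage with the gem (host names and port are placeholders):
#   require 'webhdfs'
#   build = ->(host) { WebHDFS::Client.new(host, 50070) }
#   with_active_namenode(%w[nn1.example.com nn2.example.com], build) do |client|
#     client.list('/')
#   end
```

This sketch only handles the standby response. A namenode that is down entirely would raise a connection error instead, which is re-raised here rather than retried on the next host.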

infinite-monkeys commented 4 months ago

Another solution for people who already have something like HAProxy set up is to point webhdfs at the proxy and have HAProxy monitor the two namenodes and route requests to the active one.
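
Whatever does the monitoring needs a way to tell the active namenode from the standby. One signal that could be used (an assumption here, not something stated in this thread) is the NameNode JMX servlet, whose `NameNodeStatus` bean reports a `State` of `active` or `standby`. Below is a minimal sketch of that check; `namenode_active?` is a hypothetical helper name, and the URL in the usage comment is a placeholder.

```ruby
require 'json'

# Hypothetical helper: returns true when a NameNode JMX response body reports
# the "active" HA state.
# Assumes the response has the shape {"beans":[{"State":"active", ...}]}.
def namenode_active?(jmx_body)
  parsed = JSON.parse(jmx_body)
  return false unless parsed.is_a?(Hash)

  Array(parsed['beans']).any? do |bean|
    bean.is_a?(Hash) && bean['State'] == 'active'
  end
rescue JSON::ParserError
  # An unparseable body is treated as "not active".
  false
end

# Usage (host and port are placeholders):
#   require 'net/http'
#   url = 'http://nn1.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
#   puts namenode_active?(Net::HTTP.get(URI(url)))
```

The parsing is kept separate from the HTTP request so the same check can be reused in a script, a proxy health check wrapper, or the client itself.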