kzk / webhdfs

Ruby client for Hadoop WebHDFS

Gracefully handle failover (WIP) #35

Closed blaenk closed 5 years ago

blaenk commented 5 years ago

WIP PR: This is functionally complete and all tests pass. However, it was mostly a quick-and-dirty 1:1 extraction from Nim, so, time permitting, the focus now is on making things as simple as they reasonably can be given that the code lives inside this gem.


This PR extracts extra WebHDFS functionality from Nim so that other projects may benefit. In particular, it integrates retry logic that aims to gracefully handle certain errors and conditions, such as high-availability failover, and it adds or augments certain HDFS operations.
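To illustrate the kind of retry logic described above, here is a minimal sketch of failing over between namenodes. It is not the code from this PR: `StandbyError`, `with_failover`, the `max_attempts` parameter, and the hostnames are all hypothetical names chosen for the example, and the block stands in for a real WebHDFS request.

```ruby
# Hypothetical sketch: none of these names come from this PR or the webhdfs gem.
class StandbyError < StandardError; end

# Yields each namenode in turn, moving on to the next host whenever the
# current one signals (via StandbyError) that it is in standby.
# Gives up and re-raises once max_attempts calls have failed.
def with_failover(namenodes, max_attempts: namenodes.size * 2)
  attempts = 0
  hosts = namenodes.cycle # endless enumerator: nn1, nn2, nn1, ...
  begin
    attempts += 1
    yield hosts.next
  rescue StandbyError
    retry if attempts < max_attempts
    raise
  end
end

# Usage: the block stands in for a real request against the given host.
active = "nn2.example.com"
calls = []
result = with_failover(["nn1.example.com", "nn2.example.com"]) do |host|
  calls << host
  raise StandbyError, "#{host} is in standby" unless host == active
  "listed via #{host}"
end
```

Here the first call hits the standby host, the error is rescued, and the second attempt succeeds against the other namenode. A real implementation would likely also need backoff between attempts and handling for errors other than standby.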

Tests

The tests were also migrated over and all pass. They are written in rspec, whereas this project's original tests use the now-deprecated test_unit. Both frameworks are in use for now to avoid rewriting things, but the original tests seem trivial enough that they may be migrated to rspec to keep things consistent and simpler.

Tests expect these environment variables:

| Variable | Purpose |
| --- | --- |
| API_HOST | The JMX API hostname |
| DEFAULT_NAMENODE | The default namenode hostname |
| TEST_DIR | The path scope in which tests will be run |
| KERBEROS | Whether Kerberos authentication is required |
| KEYTAB_PATH | The path to the keytab |

For example:

$ KERBEROS=true \
  API_HOST='http://jmx.site.com' \
  DEFAULT_NAMENODE='hdfs.namenode.com' \
  KEYTAB_PATH=~/someone.keytab \
  TEST_DIR=/user/someone/test_dir/ \
  bundle exec rake test
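As a rough illustration of how a spec helper might consume these variables, here is a small sketch. The PR does not show how the tests actually read them, so `test_config` is a hypothetical helper, and two behaviors are assumptions: that `KERBEROS` is compared against the string `"true"`, and that `KEYTAB_PATH` is only required when Kerberos is enabled.

```ruby
# Hypothetical helper: the PR does not show how the specs read these values.
# Accepts any object that responds to #fetch like ENV (a Hash works too).
def test_config(env = ENV)
  # Assumption: Kerberos is enabled only by the literal string "true".
  kerberos = env.fetch("KERBEROS", "false") == "true"
  config = {
    api_host: env.fetch("API_HOST"),                 # raises KeyError if unset
    default_namenode: env.fetch("DEFAULT_NAMENODE"),
    test_dir: env.fetch("TEST_DIR"),
    kerberos: kerberos,
  }
  # Assumption: the keytab is only needed when Kerberos is enabled.
  config[:keytab_path] = env.fetch("KEYTAB_PATH") if kerberos
  config
end

# Usage with an explicit hash instead of the real environment.
config = test_config(
  "KERBEROS" => "true",
  "API_HOST" => "http://jmx.site.com",
  "DEFAULT_NAMENODE" => "hdfs.namenode.com",
  "KEYTAB_PATH" => "/home/someone/someone.keytab",
  "TEST_DIR" => "/user/someone/test_dir/"
)
```

Using `fetch` rather than `ENV[...]` makes a missing variable fail immediately with a `KeyError` instead of surfacing later as a `nil` inside a test.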

blaenk commented 5 years ago

Opened this on the wrong repo. This doesn't seem to be maintained anymore, so I'm closing, but feel free to reopen if you find use for it.