kzk / webhdfs

Ruby client for Hadoop WebHDFS

URI escape special characters in HDFS path #17

Closed by chaiken-verticloud 9 years ago

chaiken-verticloud commented 9 years ago

This pull request replaces pull request #16. It fixes the issue with using the obsolete URI.escape method, raised by @tagomoris.

We found that webhdfs failed when trying to stat the following files on one of Altiscale's clusters:

/hive/container_fact/${InputDir}
/hive/container_fact/${InputDir}/system=${sys}
/hive/container_fact/${InputDir}/system=${sys}/date=${day}
/hive/container_fact/${InputDir}/system=${sys}/date=${day}/data

The small change in this pull request solves this problem.
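The report does not say how the failure surfaced, so the following is only a sketch of one plausible cause, not a description of what webhdfs does internally. Ruby's standard URI parser rejects unescaped '{' and '}' in a path. The host name, port, and the helper `parseable_url?` are assumptions made up for illustration.

```ruby
require 'uri'

# Hypothetical helper: reports whether Ruby's URI parser accepts the string.
def parseable_url?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

# Hypothetical NameNode address; only the path shape comes from the report above.
base = 'http://namenode.example.com:50070/webhdfs/v1'

# Raw path containing '$', '{' and '}'.
puts parseable_url?(base + '/hive/container_fact/${InputDir}')

# Same path with the special characters percent-encoded.
puts parseable_url?(base + '/hive/container_fact/%24%7BInputDir%7D')
```

Under this assumption, the raw form is rejected before any request is sent, while the percent-encoded form parses normally.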

CGI.escape, which is sometimes used as an alternative to URI.escape, is not appropriate for this purpose because it converts '/' to '%2F', which destroys the path separators.
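To make the difference concrete, here is a minimal sketch that uses only the standard library. It is not the code from this pull request: `escape_hdfs_path` is a hypothetical helper that percent-encodes each path segment with `ERB::Util.url_encode` and leaves the '/' separators alone.

```ruby
require 'cgi'
require 'erb'

# Hypothetical helper (an illustration, not the patch itself): escape every
# segment on its own so that the '/' separators are never encoded.
# The -1 limit keeps empty segments, so a leading or trailing '/' survives.
def escape_hdfs_path(path)
  path.split('/', -1).map { |segment| ERB::Util.url_encode(segment) }.join('/')
end

path = '/hive/container_fact/${InputDir}/system=${sys}'

# CGI.escape encodes the whole string, including every '/'.
puts CGI.escape(path)

# Segment-wise escaping keeps the directory structure intact.
puts escape_hdfs_path(path)
```

With CGI.escape the result no longer contains any '/' at all, so the server would see a single opaque path component. The segment-wise version only rewrites '$', '{', '}' and '='.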

tagomoris commented 9 years ago

LGTM! Of course, tests would make this patch even better, but I know that the original code doesn't have any tests. :(

tagomoris commented 9 years ago

Thank you for the contribution!

tagomoris commented 9 years ago

Released as version 0.7.0.

chaiken-verticloud commented 9 years ago

Thanks for the merge, @tagomoris.

I'm a big believer in tests! For example, hdfsutils (which uses webhdfs) is a very new project, but already has quite a few unit tests: https://github.com/Altiscale/hdfsutils

This repo provides instructions for how to run the unit tests, and requires that pull requests from contributors include appropriate tests.