colinmarc / impala-ruby

an impala client for ruby
MIT License

Add ability to connect with SASL/GSSAPI (kerberos). #17

Closed grisha closed 2 years ago

grisha commented 9 years ago

This patch adds the ability to connect to a Kerberos-enabled Impala. Tested against Impala 1.3.1 (CDH 5.0.3).

It does add the gssapi gem as a dependency, which might be a nuisance in a non-Kerberos environment. I'm not sure what to do about that.

colinmarc commented 9 years ago

omg amazing! I will review shortly

colinmarc commented 9 years ago

Left some initial comments, and I can add more later. Please add some tests, too - you can have them skipped if no server speaking Kerberos is available (see the way I make the integration tests skip if there's no Impala).
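For illustration, one way to skip such tests could look like the sketch below. It assumes Minitest and a hypothetical `IMPALA_KERBEROS_HOST` environment variable; neither the variable nor the `kerberos_host` helper comes from this repository or the patch.

```ruby
require 'minitest/autorun'

# Hypothetical environment variable naming a Kerberos-enabled Impala host.
KERBEROS_HOST_VAR = 'IMPALA_KERBEROS_HOST'

# Returns the configured host, or nil when the variable is unset or blank.
def kerberos_host(env = ENV)
  host = env[KERBEROS_HOST_VAR].to_s.strip
  host.empty? ? nil : host
end

class TestKerberosConnection < Minitest::Test
  def setup
    @host = kerberos_host
    # Minitest reports the test as skipped rather than failed.
    skip "set #{KERBEROS_HOST_VAR} to run Kerberos tests" unless @host
  end

  def test_host_is_configured
    # A real test would open a Kerberos connection to @host here.
    refute_empty @host
  end
end
```

With the variable unset, the run reports a skip instead of a failure, so CI without a Kerberos server stays green.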

colinmarc commented 9 years ago

Also, you can see the relevant test failures if you click through to Travis.

grisha commented 9 years ago

Thanks @colinmarc, good comments. I'll address them within the next couple of weeks, and by then it should be production-tested a little better. The impala_sasl_client_transport.rb file is actually a verbatim copy of sasl_client_transport.rb from the rbhive gem, which I kerberized a little while ago. See my blog post for details. But using sasl_client_transport.rb against Impala turned up issues, so I had to tweak it and rename it to impala_sasl_client_transport.rb so that it doesn't clash with the rbhive version. (We use both the impala and rbhive gems in our apps.)

I think this PR could stay unmerged for a while. For anyone who needs this functionality urgently (like I did) - take my word that it works, and I'll spend some time perfecting it.

The gssapi requirement issue is also tricky, because the gem requires the MIT Kerberos libs to be installed or it won't compile. I wonder if it could somehow be made optional during gem installation.
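One common Ruby pattern for this is to leave gssapi out of the gemspec's hard dependencies and require it only when Kerberos is actually requested. A minimal sketch, assuming a hypothetical `OptionalGssapi` helper (not part of impala-ruby or this patch):

```ruby
# Hypothetical helper: loads the gssapi gem lazily, so installs without
# the MIT Kerberos libraries still work for non-Kerberos connections.
module OptionalGssapi
  # `requirer` is injectable so the failure path can be exercised
  # without uninstalling the gem; by default it calls Kernel#require.
  def self.load!(requirer = ->(name) { require name })
    requirer.call('gssapi')
    true
  rescue LoadError => e
    raise LoadError,
          "Kerberos support needs the 'gssapi' gem, which needs the " \
          "MIT Kerberos libraries to compile (#{e.message})"
  end
end

# Intended usage: call this only on the Kerberos code path, e.g.
#   OptionalGssapi.load! if use_kerberos   # use_kerberos is hypothetical
```

Users who need Kerberos would then install gssapi themselves, and everyone else never has to compile it.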

As a side note, I've been using impyla as a reference (it supports Kerberos, beeswax and HS2) to compare the low-level TCP conversations that take place, and noticed that it uses a different method of running queries: e.g. it calls get_default_configuration() at the beginning for some reason, then uses "query" rather than "executeAndWait". Impyla also now defaults to HS2 (port 21050) rather than beeswax, even though beeswax IMHO is leaner and faster... Anyhow - I kinda wish the Ruby community had something functionally similar to impyla...

colinmarc commented 9 years ago

Great, that all sounds good. Thanks again for taking the time to work on this!

HS2 has some nice advantages, so I've been working on a branch that adds support. Just need some time to actually finish it =)