cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

SASL Error: no mechanism available: No worthy mechs found #149

Open policratus opened 8 years ago

policratus commented 8 years ago

When trying to connect to an Impala Server, the following error happens:

>>> from impala.dbapi import connect
>>> conn = connect(host='xxxxxxxxxx', port=21050, user='xxxxxx', password='xxxxxx', auth_mechanism='LDAP')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/site-packages/impala/dbapi.py", line 82, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/local/lib/python2.6/site-packages/impala/hiveserver2.py", line 586, in connect
    transport.open()
  File "/usr/local/lib/python2.6/site-packages/thrift_sasl/__init__.py", line 72, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found

Even with all LDAP dependencies installed.

laserson commented 8 years ago

What's the output of running rpm -qa | grep cyrus?

policratus commented 8 years ago

@laserson The output is (yes, we're running the cluster on AWS):

cyrus-sasl-gssapi-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-ldap-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-ntlm-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-plain-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-lib-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-md5-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-devel-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-sql-2.1.23-13.16.amzn1.x86_64
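
The package list shows the Cyrus SASL RPMs are installed, yet "No worthy mechs found" usually means the SASL library loaded by the Python process could not find a usable mechanism plugin at runtime. A minimal sketch for checking which plugin files are actually on disk is below. The directory list is an assumption (common locations on RHEL-like and Debian-like systems, not confirmed for this cluster), and `list_sasl_plugins` is a hypothetical helper, not part of impyla.

```python
import glob
import os

# Assumed candidate plugin directories; adjust for your distribution.
CANDIDATE_DIRS = [
    "/usr/lib64/sasl2",
    "/usr/lib/sasl2",
    "/usr/lib/x86_64-linux-gnu/sasl2",
]


def list_sasl_plugins(dirs=None):
    """Map each directory to the sorted plugin names found in it.

    A file such as libgssapiv2.so.2.0.23 is reported as "gssapiv2".
    Directories with no matching files are left out of the result.
    """
    found = {}
    for directory in (CANDIDATE_DIRS if dirs is None else dirs):
        names = set()
        for path in glob.glob(os.path.join(directory, "lib*.so*")):
            base = os.path.basename(path)
            # Strip the "lib" prefix and everything from ".so" onwards.
            names.add(base[3:].split(".so")[0])
        if names:
            found[directory] = sorted(names)
    return found


if __name__ == "__main__":
    for directory, plugins in list_sasl_plugins().items():
        print(directory, plugins)
```

If the output is empty, or the plugin for the mechanism you requested is absent, the Python process probably cannot see the plugins the RPMs installed.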
richban commented 8 years ago

I have the same issue. I'm running Windows 8 with the Python 2.7 Anaconda distribution, and our Hadoop cluster is secured with Kerberos, so we are using auth_mechanism='GSSAPI'. I installed python-sasl, so I wonder whether I am missing some other dependencies.

richban commented 8 years ago

@nforte-luizalabs that sucks. So what did you do after that? Have you tried something else? I am trying to figure out how to use Spark, and I would like to know whether it is possible to create a client from Spark to Impala.

policratus commented 8 years ago

@richban We've tried all thinkable dependencies, without success.

laserson commented 8 years ago

@policratus Can you successfully access Impala through the impala-shell?

laserson commented 8 years ago

@richban What are you trying to accomplish? If you want to access database tables from the Hive MetaStore from Spark, you should be able to with SparkSQL.

richban commented 8 years ago

@laserson Yes, I can successfully run impala-shell. And yes, I would like to access the database tables, run some SQL queries, and put Python on top of it. But I haven't figured out how to connect to the database using SparkSQL. Put simply, I want to connect to our Hadoop cluster, retrieve some data (do some SQL stuff), and then use this data with Python.

richban commented 8 years ago

@laserson The main goal was to connect to Impala with the impyla client so that we can run Impala SQL queries and pass the data to Python. This is the error we encounter:

conn = connect(host='host_name', port=21050, auth_mechanism='GSSAPI', database='db_name', 
               user='user_name', password='pswd')
---------------------------------------------------------------------------
TTransportException                       Traceback (most recent call last)
<ipython-input-3-1d678c3c4501> in <module>()
      1 conn = connect(host='', port=21050, auth_mechanism='GSSAPI', database='', 
----> 2                user='', password='')

C:\Anaconda\lib\site-packages\impala\dbapi.pyc in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol)
     83                           ca_cert=ca_cert, user=user, password=password,
     84                           kerberos_service_name=kerberos_service_name,
---> 85                           auth_mechanism=auth_mechanism)
     86     return hs2.HiveServer2Connection(service, default_db=database)
     87 

C:\Anaconda\lib\site-packages\impala\hiveserver2.pyc in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism)
    566     transport = get_transport(sock, host, kerberos_service_name,
    567                               auth_mechanism, user, password)
--> 568     transport.open()
    569     protocol = TBinaryProtocol(transport)
    570     if six.PY2:

C:\Anaconda\lib\site-packages\thrift_sasl\__init__.pyc in open(self)
     70     if not ret:
     71       raise TTransportException(type=TTransportException.NOT_OPEN,
---> 72         message=("Could not start SASL: %s" % self.sasl.getError()))
     73 
     74     # Send initial response

TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

Well, I guess there is an issue with some dependencies. I would like to know what the dependencies are for GSSAPI, sasl, and thrift_sasl.

wesm commented 8 years ago

@richban Can you try using this slightly forked version of python-sasl (which uses Cython instead of SWIG for wrapping): https://github.com/cloudera/python-sasl

zoltan-fedor commented 8 years ago

I have tried the forked version of python-sasl from https://github.com/cloudera/python-sasl and it did not make any difference.

wesm commented 8 years ago

Confirming that is version 0.2.1? This repository is now the official one and no longer a fork.

zoltan-fedor commented 8 years ago

Yes, it is version 0.2.1. Since the two are the same, I now understand why I didn't see much difference between installing from the official PyPI repo and from GitHub.

zoltan-fedor commented 8 years ago

When trying GSSAPI to connect to a secured CDH Hive instance, I get the following:

/usr/lib/python3.5/site-packages/impala/dbapi.py in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol)
    145                           ca_cert=ca_cert, user=user, password=password,
    146                           kerberos_service_name=kerberos_service_name,
--> 147                           auth_mechanism=auth_mechanism)
    148     return hs2.HiveServer2Connection(service, default_db=database)
    149 

/usr/lib/python3.5/site-packages/impala/hiveserver2.py in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism)
    636     transport = get_transport(sock, host, kerberos_service_name,
    637                               auth_mechanism, user, password)
--> 638     transport.open()
    639     protocol = TBinaryProtocol(transport)
    640     if six.PY2:

/usr/lib/python3.5/site-packages/thrift_sasl/__init__.py in open(self)
     70     if not ret:
     71       raise TTransportException(type=TTransportException.NOT_OPEN,
---> 72         message=("Could not start SASL: %s" % self.sasl.getError()))
     73 
     74     # Send initial response

TTransportException: TTransportException(message="Could not start SASL: b'Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server not found in Kerberos database)'", type=1)

From beeline it is working fine.
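
"Server not found in Kerberos database" means the KDC has no entry for the service principal the client requested. One possible cause when connecting to Hive, offered as an assumption rather than a diagnosis: the traceback shows that connect() takes a kerberos_service_name argument, and if it is left at a value such as 'impala' (impyla's default, as far as I know) while the server runs as 'hive', the requested principal will not exist. The sketch below only builds the principal string so it can be compared with what beeline uses. `service_principal` is a hypothetical helper, and the host and realm names are placeholders.

```python
def service_principal(service_name, host_fqdn, realm=None):
    """Build a Kerberos service principal of the form service/host[@REALM]."""
    principal = "{}/{}".format(service_name, host_fqdn)
    if realm:
        principal = "{}@{}".format(principal, realm)
    return principal


if __name__ == "__main__":
    # Placeholder host and realm, not taken from this thread.
    print(service_principal("hive", "hs2.example.com", "EXAMPLE.COM"))
    print(service_principal("impala", "hs2.example.com", "EXAMPLE.COM"))
```

If the first form matches the principal in the working beeline JDBC URL, passing kerberos_service_name='hive' to connect() may be worth trying.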

Mowd commented 8 years ago

Same problem here, with sasl 0.2.1 installed.

connection = impala.dbapi.connect(host='localhost', user="xxx", password="xxx", auth_mechanism="LDAP")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/impyla-0.11.2-py2.6.egg/impala/dbapi.py", line 82, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/lib/python2.6/site-packages/impyla-0.11.2-py2.6.egg/impala/hiveserver2.py", line 586, in connect
    transport.open()
  File "build/bdist.linux-x86_64/egg/thrift_sasl/__init__.py", line 72, in open
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found
$ rpm -qa|grep sasl
cyrus-sasl-gssapi-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-plain-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-md5-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-devel-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-2.1.23-13.el6_3.1.x86_64
$ impala-shell -l
Starting Impala Shell using LDAP-based authentication
LDAP password for xxx:
Connected to localhost:21000
Server version: impalad version 2.2.0-cdh5.4.1 RELEASE (build 1c58386b1f2d2118f14af45f893c693a6cb10e4d)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.2.0-cdh5.4.1 (1c58386) built on Thu May  7 22:46:35 PDT 2015)
[localhost:21000] >
wesm commented 8 years ago

Can you try using LDAP instead of GSSAPI? I am working on getting access to a secure cluster to work on this myself.

zoltan-fedor commented 8 years ago

I did try LDAP too and it was the exact same.


Mowd commented 8 years ago

There's a workaround for the LDAP connection problem. You need to yum install saslwrapper and saslwrapper-devel, then connect to Impala with the PLAIN mechanism.

impala.dbapi.connect(host='localhost', user="xxx", password="xxx", auth_mechanism="PLAIN")

If you don't install saslwrapper, you will get a timeout error.

wesm commented 8 years ago

I do not have the time to investigate this further, but if any solutions are found I will be happy to review pull requests.

pyite1 commented 8 years ago

Similar to @Mowd, I have confirmed that I am able to connect with impala-shell and have attempted to use 'LDAP' as the auth mechanism, but the error remains.

robnester commented 8 years ago

Just a note, I believe that the pertinent issue here is the use of Anaconda. The install of the cyrus packages, which would normally provide the necessities for GSSAPI, is providing bindings for the system python, not the Anaconda python. I say this as someone who is facing a similar challenge with another package requiring sasl running under Anaconda.
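
A quick way to check the point above is to print which interpreter is running and where it would load the relevant modules from. This is a minimal sketch; `module_origin` and `report` are hypothetical helper names, and the module list is an assumption about what this setup needs.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def report(names=("sasl", "thrift_sasl", "impala")):
    """Print the interpreter in use and the location of each module."""
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    for name in names:
        print("{:<12} {}".format(name, module_origin(name) or "not importable"))


if __name__ == "__main__":
    report()
```

If the paths point at the system site-packages while you expected the Anaconda environment (or the other way round), the SASL bindings were installed for a different Python than the one running impyla.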

wesm commented 8 years ago

@robnester Is the problem related to OpenSSL or something else related to Anaconda? We should escalate this to the Anaconda team at Continuum if there is something particular "wrong".

pyite1 commented 8 years ago

I have this working now on a CentOS install of Anaconda. I think that the critical aspect in getting this functioning was installing several sasl related packages on CentOS itself.

cyrus-sasl.x86_64        
cyrus-sasl-devel.x86_64  
cyrus-sasl-gs2.x86_64    
cyrus-sasl-gssapi.x86_64 
cyrus-sasl-ldap.x86_64   
cyrus-sasl-lib.x86_64    
cyrus-sasl-md5.x86_64    
cyrus-sasl-ntlm.x86_64   
cyrus-sasl-plain.x86_64  
cyrus-sasl-scram.x86_64  
cyrus-sasl-sql.x86_64    
python-saslwrapper.x86_64
ruby-saslwrapper.x86_64  
saslwrapper.x86_64       
saslwrapper-devel.x86_64 
$conda list 
impyla                    0.13.8                    <pip>
thrift                    0.9.3                     <pip>
thrift-sasl               0.2.0                     <pip>
thriftpy                  0.3.8                     <pip>
sasl                      0.2.1                     <pip>

I was unable to compile SASL (or any of the CentOS packages) on Windows, so I used the CentOS VM to connect to the Kerberos cluster.

After successfully running a kinit, I am able to connect with the following string:

con = impala.connect(host='xxxxxxxxxxxxx', port=21050, auth_mechanism='GSSAPI',
               kerberos_service_name='impala')

LDAP connection works as well:

con = impala.connect(host='xxxxxx', 
                             port=21050,
                             database='xxxxxx',
                             timeout=20,
                             use_ssl=True,
                             ca_cert='/home/user/cert.pem',
                             user='xxxxx', password=pwd,
                             auth_mechanism='PLAIN')
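
The GSSAPI connection above only works after a successful kinit. A minimal sketch of checking for a valid ticket from Python before calling connect() is below. It assumes the MIT Kerberos `klist` binary is on PATH, where `klist -s` exits with status 0 when the credentials cache is readable and unexpired; `has_valid_kerberos_ticket` is a hypothetical helper.

```python
import shutil
import subprocess


def has_valid_kerberos_ticket(klist="klist"):
    """Return True if `klist -s` reports a readable, unexpired credentials cache.

    Returns False when the klist binary cannot be found.
    """
    if shutil.which(klist) is None:
        return False
    result = subprocess.run(
        [klist, "-s"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if not has_valid_kerberos_ticket():
        print("No valid Kerberos ticket found; run kinit first.")
```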
wesm commented 8 years ago

@pyite1 yes, this is helpful; we should probably preserve these instructions in documentation format

wesm commented 8 years ago

Sounds like we should possibly try to get conda-forge builds of cyrus-sasl (I'm not sure what folks' general thoughts are on handling security libraries). I'm not sure when I'll have time to work on it, though.

mariusvniekerk commented 8 years ago

I have some simple ones I use. I don't think my current recipe links to the OpenSSL packaged with conda; for conda-forge that's probably preferred.

Also, Windows is probably out of the picture.

rouseguy commented 8 years ago

I am trying to use impyla to connect to Hive.

Windows Server 2008 R2. Anaconda 3.5.

from impala.dbapi import connect
#Connection string    
conn = connect(host='host', 
                    port=10000, 
                    auth_mechanism='PLAIN', 
                    user='username', 
                    password='password',
                    database='dbname') 

Running the above code returns the following error:

TTransportException: TTransportException(type=1, message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'")

Is there a fix for Windows ?

bulmanp commented 7 years ago

@rouseguy Did you get this working? I'm facing the same issue on Windows.

rouseguy commented 7 years ago

@bulmanp Nope. Did you try installing Visual Studio 2015 ? What is your Windows version ? There's a SASL wheel for recent versions of Windows here: http://www.lfd.uci.edu/~gohlke/pythonlibs/

bulmanp commented 7 years ago

@rouseguy Yes, I had VS2015 running on Windows 7. I couldn't get the sasl libs from the link you sent working. However, by following the steps here http://java2developer.blogspot.co.uk/2016/08/making-impala-connection-from-python-on.html I was able to build the sasl lib and get the connection from Windows working. Thanks.

gvamsi01 commented 7 years ago

I'm encountering the same error even though I have all the dependencies (sasl and thrift_sasl) installed. I'm trying to fetch data from my Hive metastore (HiveServer2) using SQL queries through Python while developing a real-time application.

I think the problem lies with Anaconda.

I'm on Windows 7 with Anaconda Python 2.7 and sasl version 0.2.1. Did anyone try running the same on a Python downloaded from python.org, with some IDE?

Kindly let us know if anyone has solved this issue. Thanks in advance !!

vasanth11 commented 7 years ago

Facing a similar issue in RHEL.

  File "build/bdist.linux-x86_64/egg/thrift_sasl/__init__.py", line 72, in open
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found

exit();

rpm -qa --last | grep cyrus-sasl
cyrus-sasl-gssapi-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-devel-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-plain-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64

It worked for me a couple of weeks back. After I uninstalled Python 2.7 and re-installed Python and impyla, I'm facing this issue. Do I need to re-install sasl after every installation?

Purna17 commented 7 years ago

Hi,

I'm facing the same connectivity issue to Impala using the impyla module and LDAP authentication.

The impyla LDAP connection string:

conn = connect(host='hdfsnode', port=21050, use_ldap=True, database='default', user='myuser', password='mypassword', auth_mechanism='PLAIN')

Dependencies already available:

[root@hdfsnode ~]$rpm -qa | grep cyrus*
cyrus-sasl-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-ntlm-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-plain-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-devel-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-md5-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-sql-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-gssapi-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-ldap-2.1.23-15.el6_6.2.x86_64

Impyla dependencies deployed:

Error:

[root@hdfsnode~]$python impconn.py
/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/util.py:159: Warning: use_ldap functionality in impyla is now deprecated and will be removed in a future release; Please use auth_mechanism="LDAP" instead.
  warnings.warn(msg, Warning)
Traceback (most recent call last):
  File "impconn.py", line 6, in <module>
    conn = connect(host='hdfsnode', port=21050, use_ldap=True, database='default', user='myuser', password='mypassword', auth_mechanism='PLAIN')
  File "/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/hiveserver2.py", line 758, in connect
    transport.open()
  File "/usr/lib/python2.6/site-packages/thrift_sasl-0.2.1-py2.6.egg/thrift_sasl/__init__.py", line 72, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found

Using Kerberos connectivity to Impala:

conn = connect(host='hdfsnode', port=21050, use_kerberos=True, kerberos_service_name='impala')

Error:

/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/util.py:159: Warning: use_kerberos functionality in impyla is now deprecated and will be removed in a future release; Please use auth_mechanism="GSSAPI" instead.
  warnings.warn(msg, Warning)
Traceback (most recent call last):
  File "impconn.py", line 9, in <module>
    conn = connect(host='hdfsnode', port=21050, use_kerberos=True, kerberos_service_name='impala')
  File "/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/hiveserver2.py", line 758, in connect
    transport.open()
  File "/usr/lib/python2.6/site-packages/thrift_sasl-0.2.1-py2.6.egg/thrift_sasl/__init__.py", line 80, in open
    status, payload = self._recv_sasl_message()
  File "/usr/lib/python2.6/site-packages/thrift_sasl-0.2.1-py2.6.egg/thrift_sasl/__init__.py", line 98, in _recv_sasl_message
    header = read_all_compat(self._trans, 5)
  File "/usr/lib/python2.6/site-packages/thrift_sasl-0.2.1-py2.6.egg/thrift_sasl/six.py", line 31, in <lambda>
    read_all_compat = lambda trans, sz: trans.readAll(sz)
  File "/usr/lib64/python2.6/site-packages/thrift-0.9.1-py2.6-linux-x86_64.egg/thrift/transport/TTransport.py", line 58, in readAll
    chunk = self.read(sz - have)
  File "/usr/lib64/python2.6/site-packages/thrift-0.9.1-py2.6-linux-x86_64.egg/thrift/transport/TSocket.py", line 118, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Could you please help me figure out what I missed here?

Thanks, Purna.
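
Both warnings in the comment above say that use_ldap and use_kerberos are deprecated in favour of auth_mechanism="LDAP" and auth_mechanism="GSSAPI". The first call also passes use_ldap=True together with auth_mechanism='PLAIN', which mixes the two styles. A minimal sketch of collapsing the old flags into a single value before calling connect() is below. `resolve_auth_mechanism` is a hypothetical helper and does not reproduce impyla's internal logic.

```python
def resolve_auth_mechanism(auth_mechanism=None, use_ldap=False, use_kerberos=False):
    """Turn the deprecated boolean flags into one auth_mechanism string.

    Raises ValueError when the inputs contradict each other, so that a
    mixed call is noticed instead of silently picking one of them.
    Returns None when nothing was requested.
    """
    if use_ldap and use_kerberos:
        raise ValueError("use_ldap and use_kerberos are mutually exclusive")
    legacy = "LDAP" if use_ldap else "GSSAPI" if use_kerberos else None
    if auth_mechanism and legacy and auth_mechanism != legacy:
        raise ValueError(
            "auth_mechanism={!r} conflicts with the deprecated flag implying {!r}".format(
                auth_mechanism, legacy
            )
        )
    return auth_mechanism or legacy


if __name__ == "__main__":
    print(resolve_auth_mechanism(use_kerberos=True))  # GSSAPI
```

The resolved value can then be passed as auth_mechanism alone, without the deprecated flags.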

Purna17 commented 7 years ago

Hi Team,

Can someone help here?

Thanks in advance.

Regards, Purna.

mariusvniekerk commented 7 years ago

So the easiest (although less performant) solution is to uninstall the sasl Python library and install puresasl and kerberos.

puresasl doesn't require any changed dependencies for PLAIN auth, and impyla will fall back to using it.
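
To see which of the two libraries the fallback described above would find in your environment, a minimal sketch follows. It only checks that a module can be located, in the order sasl then puresasl; it does not prove that the sasl C extension loads correctly. `first_importable` is a hypothetical helper.

```python
import importlib.util


def first_importable(names=("sasl", "puresasl")):
    """Return the first module name that can be located, or None if none can."""
    for name in names:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except (ImportError, ValueError):
            continue
    return None


if __name__ == "__main__":
    backend = first_importable()
    print("SASL backend found:", backend or "none")
```

If this still prints "sasl" after you meant to switch, the old library has not been uninstalled from the interpreter you are running.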

pyite1 commented 7 years ago

@mariusvniekerk

If using Anaconda, could you post a conda list?

mariusvniekerk commented 7 years ago

Pretty sure this should suffice

conda install -c conda-forge impyla puresasl thrift_sasl kerberos

You probably need something else for LDAP.

Purna17 commented 7 years ago

Hi mariusvniekerk,

Thanks for the help.

I don't have the Anaconda libraries deployed on the server.

As you mentioned in an earlier comment, I have already deployed the pure sasl libraries; below are the libraries deployed on the server, but no luck.

May I know exactly which package needs to be removed before deploying puresasl?

[Mynode ~]$ls -ltr /usr/lib64/python2.6/site-packages/sasl
-rw-r--r-- 1 root root 5246 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.py
-rwxr-xr-x 1 root root 55448 Apr 3 2012 /usr/lib64/python2.6/site-packages/_saslwrapper.so
-rw-r--r-- 2 root root 9513 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.pyo
-rw-r--r-- 2 root root 9513 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.pyc

/usr/lib64/python2.6/site-packages/sasl:
total 552
-rw-r--r-- 1 root root 3219 Feb 26 2016 saslwrapper.pyx
-rw-r--r-- 1 root root 14903 Feb 26 2016 saslwrapper.h
-rw-r--r-- 1 root root 169956 Feb 26 2016 saslwrapper.cpp
-rw-r--r-- 1 root root 609 Feb 26 2016 __init__.py
-rwxr-xr-x 1 root root 360859 Feb 26 2016 saslwrapper.so
-rw-r--r-- 1 root root 185 Feb 26 2016 __init__.pyc

/usr/lib64/python2.6/site-packages/sasl-0.2.1-py2.6.egg-info:
total 24
-rw-r--r-- 1 root root 3 Feb 26 2016 requires.txt
-rw-r--r-- 1 root root 248 Feb 26 2016 PKG-INFO
-rw-r--r-- 1 root root 5 Feb 26 2016 top_level.txt
-rw-r--r-- 1 root root 1 Feb 26 2016 dependency_links.txt
-rw-r--r-- 1 root root 259 Feb 26 2016 SOURCES.txt
-rw-r--r-- 1 root root 206 Feb 26 2016 installed-files.txt

/usr/lib64/python2.6/site-packages/sasl-0.2.1-py2.6-linux-x86_64.egg:
total 8
drwxr-x--- 2 root root 4096 Dec 23 00:06 sasl
drwxr-x--- 2 root root 4096 Dec 23 00:06 EGG-INFO
[mynode~]$ls -ltr /usr/lib/python2.6/site-packages/sasl
total 8
drwxr-x--- 2 root root 4096 Dec 15 16:03 EGG-INFO
drwxr-x--- 2 root root 4096 Dec 15 16:03 thrift_sasl

Thanks, Purna.

mariusvniekerk commented 7 years ago

I would strongly advise against modifying the system python in any way. Either use virtualenv or miniconda to make a self-contained python 3.5 installation.
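
A minimal sketch for confirming that a script is running inside an isolated environment rather than the system Python is below. The helper names are hypothetical. The conda check relies on the `conda-meta` directory that conda environments normally contain, which is an assumption about how the environment was created.

```python
import os
import sys


def in_virtualenv():
    """True when running inside a venv or virtualenv environment."""
    # Older virtualenv releases set sys.real_prefix; venv changes sys.prefix
    # so that it differs from sys.base_prefix.
    if getattr(sys, "real_prefix", None) is not None:
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def in_conda_env():
    """True when the interpreter prefix looks like a conda environment."""
    return os.path.isdir(os.path.join(sys.prefix, "conda-meta"))


if __name__ == "__main__":
    print("virtualenv:", in_virtualenv())
    print("conda env: ", in_conda_env())
    print("prefix:    ", sys.prefix)
```

If both print False, packages installed with pip are probably going into the system Python, which is what the advice above recommends avoiding.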

charlietsai commented 6 years ago

for what it's worth, this solved it for me

apt-get install libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit

running python in Docker

alexciobanu commented 5 years ago

I ran the following command and it fixed my issue. There might be some issues on redhat with missing dependencies

yum install cyrus-sasl-md5 cyrus-sasl-plain cyrus-sasl-gssapi cyrus-sasl-devel -y

deddu commented 5 years ago

If it can help anyone, here's a dockerfile for python 3.7

FROM python:3.7

RUN apt-get update && apt-get install -y --no-install-recommends \
        libsasl2-dev \
        libsasl2-2 \
        libsasl2-modules-gssapi-mit \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD [ "python", "./cluster.py" ]
raju-kale commented 4 years ago

Setting the following environment variable worked for me: SASL_PATH=/usr/lib/x86_64-linux-gnu/sasl2 (this is on Ubuntu).
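
The same variable can be set from Python, provided it happens before the first SASL connection is opened. This is a minimal sketch under that assumption: the first candidate directory is the Ubuntu path from the comment above, the other two are guesses for RHEL-like systems, and `ensure_sasl_path` is a hypothetical helper.

```python
import os

# First entry comes from this thread; the others are assumed common locations.
DEFAULT_CANDIDATES = (
    "/usr/lib/x86_64-linux-gnu/sasl2",
    "/usr/lib64/sasl2",
    "/usr/lib/sasl2",
)


def ensure_sasl_path(candidates=DEFAULT_CANDIDATES):
    """Set SASL_PATH to the first existing candidate directory.

    An existing SASL_PATH is left untouched. Returns the value in effect,
    or None if the variable is unset and no candidate directory exists.
    """
    current = os.environ.get("SASL_PATH")
    if current:
        return current
    for directory in candidates:
        if os.path.isdir(directory):
            os.environ["SASL_PATH"] = directory
            return directory
    return None


if __name__ == "__main__":
    # Call this before importing impala.dbapi and opening a connection.
    print("SASL_PATH =", ensure_sasl_path())
```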

tbennett6421 commented 4 years ago

@alexciobanu is the best. This worked with RHEL7

sailist commented 3 years ago

Try these versions:

pip3 install PyHive sasl==0.2.1 thrift==0.10.0 thrift-sasl==0.3.0
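
To compare what is installed against the pins in the command above, a minimal sketch using the standard library (Python 3.8 or later) follows. `compare_versions` is a hypothetical helper, and the pin table simply copies the three version numbers from that command.

```python
from importlib import metadata

# Versions copied from the pip3 command in this comment.
PINNED = {"sasl": "0.2.1", "thrift": "0.10.0", "thrift-sasl": "0.3.0"}


def compare_versions(pinned=PINNED):
    """Return {distribution: (installed_version_or_None, wanted, matches)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (installed, wanted, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print("{:<12} installed={} wanted={} {}".format(name, installed, wanted, status))
```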
xxf09th commented 3 years ago

for what it's worth, this solved it for me

apt-get install libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit

running python in Docker

It works.