Open policratus opened 8 years ago
What's the output of running rpm -qa | grep cyrus?
@laserson The output is (yes, we're running the cluster on AWS):
cyrus-sasl-gssapi-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-ldap-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-ntlm-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-plain-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-lib-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-md5-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-devel-2.1.23-13.16.amzn1.x86_64
cyrus-sasl-sql-2.1.23-13.16.amzn1.x86_64
I have the same issue. I'm running Windows 8 with the Python 2.7 Anaconda distribution, and our Hadoop cluster is secured with Kerberos, so we are using auth_mechanism='GSSAPI'.
I installed python-sasl, so I wonder if I am missing some other dependencies.
@nforte-luizalabs that sucks. What did you do after that? Have you tried something else? I am trying to figure out how to use Spark, and I would like to know whether it is possible to create a client from Spark to Impala.
@richban We've tried every dependency we could think of, without success.
@policratus Can you successfully access Impala through the impala-shell?
@richban What are you trying to accomplish? If you want to access database tables from the Hive MetaStore from Spark, you should be able to with SparkSQL.
@laserson yes, I can successfully run impala-shell. Well, yes, I would like to access the database tables, run some SQL queries, and put Python on top of it. But I haven't figured out how to connect to the database using SparkSQL. Put simply, I want to connect to our Hadoop cluster, retrieve some data (do some SQL stuff), and then use this data with Python.
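For the goal described above (run SQL, then work with the rows in Python), here is a minimal sketch. It assumes only that the connection object follows the Python DB-API 2.0 interface (cursor(), execute(), fetchall()); fetch_rows is a hypothetical helper name, and the commented usage uses placeholder host and table names and needs a reachable cluster.

```python
from contextlib import closing


def fetch_rows(conn, sql):
    """Run one SQL statement on a DB-API 2.0 connection and return all rows.

    `conn` can be any DB-API 2.0 connection; for Impala it would be the
    object returned by impala.dbapi.connect(...).
    """
    # closing() guarantees cursor.close() is called even if execute() raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()


# Hypothetical usage against a cluster (placeholder host and table names):
#   from impala.dbapi import connect
#   conn = connect(host='host_name', port=21050, auth_mechanism='GSSAPI')
#   rows = fetch_rows(conn, 'SELECT * FROM some_table LIMIT 10')
```

The rows come back as a list of tuples, so they can be handed to whatever Python code you want to put on top.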
@laserson the main goal was to connect to Impala with the impyla client so that we can run Impala SQL queries and pass that data to Python. This is the error we encounter:
conn = connect(host='host_name', port=21050, auth_mechanism='GSSAPI', database='db_name',
user='user_name', password='pswd')
---------------------------------------------------------------------------
TTransportException Traceback (most recent call last)
<ipython-input-3-1d678c3c4501> in <module>()
1 conn = connect(host='', port=21050, auth_mechanism='GSSAPI', database='',
----> 2 user='', password='')
C:\Anaconda\lib\site-packages\impala\dbapi.pyc in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol)
83 ca_cert=ca_cert, user=user, password=password,
84 kerberos_service_name=kerberos_service_name,
---> 85 auth_mechanism=auth_mechanism)
86 return hs2.HiveServer2Connection(service, default_db=database)
87
C:\Anaconda\lib\site-packages\impala\hiveserver2.pyc in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism)
566 transport = get_transport(sock, host, kerberos_service_name,
567 auth_mechanism, user, password)
--> 568 transport.open()
569 protocol = TBinaryProtocol(transport)
570 if six.PY2:
C:\Anaconda\lib\site-packages\thrift_sasl\__init__.pyc in open(self)
70 if not ret:
71 raise TTransportException(type=TTransportException.NOT_OPEN,
---> 72 message=("Could not start SASL: %s" % self.sasl.getError()))
73
74 # Send initial response
TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2
Well, I guess there is an issue with some dependencies. I would like to know what the dependencies are for GSSAPI, sasl, and thrift_sasl.
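As a rough aid for reading the "Could not start SASL" messages quoted in this thread, the sketch below maps two of the error texts to a troubleshooting hint. sasl_error_hint is a hypothetical helper, and the hints are an interpretation of the messages, not an official diagnosis from impyla or Cyrus SASL.

```python
def sasl_error_hint(message):
    """Map a 'Could not start SASL' message to a troubleshooting hint."""
    text = message.lower()
    if "no mechanism available" in text:
        # Seen in this thread as "Unable to find a callback" and
        # "No worthy mechs found".
        return ("The SASL library found no usable plugin for the requested "
                "mechanism; check that the Cyrus SASL plugin packages are "
                "installed and visible to this Python installation.")
    if "server not found in kerberos database" in text:
        return ("The KDC has no principal for the requested service and "
                "host; check kerberos_service_name and the host name used.")
    return "Unrecognized SASL error; inspect the full message."
```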
@richban Can you try using this slightly forked version of python-sasl (which uses Cython instead of SWIG for wrapping): https://github.com/cloudera/python-sasl
I have tried the forked version of python-sasl from https://github.com/cloudera/python-sasl and it did not make any difference.
Can you confirm that it is version 0.2.1? This repository is now the official one and no longer a fork.
Yes, it is version 0.2.1. Since the two are the same, I now understand why I didn't see any difference between installing from the official PyPI repo and from GitHub.
When I try GSSAPI to connect to a secured CDH Hive instance, I get the following:
/usr/lib/python3.5/site-packages/impala/dbapi.py in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol)
145 ca_cert=ca_cert, user=user, password=password,
146 kerberos_service_name=kerberos_service_name,
--> 147 auth_mechanism=auth_mechanism)
148 return hs2.HiveServer2Connection(service, default_db=database)
149
/usr/lib/python3.5/site-packages/impala/hiveserver2.py in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism)
636 transport = get_transport(sock, host, kerberos_service_name,
637 auth_mechanism, user, password)
--> 638 transport.open()
639 protocol = TBinaryProtocol(transport)
640 if six.PY2:
/usr/lib/python3.5/site-packages/thrift_sasl/__init__.py in open(self)
70 if not ret:
71 raise TTransportException(type=TTransportException.NOT_OPEN,
---> 72 message=("Could not start SASL: %s" % self.sasl.getError()))
73
74 # Send initial response
TTransportException: TTransportException(message="Could not start SASL: b'Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server not found in Kerberos database)'", type=1)
From beeline it is working fine.
Same problem here, with sasl 0.2.1 installed.
connection = impala.dbapi.connect(host='localhost', user="xxx", password="xxx", auth_mechanism="LDAP")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/site-packages/impyla-0.11.2-py2.6.egg/impala/dbapi.py", line 82, in connect
auth_mechanism=auth_mechanism)
File "/usr/lib/python2.6/site-packages/impyla-0.11.2-py2.6.egg/impala/hiveserver2.py", line 586, in connect
transport.open()
File "build/bdist.linux-x86_64/egg/thrift_sasl/__init__.py", line 72, in open
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found
$ rpm -qa|grep sasl
cyrus-sasl-gssapi-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-plain-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-md5-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-devel-2.1.23-13.el6_3.1.x86_64
cyrus-sasl-2.1.23-13.el6_3.1.x86_64
$ impala-shell -l
Starting Impala Shell using LDAP-based authentication
LDAP password for xxx:
Connected to localhost:21000
Server version: impalad version 2.2.0-cdh5.4.1 RELEASE (build 1c58386b1f2d2118f14af45f893c693a6cb10e4d)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v2.2.0-cdh5.4.1 (1c58386) built on Thu May 7 22:46:35 PDT 2015)
[localhost:21000] >
Can you try using LDAP instead of GSSAPI? I am working on getting access to a secure cluster to work on this myself.
I did try LDAP too and it was the exact same.
There's a workaround for the LDAP connection problem: run yum install saslwrapper saslwrapper-devel, then connect to Impala with the PLAIN mechanism.
impala.dbapi.connect(host='localhost', user="xxx", password="xxx", auth_mechanism="PLAIN")
If you don't install saslwrapper, you will get a timeout error.
I do not have the time to investigate this further, but if any solutions are found I will be happy to review pull requests.
Similar to @Mowd, I have confirmed that I am able to connect with impala-shell, and I have attempted to use 'LDAP' as the auth mechanism, but the error remains.
Just a note, I believe that the pertinent issue here is the use of Anaconda. The install of the cyrus packages, which would normally provide the necessities for GSSAPI, is providing bindings for the system python, not the Anaconda python. I say this as someone who is facing a similar challenge with another package requiring sasl running under Anaconda.
@robnester is the problem related to OpenSSL or something else related to Anaconda? We should escalate this to the Anaconda team at Continuum if there is something particular "wrong".
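One way to check the "system Python vs. Anaconda Python" theory is to ask the interpreter you actually run which SASL-related modules it can see, and from where. This is a minimal sketch using only the standard library; describe_sasl_modules is a hypothetical helper, and the module names are the ones mentioned in this thread.

```python
import importlib.util
import sys


def describe_sasl_modules(names=("sasl", "thrift_sasl", "puresasl", "kerberos")):
    """Return {module name: file path, or None if not importable here}."""
    found = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        if spec is None:
            found[name] = None
        else:
            # Namespace packages have no origin; report that explicitly.
            found[name] = spec.origin or "namespace package"
    return found


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name, origin in describe_sasl_modules().items():
        print(f"{name:12s} {origin or 'NOT FOUND'}")
```

If the paths point into a different Python installation than the one printed as the interpreter, or a module is reported as not found, the bindings were installed for another Python.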
I have this working now on a CentOS install of Anaconda. I think that the critical aspect in getting this functioning was installing several SASL-related packages on CentOS itself:
cyrus-sasl.x86_64
cyrus-sasl-devel.x86_64
cyrus-sasl-gs2.x86_64
cyrus-sasl-gssapi.x86_64
cyrus-sasl-ldap.x86_64
cyrus-sasl-lib.x86_64
cyrus-sasl-md5.x86_64
cyrus-sasl-ntlm.x86_64
cyrus-sasl-plain.x86_64
cyrus-sasl-scram.x86_64
cyrus-sasl-sql.x86_64
python-saslwrapper.x86_64
ruby-saslwrapper.x86_64
saslwrapper.x86_64
saslwrapper-devel.x86_64
$conda list
impyla 0.13.8 <pip>
thrift 0.9.3 <pip>
thrift-sasl 0.2.0 <pip>
thriftpy 0.3.8 <pip>
sasl 0.2.1 <pip>
I was unable to compile SASL (or any of the CentOS packages) on Windows, so I used the CentOS VM to connect to the Kerberos cluster.
After successfully running kinit, I am able to connect with the following string:
con = impala.connect(host='xxxxxxxxxxxxx', port=21050, auth_mechanism='GSSAPI',
kerberos_service_name='impala')
LDAP connection works as well:
con = impala.connect(host='xxxxxx',
port=21050,
database='xxxxxx',
timeout=20,
use_ssl=True,
ca_cert='/home/user/cert.pem',
user='xxxxx', password=pwd,
auth_mechanism='PLAIN')
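Since the GSSAPI connection above only worked after a successful kinit, it can help to check for a ticket before calling connect(). The sketch below assumes the MIT Kerberos klist tool is on PATH (typical on Linux; the Windows klist behaves differently); has_kerberos_ticket is a hypothetical helper name.

```python
import shutil
import subprocess


def has_kerberos_ticket():
    """Return True if `klist -s` reports a readable, unexpired ticket cache."""
    klist = shutil.which("klist")
    if klist is None:
        return False  # No Kerberos client tools found on PATH.
    # `klist -s` prints nothing and signals the result via its exit status.
    return subprocess.run([klist, "-s"]).returncode == 0


# Hypothetical usage before a GSSAPI connection attempt:
#   if not has_kerberos_ticket():
#       raise SystemExit("No valid Kerberos ticket; run kinit first.")
```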
@pyite1 yes, this is helpful; we should probably preserve these instructions in documentation format
Sounds like we should possibly try to get conda-forge builds of cyrus-sasl (I'm not sure what folks' general thoughts are on handling security libraries). I'm not sure when I'll have time to work on it, though.
I have some simple ones I use. I don't think my current recipe links to the OpenSSL packaged with conda; for conda-forge that's probably preferred.
Also, Windows is probably out of the picture.
I am trying to use impyla to connect to Hive.
Windows Server 2008 R2. Anaconda 3.5.
from impala.dbapi import connect
#Connection string
conn = connect(host='host',
port=10000,
auth_mechanism='PLAIN',
user='username',
password='password',
database='dbname')
Running the above code returns the following error:
TTransportException: TTransportException(type=1, message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'")
Is there a fix for Windows?
@rouseguy did you get this working? I'm facing the same issue on Windows.
@bulmanp Nope. Did you try installing Visual Studio 2015? What is your Windows version? There's a SASL wheel for recent versions of Windows here: http://www.lfd.uci.edu/~gohlke/pythonlibs/
@rouseguy yes, I had VS2015 running on Windows 7. I couldn't get the sasl libs from the link you sent working. However, by following the steps here http://java2developer.blogspot.co.uk/2016/08/making-impala-connection-from-python-on.html I was able to build the sasl lib and get the connection from Windows working. Thanks.
I'm encountering the same error even though I have all the dependencies (sasl and thrift_sasl) installed. I'm trying to fetch data from my Hive metastore (HiveServer2) using SQL queries through Python while developing a real-time application.
I think the problem is on the Anaconda side.
I'm on Windows 7 with Anaconda Python 2.7 and sasl version 0.2.1. Did anyone try running the same on a Python downloaded from python.org, with some IDE?
Kindly let us know if anyone has solved this issue. Thanks in advance!
Facing a similar issue in RHEL.
File "build/bdist.linux-x86_64/egg/thrift_sasl/__init__.py", line 72, in open
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found
exit();
rpm -qa --last | grep cyrus-sasl
cyrus-sasl-gssapi-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-devel-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-plain-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64
It worked for me a couple of weeks back. After I uninstalled Python 2.7 and reinstalled Python and impyla, I'm facing this issue. Do I need to reinstall sasl after every installation?
Hi,
I'm facing the same connectivity issue to Impala using the impyla module and LDAP authentication.
impyla LDAP connection string:
conn = connect(host='hdfsnode', port=21050, use_ldap=True, database='default', user='myuser', password='mypassword', auth_mechanism='PLAIN')
Dependencies already available:
[root@hdfsnode ~]$rpm -qa | grep cyrus*
cyrus-sasl-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-ntlm-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-plain-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-devel-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-md5-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-sql-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-gssapi-2.1.23-15.el6_6.2.x86_64
cyrus-sasl-ldap-2.1.23-15.el6_6.2.x86_64
Impyla dependencies deployed:
[root@hdfsnode~]$python impconn.py
/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/util.py:159: Warning: use_ldap functionality in impyla is now deprecated and will be removed in a future release; Please use auth_mechanism="LDAP" instead.
warnings.warn(msg, Warning)
Traceback (most recent call last):
File "impconn.py", line 6, in
Using Kerberos connectivity to Impala:
conn = connect(host='hdfsnode', port=21050, use_kerberos=True, kerberos_service_name='impala')
Error:
/usr/lib/python2.6/site-packages/impyla-v0.14.0-py2.6.egg/impala/util.py:159: Warning: use_kerberos functionality in impyla is now deprecated and will be removed in a future release; Please use auth_mechanism="GSSAPI" instead.
warnings.warn(msg, Warning)
Traceback (most recent call last):
File "impconn.py", line 9, in
Could you please help me figure out what I missed here?
Thanks, Purna.
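The two deprecation warnings above say to replace use_ldap with auth_mechanism="LDAP" and use_kerberos with auth_mechanism="GSSAPI". A minimal sketch of that rewrite on a dictionary of keyword arguments follows. modernize_connect_kwargs is a hypothetical helper, and keeping an explicitly passed auth_mechanism (as in the PLAIN example above) is a choice made for this sketch, not documented impyla behavior.

```python
def modernize_connect_kwargs(kwargs):
    """Return a copy of connect() keyword arguments without deprecated flags."""
    updated = dict(kwargs)
    use_ldap = updated.pop("use_ldap", False)
    use_kerberos = updated.pop("use_kerberos", False)
    # Assumption of this sketch: an explicit auth_mechanism is left untouched.
    if "auth_mechanism" not in updated:
        if use_kerberos:
            updated["auth_mechanism"] = "GSSAPI"
        elif use_ldap:
            updated["auth_mechanism"] = "LDAP"
    return updated


# Hypothetical usage:
#   conn = connect(**modernize_connect_kwargs(
#       {'host': 'hdfsnode', 'port': 21050, 'use_kerberos': True,
#        'kerberos_service_name': 'impala'}))
```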
Hi Team,
Can someone help here?
Thanks in advance.
Regards, Purna.
So the easiest (although less performant) solution is to uninstall the sasl Python library and install puresasl and kerberos.
puresasl doesn't require any dependency changes for PLAIN auth, and impyla will fall back to using it.
@mariusvniekerk
if using anaconda, could you post a conda list?
Pretty sure this should suffice:
conda install -c conda-forge impyla puresasl thrift_sasl kerberos
You probably need something else for LDAP.
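To see which SASL backend is available in a given environment, a small check like the following can help. The order (sasl first, then puresasl as the fallback) is taken from the comment above, not from the impyla source; pick_sasl_backend and its is_available parameter are hypothetical names.

```python
import importlib.util


def pick_sasl_backend(is_available=None):
    """Return 'sasl', 'puresasl', or None, preferring the native binding."""
    if is_available is None:
        # Default check: can this interpreter locate the module at all?
        def is_available(name):
            return importlib.util.find_spec(name) is not None
    for name in ("sasl", "puresasl"):
        if is_available(name):
            return name
    return None


if __name__ == "__main__":
    print("SASL backend found:", pick_sasl_backend())
```

If this prints sasl after you meant to switch to puresasl, the native binding is still installed in that environment.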
Hi mariusvniekerk,
Thanks for the help.
I don't have the Anaconda libraries deployed on the server.
As you mentioned in the earlier comment, I have already deployed the puresasl libraries; the libraries deployed on the server are listed below, but no luck.
May I know exactly which package needs to be removed before deploying puresasl?
[Mynode ~]$ls -ltr /usr/lib64/python2.6/site-packages/sasl
-rw-r--r-- 1 root root 5246 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.py
-rwxr-xr-x 1 root root 55448 Apr 3 2012 /usr/lib64/python2.6/site-packages/_saslwrapper.so
-rw-r--r-- 2 root root 9513 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.pyo
-rw-r--r-- 2 root root 9513 Apr 3 2012 /usr/lib64/python2.6/site-packages/saslwrapper.pyc
/usr/lib64/python2.6/site-packages/sasl:
total 552
-rw-r--r-- 1 root root 3219 Feb 26 2016 saslwrapper.pyx
-rw-r--r-- 1 root root 14903 Feb 26 2016 saslwrapper.h
-rw-r--r-- 1 root root 169956 Feb 26 2016 saslwrapper.cpp
-rw-r--r-- 1 root root 609 Feb 26 2016 __init__.py
-rwxr-xr-x 1 root root 360859 Feb 26 2016 saslwrapper.so
-rw-r--r-- 1 root root 185 Feb 26 2016 __init__.pyc
/usr/lib64/python2.6/site-packages/sasl-0.2.1-py2.6.egg-info:
total 24
-rw-r--r-- 1 root root 3 Feb 26 2016 requires.txt
-rw-r--r-- 1 root root 248 Feb 26 2016 PKG-INFO
-rw-r--r-- 1 root root 5 Feb 26 2016 top_level.txt
-rw-r--r-- 1 root root 1 Feb 26 2016 dependency_links.txt
-rw-r--r-- 1 root root 259 Feb 26 2016 SOURCES.txt
-rw-r--r-- 1 root root 206 Feb 26 2016 installed-files.txt
/usr/lib64/python2.6/site-packages/sasl-0.2.1-py2.6-linux-x86_64.egg:
total 8
drwxr-x--- 2 root root 4096 Dec 23 00:06 sasl
drwxr-x--- 2 root root 4096 Dec 23 00:06 EGG-INFO
[mynode~]$ls -ltr /usr/lib/python2.6/site-packages/sasl
total 8
drwxr-x--- 2 root root 4096 Dec 15 16:03 EGG-INFO
drwxr-x--- 2 root root 4096 Dec 15 16:03 thrift_sasl
Thanks, Purna.
I would strongly advise against modifying the system python in any way. Either use virtualenv or miniconda to make a self-contained python 3.5 installation.
for what it's worth, this solved it for me
apt-get install libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit
running python in Docker
I ran the following command and it fixed my issue. There might be some issues on Red Hat with missing dependencies:
yum install cyrus-sasl-md5 cyrus-sasl-plain cyrus-sasl-gssapi cyrus-sasl-devel -y
If it can help anyone, here's a Dockerfile for Python 3.7:
FROM python:3.7
RUN apt-get update && apt-get install -y --no-install-recommends \
    libsasl2-dev \
    libsasl2-2 \
    libsasl2-modules-gssapi-mit \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD [ "python", "./cluster.py" ]
Setting the following environment variable worked for me: SASL_PATH=/usr/lib/x86_64-linux-gnu/sasl2 (this is on Ubuntu).
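If you prefer to set SASL_PATH from Python instead of the shell, a sketch like the one below can be run before the first connection attempt in the process. It assumes the plugins are shared libraries named lib*.so* in the given directory, which is how Cyrus SASL plugins are usually laid out on Linux; list_sasl_plugins and use_sasl_path are hypothetical helpers.

```python
import glob
import os


def list_sasl_plugins(plugin_dir):
    """Return the sorted file names of shared-library plugins in plugin_dir."""
    pattern = os.path.join(plugin_dir, "lib*.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


def use_sasl_path(plugin_dir, environ=os.environ):
    """Set SASL_PATH if plugin_dir holds plugins; return the plugin names."""
    plugins = list_sasl_plugins(plugin_dir)
    if plugins:
        environ["SASL_PATH"] = plugin_dir
    return plugins


# Hypothetical usage with the Ubuntu path from the comment above:
#   print(use_sasl_path("/usr/lib/x86_64-linux-gnu/sasl2"))
```

An empty result means the directory holds no plugin libraries, in which case setting the variable would not help and it is left unset.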
@alexciobanu is the best. This worked with RHEL7
Try these versions:
pip3 install PyHive sasl==0.2.1 thrift==0.10.0 thrift-sasl==0.3.0
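To confirm that an environment actually has the pinned versions from the command above, something like this sketch can be used. It relies on importlib.metadata from the standard library (Python 3.8+); pin_mismatches is a hypothetical helper, and PyHive is left out because the command does not pin it.

```python
from importlib import metadata

# Pins copied from the pip3 command above.
PINS = {"sasl": "0.2.1", "thrift": "0.10.0", "thrift-sasl": "0.3.0"}


def pin_mismatches(pins=PINS, get_version=metadata.version):
    """Return {package: installed version or None} for packages off their pin."""
    mismatches = {}
    for package, wanted in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # Package is not installed at all.
        if installed != wanted:
            mismatches[package] = installed
    return mismatches


if __name__ == "__main__":
    print(pin_mismatches() or "all pinned versions match")
```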
for what it's worth, this solved it for me
apt-get install libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit
running python in Docker
It works.
When trying to connect to an Impala Server, the following error happens:
Even with all LDAP dependencies installed.