cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

AttributeError: 'module' object has no attribute 'ensure_binary' #398

Open fletchjeff opened 4 years ago

fletchjeff commented 4 years ago

When I try connecting to an Impala host on an EDH cluster, I'm getting the following error:

AttributeError: 'module' object has no attribute 'ensure_binary'
AttributeError                            Traceback (most recent call last)
in engine
----> 1 conn = connect(host=IMPALA_HOST, port=21050, auth_mechanism='GSSAPI')

/home/cdsw/.local/lib/python2.7/site-packages/impala/dbapi.pyc in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol, krb_host, use_http_transport, http_path)
    148                           auth_mechanism=auth_mechanism, krb_host=krb_host,
    149                           use_http_transport=use_http_transport,
--> 150                           http_path=http_path)
    151     return hs2.HiveServer2Connection(service, default_db=database)
    152 

/home/cdsw/.local/lib/python2.7/site-packages/impala/hiveserver2.pyc in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism, krb_host, use_http_transport, http_path)
    823                                 auth_mechanism, user, password)
    824 
--> 825     transport.open()
    826     protocol = TBinaryProtocol(transport)
    827     if six.PY2:

/home/cdsw/.local/lib/python2.7/site-packages/thrift_sasl/__init__.pyc in open(self)
     86 
     87     # Send initial response
---> 88     self._send_message(self.START, chosen_mech)
     89     self._send_message(self.OK, initial_response)
     90 

/home/cdsw/.local/lib/python2.7/site-packages/thrift_sasl/__init__.pyc in _send_message(self, status, body)
    105   def _send_message(self, status, body):
    106     header = struct.pack(">BI", status, len(body))
--> 107     body = six.ensure_binary(body)
    108     self._trans.write(header + body)
    109     self._trans.flush()

AttributeError: 'module' object has no attribute 'ensure_binary'

The following Python packages are installed:

Package                            Version    
---------------------------------- -----------
alabaster                          0.7.7      
anaconda-client                    1.4.0      
anaconda-navigator                 1.1.0      
argcomplete                        1.0.0      
astropy                            1.1.2      
Babel                              2.2.0      
backports-abc                      0.4        
backports.shutil-get-terminal-size 1.0.0      
backports.ssl-match-hostname       3.4.0.2    
beautifulsoup4                     4.4.1      
bitarray                           0.8.1      
blaze                              0.9.1      
bokeh                              0.11.1     
boto                               2.39.0     
Bottleneck                         1.0.0      
cdecimal                           2.3        
cdsw                               1.0.0      
certifi                            2019.11.28 
cffi                               1.5.2      
chardet                            3.0.4      
chest                              0.2.3      
cloudpickle                        0.1.1      
clyent                             1.2.1      
colorama                           0.3.7      
conda                              4.0.5      
conda-build                        1.20.0     
conda-env                          2.4.5      
conda-manager                      0.3.1      
configobj                          5.0.6      
cryptography                       1.3        
cycler                             0.10.0     
Cython                             0.29.13    
cytoolz                            0.7.5      
dask                               0.8.1      
datashape                          0.5.1      
decorator                          4.4.1      
dill                               0.2.4      
docutils                           0.12       
dynd                               0.7.3.dev1 
enum34                             1.1.6      
et-xmlfile                         1.0.1      
fastcache                          1.0.2      
Flask                              0.10.1     
Flask-Cors                         2.1.2      
funcsigs                           0.4        
functools32                        3.2.3.post2
futures                            3.3.0      
gevent                             1.1.0      
greenlet                           0.4.9      
grin                               1.2.1      
h5py                               2.5.0      
HeapDict                           1.0.0      
idna                               2.8        
impyla                             0.16.2     
ipaddress                          1.0.14     
ipykernel                          4.3.1      
ipython                            5.1.0      
ipython-genutils                   0.2.0      
ipywidgets                         4.1.1      
itsdangerous                       0.24       
jdcal                              1.2        
jedi                               0.9.0      
Jinja2                             2.8        
jsonschema                         2.4.0      
jupyter                            1.0.0      
jupyter-client                     4.2.2      
jupyter-console                    4.1.1      
jupyter-core                       4.1.0      
kudu-python                        1.2.0      
llvmlite                           0.9.0      
locket                             0.2.0      
lxml                               4.4.2      
MarkupSafe                         0.23       
matplotlib                         2.0.0      
mistune                            0.7.2      
mpmath                             0.19       
multipledispatch                   0.4.8      
nbconvert                          4.1.0      
nbformat                           4.0.1      
networkx                           1.11       
nltk                               3.2        
nose                               1.3.7      
notebook                           4.1.0      
numba                              0.24.0     
numexpr                            2.5        
numpy                              1.16.5     
odo                                0.4.2      
openpyxl                           2.3.2      
pandas                             0.24.2     
pandas-datareader                  0.8.0      
partd                              0.3.2      
path.py                            0.0.0      
pathlib2                           2.3.5      
patsy                              0.4.0      
pep8                               1.7.0      
pexpect                            4.7.0      
pickleshare                        0.7.5      
Pillow                             3.1.1      
pip                                19.3.1     
ply                                3.8        
prompt-toolkit                     1.0.18     
psutil                             4.1.0      
ptyprocess                         0.6.0      
py                                 1.4.31     
py4j                               0.10.8.1   
pyasn1                             0.1.9      
pycairo                            1.10.0     
pycosat                            0.6.1      
pycparser                          2.14       
pycrypto                           2.6.1      
pycurl                             7.19.5.3   
pyflakes                           1.1.0      
Pygments                           2.5.2      
pyOpenSSL                          0.15.1     
pyparsing                          2.4.5      
pytest                             2.8.5      
python-dateutil                    2.8.1      
pytz                               2019.3     
PyYAML                             3.11       
pyzmq                              15.2.0     
QtAwesome                          0.3.2      
qtconsole                          4.2.0      
QtPy                               1.0        
redis                              2.10.3     
requests                           2.22.0     
rope                               0.9.4      
sasl                               0.2.1      
scandir                            1.10.0     
scikit-image                       0.12.3     
scikit-learn                       0.17.1     
scipy                              1.2.2      
seaborn                            0.9.0      
setuptools                         41.2.0     
simplegeneric                      0.8.1      
simplejson                         3.16.0     
singledispatch                     3.4.0.3    
six                                1.14.0     
snowballstemmer                    1.2.1      
sockjs-tornado                     1.0.1      
sphinx-rtd-theme                   0.1.9      
spyder                             2.3.8      
SQLAlchemy                         1.0.12     
statsmodels                        0.6.1      
subprocess32                       3.5.4      
sympy                              1.0        
tables                             3.2.2      
terminado                          0.5        
thrift                             0.9.3      
thrift-sasl                        0.4.2      
toolz                              0.7.4      
tornado                            4.3        
traitlets                          4.3.3      
unicodecsv                         0.14.1     
urllib3                            1.25.7     
wcwidth                            0.1.7      
Werkzeug                           0.11.4     
wheel                              0.33.6     
xlrd                               0.9.4      
XlsxWriter                         0.8.4      
xlwt                               1.0.0

dknupp commented 4 years ago

@fletchjeff -- My guess is that six has changed its API. Grr. A workaround for immediate use might be to try installing six==1.13.0.

dknupp commented 4 years ago

@fletchjeff -- actually, that's not the case. I think six 1.14.0 still supports ensure_binary. I'm not sure what's happening here.

nhyurd commented 4 years ago

Yep, seeing the same error with the example for CDSW 1.7.x when trying to connect to a secure CDH 6.x cluster:

https://docs.cloudera.com/documentation/data-science-workbench/1-7-x/topics/cdsw_import_data.html#query__impyla

The odd thing is that it looks like that attribute is there in the six module: https://six.readthedocs.io/#six.ensure_binary

And it was added in the 1.12.0 version: https://raw.githubusercontent.com/benjaminp/six/master/CHANGES
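
One way to narrow this down is to check which six the interpreter actually imports, since the installed version and the loaded file can differ. A minimal sketch using only the standard library; `inspect_module` is a hypothetical helper name, not part of six or impyla:

```python
import importlib


def inspect_module(name, attribute):
    """Report where a module was loaded from and whether it has an attribute.

    Hypothetical helper for illustration; returns None if the import fails.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_attribute": hasattr(module, attribute),
    }


# Shows the path and version of the six that is really in use (or None).
print(inspect_module("six", "ensure_binary"))
```

If `file` points somewhere other than the directory pip installs into, or `version` is older than 1.12.0, that would suggest a different copy of six is shadowing the installed one.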

kylestahl commented 3 years ago

Any update on this? I'm running into the same problem. I've tried multiple versions of six and impyla and cannot get it to work.

t3rmin4t0r commented 3 years ago

I hit the same issue because of the six.py bundled with the engine:

>> six.__file__
/var/lib/cdsw/python3-engine-deps/lib/python3.6/site-packages/six.py

>> import importlib, sys
>> sys.path = ["/usr/local/lib/python3.6/dist-packages/pip/_vendor/"] + sys.path
>> importlib.reload(six)

>> six.__file__
/usr/local/lib/python3.6/dist-packages/pip/_vendor/six.py
>> six.ensure_binary
<function six.ensure_binary>
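
The session above can be wrapped in a small helper. This is a sketch of the same workaround, assuming Python 3.4+ (for `importlib.reload`); `reload_preferring` is a hypothetical name, and the directory in the usage comment is the one from the session, which may differ on other engines:

```python
import importlib
import sys


def reload_preferring(module, directory):
    """Reload `module`, searching `directory` before the rest of sys.path."""
    # Move the directory to the front so its copy of the module wins.
    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)
    # Drop cached directory listings so the new path is searched afresh.
    importlib.invalidate_caches()
    return importlib.reload(module)


# Usage, mirroring the session (the path is specific to that environment):
#   import six
#   six = reload_preferring(six, "/usr/local/lib/python3.6/dist-packages/pip/_vendor/")
```

Because `importlib.reload` re-executes the module inside the existing module object, code that has already imported six should see the reloaded attributes, provided the reload runs before `connect` is called.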

VincentCroquette commented 3 years ago

Same issue here: the Python script works when run manually but fails when run as a job. Thanks to @t3rmin4t0r's hack I managed to make it work, but a proper solution would be much better. I'm not sure whether the issue is in CDSW or the impyla library. Thanks.