cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

Broken thriftpy dependency with python3 when installing v0.14.2 #335

Open gjask opened 5 years ago

gjask commented 5 years ago

Hi. After upgrading to 0.14.2, the thriftpy dependency appears to be broken on Python 3: it is no longer installed alongside impyla. The steps below reproduce the bug.

Installation

[jask@hal3042 ~]$ virtualenv -p /usr/bin/python3 venv
Already using interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/jask/venv/bin/python3
Also creating executable in /home/jask/venv/bin/python
Installing setuptools, pip, wheel...done.

[jask@hal3042 ~]$ ./venv/bin/pip install impyla
Collecting impyla
Collecting thrift>=0.9.3 (from impyla)
Collecting bitarray (from impyla)
Collecting six (from impyla)
  Using cached https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Installing collected packages: six, thrift, bitarray, impyla
Successfully installed bitarray-0.8.3 impyla-0.14.2 six-1.12.0 thrift-0.11.0

Run

[jask@hal3042 ~]$ ./venv/bin/python
Python 3.7.2 (default, Jan 16 2019, 19:49:22) 
[GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from impala.dbapi import connect
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jask/venv/lib/python3.7/site-packages/impala/dbapi.py", line 28, in <module>
    import impala.hiveserver2 as hs2
  File "/home/jask/venv/lib/python3.7/site-packages/impala/hiveserver2.py", line 33, in <module>
    from impala._thrift_api import (
  File "/home/jask/venv/lib/python3.7/site-packages/impala/_thrift_api.py", line 61, in <module>
    from thriftpy import load
ModuleNotFoundError: No module named 'thriftpy'

dknupp commented 5 years ago

Thanks. Working on a patch for this now. For now, Python 2.x seems unaffected. For Python 3, running pip install thriftpy separately should fix the problem.
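Before or after applying that workaround, it can help to confirm which modules the interpreter can actually find. The following is a minimal sketch, not part of impyla: the helper name missing_modules is hypothetical, and the module names are simply the ones that appear in the install log and traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Names taken from the pip log and the traceback in this thread.
    missing = missing_modules(["thrift", "thriftpy", "six", "bitarray"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

Run it with the virtualenv's interpreter (for example ./venv/bin/python) so that it inspects the same environment impyla was installed into.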

jirislav commented 5 years ago

I think build tests would be useful here, so that this does not happen again.

dknupp commented 5 years ago

This has indeed been an educational experience.

dknupp commented 5 years ago

Because 0.14.2 broke pip installs (see issue #336), a patched release has been published that essentially just restores the last working version.

If 0.14.2 is already installed, an upgrade will install the new release, which simply repackages the last working version. This should unblock installations until this and other build problems can be addressed.

(impyla_test_venv) $ pip list | grep impyla
impyla     0.14.2 
(impyla_test_venv) $ pip install -U impyla
Collecting impyla
Requirement already satisfied, skipping upgrade: six in ./impyla_test_venv/lib/python2.7/site-packages (from impyla) (1.12.0)
Requirement already satisfied, skipping upgrade: bitarray in ./impyla_test_venv/lib/python2.7/site-packages (from impyla) (0.8.3)
Collecting thrift<=0.9.3 (from impyla)
Installing collected packages: thrift, impyla
  Found existing installation: thrift 0.11.0
    Uninstalling thrift-0.11.0:
      Successfully uninstalled thrift-0.11.0
  Found existing installation: impyla 0.14.2
    Uninstalling impyla-0.14.2:
      Successfully uninstalled impyla-0.14.2
Successfully installed impyla-0.14.2.2 thrift-0.9.3

JoshRosen commented 5 years ago

I also ran into this problem. It looks like one potential solution is to modify setup.py to conditionally install thriftpy (or thriftpy2, as suggested in #329) for Python 3.
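As a rough illustration of that idea, a dependency list with a PEP 508 environment marker might look like the sketch below. This is an assumption about how it could be written, not the actual impyla setup.py: the function name build_install_requires is hypothetical, and the base requirements are only the ones visible in the pip log earlier in this thread.

```python
# Requirements that pip collected for impyla 0.14.2 in the log above.
BASE_REQUIREMENTS = ["six", "bitarray", "thrift>=0.9.3"]

# The marker after the semicolon tells pip to install thriftpy only when
# the target interpreter is Python 3 or newer.
PY3_ONLY_REQUIREMENTS = ['thriftpy; python_version >= "3"']


def build_install_requires():
    """Return the combined requirement strings for install_requires."""
    return BASE_REQUIREMENTS + PY3_ONLY_REQUIREMENTS


# In a setup.py this list would be passed along the lines of:
#   setuptools.setup(..., install_requires=build_install_requires())
```

Because the marker is evaluated by the installer rather than by setup.py, the same package metadata can serve both Python 2 and Python 3.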

@dknupp, it looks like you've done this in https://github.com/cloudera/impyla/compare/master...dknupp:install_thriftpy_python3. I'm not super familiar with Python "environment markers" in dependencies, so while reading up on them I came across https://github.com/inveniosoftware/troubleshooting/issues/1 which describes some potential pitfalls and corner-cases when using older setuptools versions. If you're worried about those older versions then it looks like the approach taken at https://github.com/abseil/abseil-py/commit/2b6ff1281d517df54570c68f2ff69ca9f55ff32b#diff-2eeaed663bd0d25b7e608891384b7298R43 might work.
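For older setuptools versions that do not understand environment markers, the alternative is to compute the list inside setup.py at build time. A minimal sketch, assuming the same requirement names as in the pip log above (the function install_requires_for is hypothetical and is not copied from the linked abseil commit):

```python
import sys


def install_requires_for(version_info=sys.version_info):
    """Build the requirement list for the interpreter that runs setup.py."""
    requirements = ["six", "bitarray", "thrift>=0.9.3"]
    if version_info[0] >= 3:
        # thriftpy is only imported on the Python 3 code path.
        requirements.append("thriftpy")
    return requirements
```

One trade-off to keep in mind: the decision is made by whichever interpreter runs setup.py, so a wheel built this way carries the dependency list of the Python version that built it.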

dknupp commented 5 years ago

Alpha release 0.15a1 is now up on PyPI for folks to try out: pip install impyla==0.15a1

645187919 commented 4 years ago

> Alpha release 0.15a1 is now up on pypi for folks to try out. pip install impyla==0.15a1

I installed with pip install impyla==0.15a1, but I still get an error:

Traceback (most recent call last):
  File "hive_test.py", line 9, in <module>
    conn = connect(host="192.168.112.20", port=10000, database='test',auth_mechanism="PLAIN")
  File "/root/anaconda3/lib/python3.6/site-packages/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism, krb_host=krb_host)
  File "/root/anaconda3/lib/python3.6/site-packages/impala/hiveserver2.py", line 778, in connect
    auth_mechanism, user, password)
  File "/root/anaconda3/lib/python3.6/site-packages/impala/_thrift_api.py", line 154, in get_transport
    from thrift_sasl import TSaslClientTransport
  File "/root/anaconda3/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 26, in <module>
    from thrift_sasl.six import (
  File "/root/anaconda3/lib/python3.6/site-packages/thrift_sasl/six.py", line 45, in <module>
    from thriftpy.transport import TTransportException, TTransportBase, readall
ModuleNotFoundError: No module named 'thriftpy'

Can anyone help? My environment is Python 3.6 on Linux.

AbdealiLoKo commented 4 years ago

I wonder whether thriftpy2 is still needed for Python 3. See: https://github.com/cloudera/impyla/blob/0284cc0850a1acf53507ba1366022fc36e6df517/impala/_thrift_api.py#L14-L18

The comment there seems to indicate that it is only a compatibility layer until thrift supports Python 3. As far as I understand, thrift now does support Python 3, so maybe this compatibility layer can simply be removed?
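To make the argument concrete, here is a toy sketch of the kind of selection such a compatibility layer performs. It is not the code in _thrift_api.py: the function thrift_backend and its thrift_supports_py3 flag are hypothetical, introduced only to show how the Python 3 branch would collapse.

```python
import sys


def thrift_backend(py_major=sys.version_info[0], thrift_supports_py3=False):
    """Name the Thrift library a compatibility layer would pick (illustrative only)."""
    if py_major == 2 or thrift_supports_py3:
        return "thrift"
    # Fallback used only while thrift itself cannot run on Python 3.
    return "thriftpy"
```

If thrift_supports_py3 is true, both interpreters resolve to thrift and the fallback branch is never taken, which is the case for removing the layer.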