snowflakedb / snowflake-connector-python

Snowflake Connector for Python
https://pypi.python.org/pypi/snowflake-connector-python/
Apache License 2.0

SNOW-173284: 'fetch_pandas_all' results in error 'pyarrow package is missing' #336

Closed leifericf closed 4 years ago

leifericf commented 4 years ago

Python version: 3.7.6

Operating system and processor architecture: Darwin-19.4.0-x86_64-i386-64bit

Component versions in the environment:

Package                               Version
------------------------------------- -----------------
adal                                  1.2.3
alabaster                             0.7.12
anaconda-client                       1.7.2
anaconda-navigator                    1.9.12
anaconda-project                      0.8.3
antlr4-python3-runtime                4.7.2
applaunchservices                     0.2.1
applicationinsights                   0.11.9
appnope                               0.1.0
appscript                             1.0.1
argcomplete                           1.11.1
argh                                  0.26.2
artifacts-keyring                     0.2.9
asn1crypto                            1.3.0
astroid                               2.3.3
astropy                               4.0
atomicwrites                          1.3.0
attrs                                 19.3.0
autopep8                              1.4.4
azure-batch                           9.0.0
azure-cli                             2.5.1
azure-cli-command-modules-nspkg       2.0.3
azure-cli-core                        2.5.1
azure-cli-nspkg                       3.0.4
azure-cli-telemetry                   1.0.4
azure-common                          1.1.25
azure-core                            1.6.0
azure-cosmos                          3.1.2
azure-datalake-store                  0.0.48
azure-functions-devops-build          0.0.22
azure-graphrbac                       0.60.0
azure-identity                        1.2.0
azure-keyvault                        1.1.0
azure-keyvault-secrets                4.1.0
azure-loganalytics                    0.1.0
azure-mgmt-advisor                    2.0.1
azure-mgmt-apimanagement              0.1.0
azure-mgmt-appconfiguration           0.4.0
azure-mgmt-applicationinsights        0.1.1
azure-mgmt-authorization              0.52.0
azure-mgmt-batch                      7.0.0
azure-mgmt-batchai                    2.0.0
azure-mgmt-billing                    0.2.0
azure-mgmt-botservice                 0.2.0
azure-mgmt-cdn                        4.1.0rc1
azure-mgmt-cognitiveservices          5.0.0
azure-mgmt-compute                    12.0.0
azure-mgmt-consumption                2.0.0
azure-mgmt-containerinstance          1.5.0
azure-mgmt-containerregistry          3.0.0rc13
azure-mgmt-containerservice           9.0.1
azure-mgmt-core                       1.0.0
azure-mgmt-cosmosdb                   0.13.0
azure-mgmt-datalake-analytics         0.2.1
azure-mgmt-datalake-nspkg             3.0.1
azure-mgmt-datalake-store             0.5.0
azure-mgmt-datamigration              0.1.0
azure-mgmt-deploymentmanager          0.2.0
azure-mgmt-devtestlabs                2.2.0
azure-mgmt-dns                        2.1.0
azure-mgmt-eventgrid                  2.2.0
azure-mgmt-eventhub                   3.0.0
azure-mgmt-hdinsight                  1.4.0
azure-mgmt-imagebuilder               0.2.1
azure-mgmt-iotcentral                 3.0.0
azure-mgmt-iothub                     0.11.0
azure-mgmt-iothubprovisioningservices 0.2.0
azure-mgmt-keyvault                   2.2.0
azure-mgmt-kusto                      0.3.0
azure-mgmt-loganalytics               0.5.0
azure-mgmt-managedservices            1.0.0
azure-mgmt-managementgroups           0.2.0
azure-mgmt-maps                       0.1.0
azure-mgmt-marketplaceordering        0.2.1
azure-mgmt-media                      1.1.1
azure-mgmt-monitor                    0.9.0
azure-mgmt-msi                        0.2.0
azure-mgmt-netapp                     0.8.0
azure-mgmt-network                    10.1.0
azure-mgmt-nspkg                      3.0.2
azure-mgmt-policyinsights             0.4.0
azure-mgmt-privatedns                 0.1.0
azure-mgmt-rdbms                      2.2.0
azure-mgmt-recoveryservices           0.4.0
azure-mgmt-recoveryservicesbackup     0.6.0
azure-mgmt-redhatopenshift            0.1.0
azure-mgmt-redis                      7.0.0rc1
azure-mgmt-relay                      0.1.0
azure-mgmt-reservations               0.6.0
azure-mgmt-resource                   9.0.0
azure-mgmt-search                     2.1.0
azure-mgmt-security                   0.1.0
azure-mgmt-servicebus                 0.6.0
azure-mgmt-servicefabric              0.4.0
azure-mgmt-signalr                    0.3.0
azure-mgmt-sql                        0.18.0
azure-mgmt-sqlvirtualmachine          0.5.0
azure-mgmt-storage                    9.0.0
azure-mgmt-trafficmanager             0.51.0
azure-mgmt-web                        0.44.0
azure-multiapi-storage                0.3.2
azure-nspkg                           3.0.2
azure-storage-blob                    1.5.0
azure-storage-common                  1.4.2
azureml-automl-core                   1.5.0.post2
azureml-core                          1.5.0.post4
azureml-dataprep                      1.5.0
azureml-dataprep-native               14.2.0
azureml-pipeline                      1.5.0
azureml-pipeline-core                 1.5.0
azureml-pipeline-steps                1.5.0
azureml-sdk                           1.5.0
azureml-telemetry                     1.5.0
azureml-train                         1.5.0
azureml-train-automl-client           1.5.0.post1
azureml-train-core                    1.5.0
azureml-train-restclients-hyperdrive  1.5.0
Babel                                 2.8.0
backcall                              0.1.0
backports.functools-lru-cache         1.6.1
backports.shutil-get-terminal-size    1.0.0
backports.tempfile                    1.0
backports.weakref                     1.0.post1
bcrypt                                3.1.7
beautifulsoup4                        4.8.2
bitarray                              1.2.1
bkcharts                              0.2
bleach                                3.1.0
bokeh                                 1.4.0
boto                                  2.49.0
boto3                                 1.11.17
botocore                              1.14.17
Bottleneck                            1.3.2
Brotli                                1.0.7
certifi                               2019.11.28
cffi                                  1.13.2
chardet                               3.0.4
Click                                 7.0
cloudpickle                           1.3.0
clyent                                1.2.2
colorama                              0.4.3
conda                                 4.8.2
conda-build                           3.18.11
conda-package-handling                1.6.0
conda-verify                          3.4.2
contextlib2                           0.6.0.post1
cryptography                          2.8
cycler                                0.10.0
Cython                                0.29.15
cytoolz                               0.10.1
dash                                  1.12.0
dash-core-components                  1.10.0
dash-html-components                  1.0.3
dash-renderer                         1.4.1
dash-table                            4.7.0
dask                                  2.11.0
databricks-cli                        0.11.0
decorator                             4.4.1
defusedxml                            0.6.0
diff-match-patch                      20181111
distributed                           2.11.0
distro                                1.5.0
docker                                4.2.1
docutils                              0.15.2
dotnetcore2                           2.1.14
entrypoints                           0.3
et-xmlfile                            1.0.1
fabric                                2.5.0
fastcache                             1.1.0
filelock                              3.0.12
flake8                                3.7.9
Flask                                 1.1.1
Flask-Compress                        1.5.0
fsspec                                0.6.2
fusepy                                3.0.1
future                                0.18.2
gevent                                1.4.0
glob2                                 0.7
gmpy2                                 2.0.8
greenlet                              0.4.15
h5py                                  2.10.0
HeapDict                              1.0.1
html5lib                              1.0.1
humanfriendly                         8.2
hypothesis                            5.5.4
idna                                  2.8
ijson                                 2.6.1
imageio                               2.6.1
imagesize                             1.2.0
importlib-metadata                    1.5.0
intervaltree                          3.0.2
invoke                                1.4.1
ipykernel                             5.1.4
ipython                               7.12.0
ipython-genutils                      0.2.0
ipywidgets                            7.5.1
isodate                               0.6.0
isort                                 4.3.21
itsdangerous                          1.1.0
javaproperties                        0.5.1
jdcal                                 1.4.1
jedi                                  0.14.1
jeepney                               0.4.3
Jinja2                                2.11.1
jmespath                              0.10.0
joblib                                0.14.1
jsmin                                 2.2.2
json5                                 0.9.1
jsondiff                              1.2.0
jsonpickle                            1.4.1
jsonschema                            3.2.0
jupyter                               1.0.0
jupyter-client                        5.3.4
jupyter-console                       6.1.0
jupyter-core                          4.6.1
jupyterlab                            1.2.6
jupyterlab-server                     1.0.6
keyring                               21.1.0
kiwisolver                            1.1.0
knack                                 0.7.0rc4
lazy-object-proxy                     1.4.3
libarchive-c                          2.8
lief                                  0.9.0
llvmlite                              0.31.0
locket                                0.2.0
lxml                                  4.5.0
MarkupSafe                            1.1.1
matplotlib                            3.1.3
mccabe                                0.6.1
mistune                               0.8.4
mkl-fft                               1.0.15
mkl-random                            1.1.0
mkl-service                           2.3.0
mock                                  4.0.1
more-itertools                        8.2.0
mpmath                                1.1.0
msal                                  1.0.0
msal-extensions                       0.1.3
msgpack                               0.6.1
msrest                                0.6.14
msrestazure                           0.6.3
multipledispatch                      0.6.0
navigator-updater                     0.2.1
nbconvert                             5.6.1
nbformat                              5.0.4
ndg-httpsclient                       0.5.1
networkx                              2.4
nltk                                  3.4.5
nose                                  1.3.7
notebook                              6.0.3
numba                                 0.48.0
numexpr                               2.7.1
numpy                                 1.18.1
numpydoc                              0.9.2
oauthlib                              3.1.0
olefile                               0.46
openpyxl                              3.0.3
oscrypto                              1.2.0
packaging                             20.1
pandas                                1.0.1
pandocfilters                         1.4.2
paramiko                              2.7.1
parso                                 0.5.2
partd                                 1.1.0
path                                  13.1.0
pathlib2                              2.3.5
pathspec                              0.8.0
pathtools                             0.1.2
patsy                                 0.5.1
pep8                                  1.7.1
pexpect                               4.8.0
pickleshare                           0.7.5
Pillow                                7.0.0
pip                                   20.1.1
pkginfo                               1.5.0.1
plotly                                4.8.1
pluggy                                0.13.1
ply                                   3.11
portalocker                           1.7.0
prometheus-client                     0.7.1
prompt-toolkit                        3.0.3
psutil                                5.6.7
ptyprocess                            0.6.0
py                                    1.8.1
pyasn1                                0.4.8
pycodestyle                           2.5.0
pycosat                               0.6.3
pycparser                             2.19
pycrypto                              2.6.1
pycryptodomex                         3.9.7
pycurl                                7.43.0.5
pydocstyle                            4.0.1
pyflakes                              2.1.1
Pygments                              2.5.2
PyJWT                                 1.7.1
pylint                                2.4.4
PyNaCl                                1.4.0
pyodbc                                4.0.0-unsupported
pyOpenSSL                             19.1.0
pyparsing                             2.4.6
pyrsistent                            0.15.7
PySocks                               1.7.1
pytest                                5.3.5
pytest-arraydiff                      0.3
pytest-astropy                        0.8.0
pytest-astropy-header                 0.1.2
pytest-doctestplus                    0.5.0
pytest-openfiles                      0.4.0
pytest-remotedata                     0.3.2
python-dateutil                       2.8.1
python-jsonrpc-server                 0.3.4
python-language-server                0.31.7
pytz                                  2019.1
PyWavelets                            1.1.1
PyYAML                                5.3
pyzmq                                 18.1.1
QDarkStyle                            2.8
QtAwesome                             0.6.1
qtconsole                             4.6.0
QtPy                                  1.9.0
requests                              2.22.0
requests-oauthlib                     1.3.0
retrying                              1.3.3
rope                                  0.16.0
Rtree                                 0.9.3
ruamel-yaml                           0.15.87
ruamel.yaml                           0.16.10
ruamel.yaml.clib                      0.2.0
s3transfer                            0.3.3
scikit-image                          0.16.2
scikit-learn                          0.22.1
scipy                                 1.4.1
scp                                   0.13.2
seaborn                               0.10.0
SecretStorage                         3.1.2
Send2Trash                            1.5.0
setuptools                            47.1.1
simplegeneric                         0.8.1
singledispatch                        3.4.0.3
six                                   1.14.0
snowballstemmer                       2.0.0
snowflake-connector-python            2.2.8
sortedcollections                     1.1.2
sortedcontainers                      2.1.0
soupsieve                             1.9.5
Sphinx                                2.4.0
sphinxcontrib-applehelp               1.0.1
sphinxcontrib-devhelp                 1.0.1
sphinxcontrib-htmlhelp                1.0.2
sphinxcontrib-jsmath                  1.0.1
sphinxcontrib-qthelp                  1.0.2
sphinxcontrib-serializinghtml         1.1.3
sphinxcontrib-websupport              1.2.0
spyder                                4.0.1
spyder-kernels                        1.8.1
SQLAlchemy                            1.3.13
sshtunnel                             0.1.5
statsmodels                           0.11.0
sympy                                 1.5.1
tables                                3.6.1
tabulate                              0.8.7
tblib                                 1.6.0
terminado                             0.8.3
testpath                              0.4.4
toolz                                 0.10.0
tornado                               6.0.3
tqdm                                  4.42.1
traitlets                             4.3.3
ujson                                 1.35
unicodecsv                            0.14.1
urllib3                               1.25.8
vsts                                  0.1.25
vsts-cd-manager                       1.0.2
watchdog                              0.10.2
wcwidth                               0.1.8
webencodings                          0.5.1
websocket-client                      0.56.0
Werkzeug                              1.0.0
wheel                                 0.34.2
widgetsnbextension                    3.5.1
wrapt                                 1.11.2
wurlitzer                             2.0.0
xlrd                                  1.2.0
XlsxWriter                            1.2.7
xlwings                               0.17.1
xlwt                                  1.3.0
xmltodict                             0.12.0
yapf                                  0.28.0
zict                                  1.0.0
zipp                                  2.2.0

To reproduce the error, call the fetch_pandas_all() function, like so:

import snowflake.connector

def execute_query(query_string):
    cursor = snowflake.connector.connect(…).cursor()
    try:
        cursor.execute(query_string)
        result = cursor.fetch_pandas_all()
    finally:
        cursor.close()
    return result

query_string = 'select a.col1, a.col2 from my_database.my_schema.my_table as t limit 100;'

data = execute_query(query_string)

That will result in this pyarrow-related error:

Exception has occurred: ProgrammingError
255002: pyarrow package is missing. Install using pip if the platform is supported.

I would expect the snowflake-connector-python package to install its own dependencies as needed.

Note that using the fetchall() function works fine:

def execute_query(query_string):
    cursor = get_connection().cursor()
    try:
        cursor.execute(query_string)
        result = cursor.fetchall()
    finally:
        cursor.close()
    return result

The issue seems to be related to converting the SQL query result to a Pandas dataframe.
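
As a stopgap while pyarrow is unavailable, the working fetchall() path can still produce labelled rows. The following is only a sketch: rows_as_dicts is a hypothetical helper, not part of the connector, and it assumes the cursor follows the DB-API 2.0 convention that each cursor.description entry starts with the column name. An in-memory sqlite3 database stands in for Snowflake so the example runs anywhere.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Turn an executed DB-API cursor's remaining rows into a list of dicts."""
    # Per DB-API 2.0, the first item of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in demo: sqlite3 replaces the Snowflake connection (an assumption made
# purely so that the sketch is runnable without an account).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("select 1 as col1, 'x' as col2")
records = rows_as_dicts(cur)
print(records)
conn.close()
```

With pandas installed, pandas.DataFrame(records) builds a DataFrame from that list, though presumably without the Arrow-based speed that fetch_pandas_all() is meant to provide.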

Detailed execution log for debugging:

2020-07-07 11:48:16,365 - MainThread connection.py:485 - cursor() - DEBUG - cursor
2020-07-07 11:48:16,365 - MainThread cursor.py:459 - execute() - DEBUG - executing SQL/command
2020-07-07 11:48:16,365 - MainThread cursor.py:482 - execute() - DEBUG - binding: [select a.col1, a.col2 from my_database.my_schema.my_table as t limit 1...] with input=[None], processed=[{}]
2020-07-07 11:48:16,365 - MainThread cursor.py:514 - execute() - INFO - query: [select a.col1, a.col2 from my_database.my_schema.my_table as t limit 1...]
2020-07-07 11:48:16,365 - MainThread connection.py:981 - _next_sequence_counter() - DEBUG - sequence counter: 1
2020-07-07 11:48:16,366 - MainThread cursor.py:313 - _execute_helper() - DEBUG - running query [select a.col1, a.col2 from my_database.my_schema.my_table as t limit 1...]
2020-07-07 11:48:16,378 - MainThread cursor.py:322 - _execute_helper() - DEBUG - is_file_transfer: False
2020-07-07 11:48:16,379 - MainThread connection.py:760 - cmd_query() - DEBUG - _cmd_query
2020-07-07 11:48:16,379 - MainThread connection.py:782 - cmd_query() - DEBUG - sql=[select a.col1, a.col2 from my_database.my_schema.my_table as t limit 1...], sequence_id=[1], is_file_transfer=[None]
2020-07-07 11:48:16,379 - MainThread network.py:896 - _use_requests_session() - DEBUG - Active requests sessions: 1, idle: 0
2020-07-07 11:48:16,379 - MainThread network.py:593 - _request_exec_wrapper() - DEBUG - remaining request timeout: None, retry cnt: 1
2020-07-07 11:48:16,379 - MainThread network.py:738 - _request_exec() - DEBUG - socket timeout: 60
2020-07-07 11:48:16,531 - MainThread network.py:768 - _request_exec() - DEBUG - SUCCESS
2020-07-07 11:48:16,531 - MainThread network.py:909 - _use_requests_session() - DEBUG - Active requests sessions: 0, idle: 1
2020-07-07 11:48:16,531 - MainThread network.py:494 - _post_request() - DEBUG - ret[code] = None, after post request
2020-07-07 11:48:16,532 - MainThread cursor.py:534 - execute() - DEBUG - sfqid: my_sfqid
2020-07-07 11:48:16,532 - MainThread cursor.py:536 - execute() - INFO - query execution done
2020-07-07 11:48:16,532 - MainThread cursor.py:538 - execute() - DEBUG - SUCCESS
2020-07-07 11:48:16,532 - MainThread cursor.py:542 - execute() - DEBUG - PUT OR GET: None
2020-07-07 11:48:16,532 - MainThread cursor.py:602 - _init_result_and_meta() - DEBUG - Query result format: arrow
2020-07-07 11:48:16,533 - MainThread cursor.py:622 - _init_result_and_meta() - DEBUG - Batches read: 1

sfc-gh-mkeller commented 4 years ago

Hi @IRLeif, the connector knows which dependencies to install if you tell it that you need pandas (and pyarrow). We have an optional dependency group called pandas. Install the connector like this: pip install snowflake-connector-python[pandas]. The documentation is here: https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation
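
For anyone landing here with the same error, a quick way to see whether the optional packages are actually present in the active environment is sketched below. missing_modules is a hypothetical helper, not something the connector ships, and the check only covers importability: it does not confirm that the installed pyarrow version is one the connector accepts.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# pandas and pyarrow are the two packages named in this thread.
missing = missing_modules(["pandas", "pyarrow"])
if missing:
    print("Not importable:", ", ".join(missing))
    # Quoting the requirement keeps shells such as zsh from globbing the brackets.
    print('Try: pip install "snowflake-connector-python[pandas]"')
else:
    print("pandas and pyarrow are importable in this environment.")
```

Running this with the same interpreter that runs the query code matters, since a conda or virtualenv setup may have several Python installations with different packages.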

leifericf commented 4 years ago

Aha! It was my mistake. I was unfamiliar with the concept of optional dependency groups in general, and I had missed that part of the Snowflake documentation in particular. After adding [pandas] to my pip install command, everything is now working smoothly. Thank you for taking the time to comment.