snowflakedb / snowflake-connector-python

Snowflake Connector for Python
https://pypi.python.org/pypi/snowflake-connector-python/
Apache License 2.0

SNOW-962394: pyarrow incompatible with pandas for 3.4.0 but no such error using 3.3.0b1 #1796

Closed matquant14 closed 10 months ago

matquant14 commented 10 months ago

Python version

Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)]

Operating system and processor architecture

Windows-10-10.0.19045-SP0

Installed packages

adbc-driver-manager==0.7.0
adbc-driver-snowflake==0.7.0
adbc-driver-sqlite==0.6.0
aiobotocore==2.5.4
aiohttp==3.8.6
aioitertools==0.11.0
aiosignal==1.3.1
annotated-types==0.5.0
anyio==4.0.0
appdirs==1.4.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
arrow-odbc==1.3.0
asn1crypto==1.5.1
asttokens==2.4.0
async-generator==1.10
async-lru==2.0.4
async-timeout==4.0.2
attrs==23.1.0
Babel==2.13.0
backcall==0.2.0
beautifulsoup4==4.12.2
black==23.10.1
bleach==6.0.0
botocore==1.31.17
Bottleneck==1.3.7
cachetools==5.3.1
certifi==2023.7.22
cffi==1.16.0
charset-normalizer==3.3.0
click==8.1.6
cloudpickle==2.2.1
colorama==0.4.6
comm==0.1.4
connectorx==0.3.2
contourpy==1.0.7
cryptography==41.0.4
cycler==0.11.0
DatastreamPy==1.0.12
DateTime==5.2
db-dtypes==1.1.1
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
deltalake==0.10.1
distro==1.8.0
et-xmlfile==1.1.0
exceptiongroup==1.1.1
executing==2.0.0
fastjsonschema==2.18.1
filelock==3.12.0
fonttools==4.43.0
fqdn==1.5.1
frozenlist==1.3.3
fsspec==2023.9.2
gcsfs==2023.5.0
gevent==23.9.1
googleapis-common-protos==1.61.0
greenlet==3.0.1
grpcio==1.59.2
grpcio-status==1.59.2
h11==0.14.0
html5lib==1.1
httpcore==1.0.0
httpx==0.25.0
idna==3.4
ipykernel==6.26.0
ipython==8.17.2
ipython-genutils==0.2.0
ipywidgets==8.1.1
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.2
jmespath==1.0.1
JPype1==1.4.1
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-lsp==2.2.0
jupyter_client==8.6.0
jupyter_core==5.5.0
jupyter_server==2.10.0
jupyter_server_terminals==0.4.4
jupyterlab==4.0.8
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.9
jupyterlab_server==2.25.0
kiwisolver==1.4.5
llvmlite==0.41.0
lxml==4.9.3
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.1
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.2
multidict==6.0.4
mypy-extensions==1.0.0
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==7.11.0
nbformat==5.9.2
nest-asyncio==1.5.8
notebook==7.0.6
notebook_shim==0.2.3
numba==0.58.1
numexpr==2.8.7
numpy==1.25.2
oauthlib==3.2.2
odfpy==1.4.1
opencv-python==4.8.1.78
openpyxl==3.1.2
oscrypto==1.3.0
outcome==1.2.0
overrides==7.4.0
packaging==23.2
pandas==2.1.2
pandocfilters==1.5.0
parso==0.8.3
pathspec==0.11.1
patsy==0.5.3
pendulum==2.1.2
pickleshare==0.7.5
Pillow==10.1.0
pip-system-certs==4.0
platformdirs==3.8.1
polars==0.19.12
prometheus-client==0.17.1
prompt-toolkit==3.0.39
proto-plus==1.22.3
protobuf==4.23.1
psutil==5.9.5
pure-eval==0.2.2
pyarrow==13.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pybrowsers==0.5.2
pycparser==2.21
pycryptodomex==3.18.0
pydantic==2.4.2
pydantic_core==2.10.1
pydata-google-auth==1.8.0
pyee==9.0.4
Pygments==2.16.1
pyhumps==3.0.2
PyJWT==2.7.0
PyMySQL==1.0.3
pyodbc==5.0.1
pyOpenSSL==23.2.0
pyparsing==3.1.1
PyPDF2==3.0.1
pyrsistent==0.19.3
PySocks==1.7.1
python-dateutil==2.8.2
python-dotenv==1.0.0
python-json-logger==2.0.7
pytz==2023.3
pytzdata==2020.1
pywin32==305
pywinpty==2.0.11
pyxlsb==1.0.10
PyYAML==6.0.1
pyzmq==25.1.1
qtconsole==5.4.3
QtPy==2.3.1
referencing==0.30.2
requests==2.31.0
requests-oauthlib==1.3.1
rfc3339-validator==0.1.4
rfc3986==2.0.0
rfc3986-validator==0.1.1
rich==13.5.3
rpds-py==0.10.4
rsa==4.9
s3fs==2023.9.2
scipy==1.11.3
seaborn==0.13.0
selenium==4.14.0
Send2Trash==1.8.2
setuptools-scm==8.0.4
simplejson==3.19.1
six==1.16.0
sniffio==1.3.0
snowflake-connector-python==3.4.0
snowflake-snowpark-python==1.10.0
sortedcontainers==2.4.0
soupsieve==2.5
SQLAlchemy==2.0.22
sqlglot==19.0.1
stack-data==0.6.3
statsmodels==0.14.0
strictyaml==1.7.3
tabula-py==2.8.2
tabulate==0.9.0
tenacity==8.0.1
terminado==0.17.1
tinycss2==1.2.1
tomlkit==0.12.1
tornado==6.3.3
tqdm==4.66.1
traitlets==5.11.2
trio==0.22.0
trio-websocket==0.10.2
types-python-dateutil==2.8.19.14
typing_extensions==4.8.0
tzdata==2023.3
uri-template==1.3.0
urllib3==1.26.17
watchdog==2.1.9
wcwidth==0.2.8
webcolors==1.13
webdriver-manager==4.0.1
webencodings==0.5.1
websocket-client==1.6.3
widgetsnbextension==4.0.9
wincertstore==0.2
wrapt==1.15.0
wsproto==1.2.0
xarray==2023.10.1
xlrd==2.0.1
xlsx2csv==0.8.1
XlsxWriter==3.1.9
yarl==1.9.2
zope.event==5.0
zope.interface==6.0

What did you do?

I just upgraded from 3.3.0b1 to 3.4.0, and when I import the connector I get a UserWarning, specifically:

"C:\~\Lib\site-packages\snowflake\connector\options.py:103: UserWarning: You have an incompatible version of 'pyarrow' installed (13.0.0), please install a version that adheres to: 'pyarrow<10.1.0,>=10.0.1; extra == "pandas"'
  warn_incompatible_dep(
Failed to import ArrowResult. No Apache Arrow result set format can be used. ImportError: DLL load failed while importing arrow_iterator: The specified procedure could not be found."

I never get this error when using 3.3.0b1.

What did you expect to see?

No UserWarning

Can you set logging to DEBUG and collect the logs?

import logging
import os

for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)

sfc-gh-achandrasekaran commented 10 months ago

Hello, yes: 3.3.0b1 is a beta version that has an experimental Arrow library which removes the pin on the pyarrow version. 3.4.0 is a production release that does not include this change. We plan to release a production version of the 3.3.0b1 changes this week, so please keep an eye out for that. In the meantime, please use a pyarrow version that is compatible with 3.4.0.
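
For reference, the constraint quoted in the warning above (pyarrow<10.1.0,>=10.0.1) can be checked before importing the connector. The following is a minimal standard-library sketch, not part of the connector: the helper names are hypothetical, the bounds are copied from the warning text, and the parsing only handles plain numeric versions such as 13.0.0.

```python
from importlib import metadata
from typing import Optional, Tuple

# Bounds copied from the 3.4.0 warning text: pyarrow<10.1.0,>=10.0.1
LOWER_BOUND = (10, 0, 1)  # inclusive
UPPER_BOUND = (10, 1, 0)  # exclusive


def release_tuple(version: str) -> Tuple[int, ...]:
    """Turn '13.0.0' into (13, 0, 0). Assumes a plain numeric release string."""
    return tuple(int(part) for part in version.split(".")[:3])


def pyarrow_is_compatible(version: str) -> bool:
    """Return True if the version falls inside the range named in the warning."""
    return LOWER_BOUND <= release_tuple(version) < UPPER_BOUND


def installed_pyarrow_version() -> Optional[str]:
    """Return the installed pyarrow version, or None if pyarrow is not installed."""
    try:
        return metadata.version("pyarrow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyarrow_version()
    if found is None:
        print("pyarrow is not installed")
    else:
        print(f"pyarrow {found} compatible with 3.4.0: {pyarrow_is_compatible(found)}")
```

With the environment from this report (pyarrow 13.0.0), pyarrow_is_compatible returns False. For version strings with pre-release or local suffixes, packaging.specifiers.SpecifierSet is the more robust tool.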

sfc-gh-aling commented 10 months ago

Hi @matquant14, we have just released v3.5.0, which no longer pins pyarrow. Could you try installing the latest connector and let us know how it goes?

sfc-gh-achandrasekaran commented 10 months ago

Fixed with 3.5.0

matquant14 commented 10 months ago

Hi @sfc-gh-aling and @sfc-gh-achandrasekaran,

I updated to v3.5.0. I'm now getting a NotSupportedError when I run a SHOW object command with a cursor and fetch the result with either pandas or Arrow. It occurs in the fetch_pandas_all and fetch_arrow_all cursor methods. I believe the cause is that the query result format is JSON, while these methods require the format to be Arrow. If I just call fetchall and manually place the rows into a pandas and/or polars dataframe using from_records, it works.

import pandas as pd

# `connection` is an already open Snowflake connection

# fails
cur = connection.cursor()
cur.execute("SHOW TABLES")
df = cur.fetch_pandas_all()
# -> NotSupportedError: Unknown error

# works: fetch plain rows and build the DataFrame manually
cur = connection.cursor()
cur.execute("SHOW TABLES")
col_names = [val[0] for val in cur.description]
df = pd.DataFrame.from_records(cur.fetchall(), columns=col_names)

Any idea how I can fix this?
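
Building on the working snippet above, one possible stopgap is a small helper that tries fetch_pandas_all() first and falls back to fetchall() when it raises. This is only a sketch: fetch_frame and fetch_rows_with_columns are hypothetical names, not connector APIs, and it assumes the failed fetch_pandas_all() call leaves the result set readable by fetchall(), which has not been verified here. If that assumption does not hold, re-run the query before falling back.

```python
def fetch_rows_with_columns(cursor):
    """Return (column_names, rows) via plain fetchall(), as in the working snippet."""
    rows = cursor.fetchall()
    # The first element of each cursor.description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return columns, rows


def fetch_frame(cursor, fallback_on):
    """Try the Arrow-based fetch; on `fallback_on`, build the DataFrame by hand.

    `fallback_on` is the exception type (or tuple of types) that should
    trigger the fallback, for example the connector's NotSupportedError.
    Assumption: the failed call has not consumed the result set.
    """
    try:
        return cursor.fetch_pandas_all()
    except fallback_on:
        # Imported lazily so the fast path does not depend on this module's import.
        import pandas as pd

        columns, rows = fetch_rows_with_columns(cursor)
        return pd.DataFrame.from_records(rows, columns=columns)
```

It could be called as fetch_frame(cur, NotSupportedError) after cur.execute("SHOW TABLES"), with NotSupportedError imported from snowflake.connector.errors. Passing the exception type explicitly keeps the helper from swallowing unrelated errors.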