snowflakedb / snowpark-python

Snowflake Snowpark Python API
Apache License 2.0

SNOW-709591: functions.get should accept integers #636

Closed orellabac closed 1 year ago

orellabac commented 1 year ago
  1. What version of Python are you using?

    Python 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]

  2. What operating system and processor architecture are you using?

    Linux-5.4.0-1094-azure-x86_64-with-glibc2.17

  3. What are the component versions in the environment (`pip freeze`)?

    aiohttp==3.8.3 aiosignal==1.3.1 alembic==1.8.1 altair==4.2.0 anyio==3.6.2 apache-airflow-providers-common-sql==1.3.1 apache-airflow-providers-ftp==3.2.0 apache-airflow-providers-http==4.1.0 apache-airflow-providers-imap==3.1.0 apache-airflow-providers-sqlite==3.3.1 apispec==3.3.2 argcomplete==2.0.0 argon2-cffi==21.3.0 argon2-cffi-bindings==21.2.0 asn1crypto==1.5.1 astor==0.8.1 astroid==2.9.3 asttokens==2.1.0 async-generator==1.10 async-timeout==4.0.2 attrs==22.1.0 Babel==2.11.0 backcall==0.2.0 backports.zoneinfo==0.2.1 bds-testing @ file:///bds_testing-0.0.1-py3-none-any.whl beautifulsoup4==4.11.1 bleach==5.0.1 blinker==1.5 brotlipy==0.7.0 cachelib==0.9.0 cachetools==5.2.0 cattrs==22.2.0 certifi==2021.10.8 certipy==0.1.3 cffi @ file:///opt/conda/conda-bld/cffi_1642701102775/work charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work click==8.0.4 clickclick==20.10.2 cloudpickle==2.0.0 colorama==0.4.6 colorlog==4.8.0 commonmark==0.9.1 conda==4.11.0 conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1618262148928/work ConfigUpdater==3.1.1 connexion==2.14.1 coverage==6.5.0 cron-descriptor==1.2.32 croniter==1.3.8 cryptography @ file:///tmp/build/80754af9/cryptography_1639400846433/work debugpy==1.6.3 decorator==5.1.1 defusedxml==0.7.1 Deprecated==1.2.13 dill==0.3.1.1 dnspython==2.2.1 docutils==0.19 email-validator==1.3.0 entrypoints==0.4 et-xmlfile==1.1.0 eventlet==0.33.2 exceptiongroup==1.0.4 executing==1.2.0 fastjsonschema==2.16.2 filelock==3.8.0 findspark==2.0.1 Flask==2.2.2 Flask-AppBuilder==4.1.4 Flask-Babel==2.0.0 Flask-Caching==2.0.1 Flask-JWT-Extended==4.4.4 Flask-Login==0.6.2 Flask-SQLAlchemy==2.5.1 Flask-WTF==1.0.1 frozenlist==1.3.3 fuzzywuzzy==0.18.0 gevent==22.10.2 gitdb==4.0.10 GitPython==3.1.29 graphviz==0.20.1 greenlet==2.0.1 gunicorn==20.1.0 h11==0.14.0 h3==3.7.4 httpcore==0.16.2 httpx==0.23.1 idna @ 
file:///tmp/build/80754af9/idna_1637925883363/work importlib-metadata==5.1.0 importlib-resources==5.10.0 inflection==0.5.1 iniconfig==1.1.1 ipykernel==6.17.1 ipython==8.6.0 ipython-genutils==0.2.0 ipywidgets==8.0.2 isodate==0.6.1 isort==5.10.1 itsdangerous==2.1.2 jedi==0.18.2 Jinja2==3.1.2 jsonschema==4.17.1 jupyter==1.0.0 jupyter-console==6.4.4 jupyter-server-proxy==1.5.2 jupyter-telemetry==0.1.0 jupyter-vscode-proxy==0.1 jupyter_client==7.4.7 jupyter_core==5.0.0 jupyterhub==1.4.2 jupyterlab-pygments==0.2.2 jupyterlab-widgets==3.0.3 lazy-object-proxy==1.8.0 Levenshtein==0.20.8 linkify-it-py==2.0.0 lockfile==0.12.2 lxml==4.9.1 Mako==1.2.4 Markdown==3.4.1 markdown-it-py==2.1.0 MarkupSafe==2.1.1 marshmallow==3.19.0 marshmallow-enum==1.5.1 marshmallow-oneofschema==3.0.1 marshmallow-sqlalchemy==0.26.1 matplotlib-inline==0.1.6 mccabe==0.6.1 mdit-py-plugins==0.3.1 mdurl==0.1.2 metakernel==0.29.2 mistune==2.0.4 msal==1.20.0 multidict==6.0.2 nbclient==0.7.0 nbconvert==7.2.5 nbformat==5.7.0 nest-asyncio==1.5.6 networkx==2.8.8 notebook==6.4.1 numpy==1.23.5 oauthlib==3.2.2 openpyxl==3.0.10 oscrypto==1.3.0 packaging==21.3 pamela==1.0.0 pandas==1.5.2 pandocfilters==1.5.0 parso==0.8.3 pathspec==0.9.0 pendulum==2.1.2 pexpect==4.8.0 pickleshare==0.7.5 Pillow==9.3.0 pkgutil_resolve_name==1.3.10 platformdirs==2.5.4 plotly==5.11.0 pluggy==1.0.0 prison==0.2.1 prometheus-client==0.15.0 prompt-toolkit==3.0.33 protobuf==4.21.9 proxy.py==2.4.3 psutil==5.9.4 ptyprocess==0.7.0 pure-eval==0.2.2 py==1.11.0 py4j==0.10.9.5 pyarrow==8.0.0 pycobertura==3.0.0 pycosat==0.6.3 pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work pycryptodomex==3.15.0 pydeck==0.8.0 pydotplus==2.0.2 Pygments==2.13.0 PyJWT==2.6.0 pylint==2.12.2 Pympler==1.0.1 pyngrok==5.2.1 pyodbc==4.0.35 pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1635333100036/work pyparsing==3.0.9 pyrsistent==0.19.2 PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work pyspark==3.3.1 pytest==6.2.5 pytest-csv==3.0.0 
pytest-excel==1.5.0 python-daemon==2.3.2 python-dateutil==2.8.2 python-decouple==3.6 python-json-logger==2.0.4 python-Levenshtein==0.20.8 python-nvd3==0.15.0 python-slugify==7.0.0 pytz==2022.6 pytz-deprecation-shim==0.1.0.post0 pytzdata==2020.1 PyYAML==6.0 pyzmq==24.0.1 qtconsole==5.4.0 QtPy==2.3.0 rapidfuzz==2.13.3 rdflib==6.2.0 requests @ file:///opt/conda/conda-bld/requests_1641824580448/work requests-toolbelt==0.10.1 requirements-parser==0.5.0 rfc3986==1.5.0 rich==12.6.0 ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016699510/work ruamel.yaml==0.17.21 ruamel.yaml.clib==0.2.7 semver==2.13.0 Send2Trash==1.8.0 setproctitle==1.3.2 shortuuid==1.0.11 simpervisor==0.4 six @ file:///tmp/build/80754af9/six_1623709665295/work smmap==5.0.0 sniffio==1.3.0 snowconvert-deploy-tool==0.0.20 snowconvert-helpers==2.0.14 snowflake-cli-labs==0.1.8 snowflake-connector-python==2.8.2 snowflake-snowpark-python==1.0.0 snowflake-sqlalchemy==1.4.4 snowpark-extensions==0.0.6 soupsieve==2.3.2.post1 spylon==0.3.0 spylon-kernel==0.4.1 SQLAlchemy==1.4.44 SQLAlchemy-JSONField==1.0.0 SQLAlchemy-Utils==0.38.3 sqlparse==0.4.3 stack-data==0.6.1 streamlit==1.8.1 streamlit-ace==0.1.1 streamlit-aggrid==0.3.3 streamlit-agraph==0.0.42 streamlit-option-menu==0.3.2 swagger-ui-bundle==0.0.9 tabulate==0.9.0 tenacity==8.1.0 termcolor==2.1.1 terminado==0.17.0 text-unidecode==1.3 tinycss2==1.2.1 toml==0.10.2 toolz==0.12.0 tornado==6.2 tqdm @ file:///tmp/build/80754af9/tqdm_1635330843403/work traitlets==5.5.0 typer==0.7.0 types-setuptools==65.6.0.2 typing_extensions==4.4.0 tzdata==2022.6 tzlocal==4.2 uc-micro-py==1.0.1 unicodecsv==0.14.1 urllib3==1.26.7 validators==0.20.0 watchdog==2.1.9 wcwidth==0.2.5 webencodings==0.5.1 Werkzeug==2.2.2 widgetsnbextension==4.0.3 wrapt==1.13.3 WTForms==3.0.1 xlrd==2.0.1 yarl==1.8.1 zipp==3.11.0 zope.event==4.5.0 zope.interface==5.5.2

  4. What did you do?

    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import get, lit
    import snowpark_extensions

    session = Session.builder.from_snowsql().getOrCreate()

    df = session.createDataFrame([(["a", "b", "c"],), ([],)], ['data'])
    res = df.select(get(df.data, 0)).collect()
    print(res)

And I got:

    res=df.select(get(df.data, 0)).collect()
  File "/opt/conda/lib/python3.8/site-packages/snowflake/snowpark/functions.py", line 2693, in get
    c2 = _to_col_if_str(col2, "get")
  File "/opt/conda/lib/python3.8/site-packages/snowflake/snowpark/column.py", line 107, in _to_col_if_str
    raise TypeError(
TypeError: 'GET' expected Column or str, got: <class 'int'>
  5. What did you expect to see?

    [Row(GET("DATA", 0)='"a"'), Row(GET("DATA", 0)=None)]

sfc-gh-aalam commented 1 year ago

Try this:

    res = df.select(get(df.data, lit(0))).collect()

orellabac commented 1 year ago

We are aware of this workaround, but for users coming from Spark it requires changing their code. Could Snowpark's get do this automatically when the index is an integer literal?
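
Until that is supported, a user-side shim could do the wrapping. The following is only a minimal sketch of the idea, not Snowpark's implementation: `make_int_friendly` is a hypothetical helper name, and the `get`/`lit` functions are passed in as arguments so the sketch runs without a Snowflake connection.

```python
from functools import wraps
from typing import Any, Callable


def make_int_friendly(get_fn: Callable[[Any, Any], Any],
                      lit_fn: Callable[[Any], Any]) -> Callable[[Any, Any], Any]:
    """Return a version of get_fn that accepts a plain int as its second argument.

    Hypothetical helper: get_fn and lit_fn are expected to behave like
    snowflake.snowpark.functions.get and snowflake.snowpark.functions.lit.
    """

    @wraps(get_fn)
    def wrapper(col: Any, key: Any) -> Any:
        # bool is a subclass of int in Python, so exclude it explicitly and
        # pass it through unchanged along with every other non-int value.
        if isinstance(key, int) and not isinstance(key, bool):
            key = lit_fn(key)  # wrap the integer index as a literal
        return get_fn(col, key)

    return wrapper


if __name__ == "__main__":
    # Stand-in functions (assumptions for illustration only) that record
    # what they were called with instead of building Snowpark columns.
    def fake_lit(value):
        return ("lit", value)

    def fake_get(col, key):
        return ("get", col, key)

    get = make_int_friendly(fake_get, fake_lit)
    print(get("data", 0))      # the int is wrapped before reaching fake_get
    print(get("data", "idx"))  # a string passes through unchanged
```

With Snowpark installed, the same helper would presumably be applied as `get = make_int_friendly(functions.get, functions.lit)` after `from snowflake.snowpark import functions`, so that existing Spark-style calls such as `get(df.data, 0)` need no changes.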

sfc-gh-aalam commented 1 year ago

Hey @orellabac, I'm working on adding this support. Shouldn't be too difficult. Hopefully it is present in the next release :)