snowflakedb / snowpark-python

Snowflake Snowpark Python API
Apache License 2.0

SNOW-647070: Installing Pandas and Snowflake separately #443

Closed kamipatel closed 2 years ago

kamipatel commented 2 years ago

I installed snowpark and pandas separately. This code works on macOS. However, when I run the code on Amazon Linux 2, it fails.

    import os
    from tokenize import String
    import pandas as pd
    import json
    from io import StringIO
    from datetime import datetime
    from botocore.exceptions import ClientError
    from snowflake.snowpark import Session
    from snowflake.snowpark.types import StructType, StructField, StringType, IntegerType
    from snowflake.connector.pandas_tools import write_pandas
    from snowflake.snowpark.functions import when_matched, when_not_matched
    from snowflake.snowpark.functions import col, lit

    connection_parameters = { }

    def write_to_snowflake(cdf):
        session = Session.builder.configs(connection_parameters).create()
        df = session.create_dataframe(cdf)
        df.write.mode("overwrite").save_as_table("cdp_staging", table_type="temporary")
        print("after stage")
        session.close()

    def stage():
        data = {
            "calories": [420, 380, 390],
            "duration": [50, 40, 45]
        }
        df = pd.DataFrame(data)
        write_to_snowflake(df)

    stage()

At run time I get an error "create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame."

Does it need a specific version of pandas? Thanks!

What version of Python are you using? 3.8

What operating system and processor architecture are you using?

Linux-4.14.287-215.504.amzn2.x86_64-x86_64-with-glibc2.2.5

What are the component versions in the environment (pip freeze)?

    asn1crypto==1.5.1
    boto3==1.24.53
    botocore==1.27.53
    certifi==2022.6.15
    cffi==1.15.1
    charset-normalizer==2.1.0
    cloudpickle==2.0.0
    cryptography==36.0.2
    idna==3.3
    jmespath==1.0.1
    numpy==1.23.2
    oscrypto==1.3.0
    pandas==1.4.3
    pycparser==2.21
    pycryptodomex==3.15.0
    PyJWT==2.4.0
    pyOpenSSL==22.0.0
    python-dateutil==2.8.2
    pytz==2022.2.1
    requests==2.28.1
    s3transfer==0.6.0
    six==1.16.0
    snowflake-connector-python==2.7.11
    snowflake-snowpark-python==0.8.0
    typing_extensions==4.3.0
    urllib3==1.26.11

What did you do?

Open your AWS Cloud9 Amazon EC2 environment.

Install Python 3.8 and pip3 by running the following commands:

    $ sudo amazon-linux-extras install python3.8
    $ curl -O https://bootstrap.pypa.io/get-pip.py
    $ python3.8 get-pip.py --user

Create a python folder by running the following commands:

    python3.8 -m pip install snowflake-snowpark-python -t python/ --upgrade
    python3.8 -m pip install pandas -t python/ --upgrade

What did you expect to see? I expected pandas and Snowpark to work on Amazon Linux 2, as I need to run this code in AWS Lambda.

Can you set logging to DEBUG and collect the logs?

Response

    {
      "errorMessage": "create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame.",
      "errorType": "TypeError",
      "stackTrace": [
        " File \"/var/lang/lib/python3.8/imp.py\", line 234, in load_module\n return load_source(name, filename, file)\n",
        " File \"/var/lang/lib/python3.8/imp.py\", line 171, in load_source\n module = _load(spec)\n",
        " File \"\", line 702, in _load\n",
        " File \"\", line 671, in _load_unlocked\n",
        " File \"\", line 843, in exec_module\n",
        " File \"\", line 219, in _call_with_frames_removed\n",
        " File \"/var/task/lambda_function.py\", line 81, in \n stage()\n",
        " File \"/var/task/lambda_function.py\", line 62, in stage\n write_to_snowflake(df)\n",
        " File \"/var/task/lambda_function.py\", line 45, in write_to_snowflake\n df=session.create_dataframe(cdf)\n",
        " File \"/opt/python/snowflake/snowpark/session.py\", line 1133, in create_dataframe\n raise TypeError(\n"
      ]
    }

kamipatel commented 2 years ago

I found the issue: pyarrow needs to be installed separately. All good, thanks.
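For anyone hitting the same error: the pip freeze above lists pandas but not pyarrow, which fits the fix. Following the same pattern as the install steps in the report, the missing package could presumably be added with `python3.8 -m pip install pyarrow -t python/ --upgrade` (not verified here against this exact Lambda layer). A minimal sketch of a preflight check that could run at the top of the Lambda module is shown below. The helper names `missing_modules` and `check_pandas_support` are hypothetical, and the assumption that `pandas` and `pyarrow` are the only modules needed for `create_dataframe()` to accept a pandas DataFrame is based solely on this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption (from this thread only): both modules must be importable for
# Snowpark to recognize a pandas DataFrame passed to create_dataframe().
REQUIRED_FOR_PANDAS = ("pandas", "pyarrow")


def check_pandas_support(required=REQUIRED_FOR_PANDAS):
    """Raise a descriptive ImportError if any required module is missing."""
    missing = missing_modules(required)
    if missing:
        raise ImportError(
            "Missing modules needed for pandas support: " + ", ".join(missing)
        )
```

Calling `check_pandas_support()` before `session.create_dataframe(cdf)` would turn the generic `TypeError` into a message that names the missing package, which makes a packaging gap in the Lambda layer easier to spot.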