I installed Snowpark and pandas separately. This code works on macOS; however, when I run it on Amazon Linux 2 it fails.
import os
from tokenize import String
import pandas as pd
import json
from io import StringIO
from datetime import datetime
from botocore.exceptions import ClientError
from snowflake.snowpark import Session
from snowflake.snowpark.types import StructType, StructField, StringType, IntegerType
from snowflake.connector.pandas_tools import write_pandas
from snowflake.snowpark.functions import when_matched, when_not_matched
from snowflake.snowpark.functions import col, lit
What are the component versions in the environment (pip freeze)?
asn1crypto==1.5.1
boto3==1.24.53
botocore==1.27.53
certifi==2022.6.15
cffi==1.15.1
charset-normalizer==2.1.0
cloudpickle==2.0.0
cryptography==36.0.2
idna==3.3
jmespath==1.0.1
numpy==1.23.2
oscrypto==1.3.0
pandas==1.4.3
pycparser==2.21
pycryptodomex==3.15.0
PyJWT==2.4.0
pyOpenSSL==22.0.0
python-dateutil==2.8.2
pytz==2022.2.1
requests==2.28.1
s3transfer==0.6.0
six==1.16.0
snowflake-connector-python==2.7.11
snowflake-snowpark-python==0.8.0
typing_extensions==4.3.0
urllib3==1.26.11
What did you do?
Open your AWS Cloud9 Amazon EC2 environment.
Install Python 3.8 and pip3 by running the following commands:
$ sudo amazon-linux-extras install python3.8
$ curl -O https://bootstrap.pypa.io/get-pip.py
$ python3.8 get-pip.py --user
Create a python folder by running the following commands:
python3.8 -m pip install snowflake-snowpark-python -t python/ --upgrade
python3.8 -m pip install pandas -t python/ --upgrade
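After these installs, the python/ folder is typically zipped and uploaded as a Lambda layer. A minimal sketch of that packaging step (the paths and function name are assumptions, not from the original report):

```python
import os
import zipfile

def zip_layer(src_dir: str, zip_path: str) -> None:
    """Zip the python/ folder so it can be uploaded as a Lambda layer.

    Lambda layers expect packages under a top-level 'python/' directory,
    which is why the pip installs above used '-t python/'.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent of src_dir so every
                # archive entry starts with 'python/'.
                arcname = os.path.relpath(full, os.path.dirname(src_dir) or ".")
                zf.write(full, arcname)

# zip_layer("python", "snowpark_layer.zip")
```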
What did you expect to see?
Pandas and Snowpark to work on Amazon Linux 2, as I need to run this code in AWS Lambda.
Can you set logging to DEBUG and collect the logs?
Response
{
"errorMessage": "create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame.",
"errorType": "TypeError",
"stackTrace": [
" File \"/var/lang/lib/python3.8/imp.py\", line 234, in load_module\n return load_source(name, filename, file)\n",
" File \"/var/lang/lib/python3.8/imp.py\", line 171, in load_source\n module = _load(spec)\n",
" File \"\", line 702, in _load\n",
" File \"\", line 671, in _load_unlocked\n",
" File \"\", line 843, in exec_module\n",
" File \"\", line 219, in _call_with_frames_removed\n",
" File \"/var/task/lambda_function.py\", line 81, in \n stage()\n",
" File \"/var/task/lambda_function.py\", line 62, in stage\n write_to_snowflake(df)\n",
" File \"/var/task/lambda_function.py\", line 45, in write_to_snowflake\n df=session.create_dataframe(cdf)\n",
" File \"/opt/python/snowflake/snowpark/session.py\", line 1133, in create_dataframe\n raise TypeError(\n"
]
}
connection_parameters = { }

def write_to_snowflake(cdf):
    session = Session.builder.configs(connection_parameters).create()
    df = session.create_dataframe(cdf)
    df.write.mode("overwrite").save_as_table("cdp_staging", table_type="temporary")
    print("after stage")
    session.close()

def stage():
    data = {
        "calories": [420, 380, 390],
        "duration": [50, 40, 45]
    }
    df = pd.DataFrame(data)
    write_to_snowflake(df)

stage()
At runtime I get the error "create_dataframe() function only accepts data as a list, tuple or a pandas DataFrame."
Does it need a specific version of pandas? Thanks!
What version of Python are you using? 3.8
What operating system and processor architecture are you using?
Linux-4.14.287-215.504.amzn2.x86_64-x86_64-with-glibc2.2.5
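A possible cause worth checking (an assumption, not confirmed by the report): if two copies of pandas end up importable in the Lambda environment (one installed into python/ and one loaded from elsewhere), a DataFrame built from one copy fails isinstance checks against the classes of the other, which would produce exactly this TypeError even for a real DataFrame. A stdlib-only sketch of the effect, using json.decoder as a stand-in module:

```python
import importlib.util
import json.decoder

# Load the stdlib module json.decoder a *second* time from the same file,
# simulating two copies of the same package on the search path.
spec = importlib.util.find_spec("json.decoder")
mod2 = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod2)

dec = json.decoder.JSONDecoder()
# Same class name, same source file -- but a different class object:
print(isinstance(dec, json.decoder.JSONDecoder))  # True
print(isinstance(dec, mod2.JSONDecoder))          # False
```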