snowflakedb / snowflake-connector-python

Snowflake Connector for Python
https://pypi.python.org/pypi/snowflake-connector-python/
Apache License 2.0
601 stars 473 forks source link

SNOW-1654256: `Not all arguments converted during string formatting` sqlalchemy + snowflake ❄️ #2047

Closed doma-rydevera3 closed 4 weeks ago

doma-rydevera3 commented 2 months ago

Python version

3.11.9

Operating system and processor architecture

Linux-6.5.0-1025-azure-x86_64-with-glibc2.36

Installed packages

aiobotocore==2.11.2
aiohttp==3.9.3
aioitertools==0.11.0
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.2.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asn1crypto==1.5.1
asttokens==2.4.1
async-lru==2.0.4
atomicwrites==1.4.1
attrs==23.2.0
azure-core==1.30.0
azure-identity==1.15.0
azure-storage-blob==12.19.0
Babel==2.14.0
backend==0.2.4.1
bcrypt==4.1.2
beautifulsoup4==4.12.3
black==24.1.1
bleach==6.1.0
blinker==1.7.0
botocore==1.34.34
category-encoders==2.6.3
cattrs==23.2.3
certifi==2024.2.2
cffi==1.16.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==3.0.0
comm==0.2.1
contourpy==1.2.0
cryptography==41.0.7
cycler==0.12.1
Cython==3.0.8
debugpy==1.8.1
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.8
editdistance==0.6.2
entrypoints==0.4
et-xmlfile==1.1.0
executing==2.0.1
fastjsonschema==2.19.1
filelock==3.13.1
Flask==3.0.2
Flask-HTTPAuth==4.8.0
fonttools==4.47.2
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.2.0
future==0.18.3
fuzzywuzzy==0.18.0
greenlet==3.0.3
h11==0.14.0
h2o==3.44.0.3
haversine==2.8.1
httpcore==1.0.2
httpx==0.26.0
hyperopt==0.2.7
idna==3.6
importlib_metadata==8.4.0
iniconfig==2.0.0
ipykernel==6.29.2
ipython==8.21.0
ipywidgets==8.1.2
isodate==0.6.1
isoduration==20.11.0
itsdangerous==2.1.2
jaraco.classes==3.4.0
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.3
jmespath==1.0.1
joblib==1.3.2
json5==0.9.14
jsonpickle==3.0.2
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter-events==0.9.0
jupyter-lsp==2.2.2
jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.1.0
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
jupyterlab_widgets==3.0.10
keyring==24.3.1
kiwisolver==1.4.5
Levenshtein==0.25.0
lightgbm==3.2.1
llvmlite==0.42.0
lxml==5.1.0
MarkupSafe==2.1.5
matplotlib==3.8.2
matplotlib-inline==0.1.6
mistune==3.0.2
more-itertools==10.2.0
msal==1.26.0
msal-extensions==1.1.0
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nbclient==0.9.0
nbconvert==7.16.0
nbformat==5.9.2
nest-asyncio==1.6.0
networkx==3.2.1
notebook_shim==0.2.3
numba==0.59.0
numpy==1.26.3
openpyxl==3.1.5
overrides==7.7.0
packaging==23.2
pandas==2.1.4
pandocfilters==1.5.1
papermill==2.5.0
paramiko==3.4.0
parso==0.8.3
passlib==1.7.4
pathlib_mate==1.3.2
pathos==0.3.2
pathspec==0.12.1
patsy==0.5.6
pdfminer.six==20170720
pexpect==4.9.0
pillow==10.2.0
platformdirs==3.11.0
plotly==5.18.0
pluggy==1.4.0
portalocker==2.8.2
pox==0.3.4
ppft==1.7.6.8
prettytable==3.9.0
prometheus-client==0.19.0
prompt-toolkit==3.0.43
psutil==5.9.8
psycopg2-binary==2.9.9
ptyprocess==0.7.0
pure-eval==0.2.2
py4j==0.10.9.7
pyarrow==17.0.0
pycorenlp==0.3.0
pycparser==2.21
pycryptodome==3.20.0
pydantic==2.6.1
pydantic_core==2.16.2
Pygments==2.17.2
PyJWT==2.8.0
pymssql==2.2.11
PyNaCl==1.5.0
pyodbc==5.1.0
pyOpenSSL==23.3.0
pyparsing==3.1.1
pytest==8.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-Levenshtein==0.25.0
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
rapidfuzz==3.6.1
referencing==0.33.0
requests==2.31.0
requests-cache==1.1.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.17.1
s3fs==2024.2.0
scikit-learn==1.2.2
scipy==1.12.0
seaborn==0.13.2
SecretStorage==3.3.3
Send2Trash==1.8.2
shap==0.44.1
six==1.16.0
slicer==0.0.7
sniffio==1.3.0
snowflake-connector-python==3.7.0
snowflake-sqlalchemy==1.5.1
sortedcontainers==2.4.0
soupsieve==2.5
SQLAlchemy==1.4.51
sqlalchemy-mate==1.4.28.4
sshtunnel==0.4.0
stack-data==0.6.3
statsmodels==0.14.1
tabulate==0.9.0
tenacity==8.2.3
terminado==0.18.0
threadpoolctl==3.2.0
tinycss2==1.2.1
tomlkit==0.12.3
tornado==6.4
tqdm==4.66.1
traitlets==5.14.1
types-python-dateutil==2.8.19.20240106
typing_extensions==4.9.0
tzdata==2023.4
ujson==5.9.0
unicodecsv==0.14.1
uri-template==1.3.0
url-normalize==1.4.3
urllib3==2.0.7
uszipcode==1.0.1
Wand==0.6.13
wcwidth==0.2.13
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
Werkzeug==3.0.1
wget==3.2
widgetsnbextension==4.0.10
wrapt==1.16.0
yarl==1.9.4
zipp==3.20.1

What did you do?

Hey all,

I am using `pandas` to write data to a snowflake database. As far as I can tell my code is set up correctly, however I keep getting an error:

DatabaseError: Execution failed on sql '
        SELECT
            name
        FROM
            sqlite_master
        WHERE
            type IN ('table', 'view')
            AND name=?;
        ': not all arguments converted during string formatting

when I run my code.

Here is my code setup.

from sqlalchemy import create_engine
import pandas as pd

data = ...

snowflake_sql_engine_string = "snowflake://user:password@account/database/schema"
engine = create_engine(snowflake_sql_engine_string)
with engine.connect() as conn:
    data.to_sql(
        name='table',
        schema='schema',
        con=conn.connection,
        if_exists='replace',
        index=None,
    )
pandas version = 2.1.4
sqlalchemy version = 1.4.51

### What did you expect to see?

I was hoping to see no errors and the data load into snowflake.

### Can you set logging to DEBUG and collect the logs?

```bash
TypeError                                 Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:2262, in SQLiteDatabase.execute(self, sql, params)
   2261 try:
-> 2262     cur.execute(sql, *args)
   2263     return cur

File /usr/local/lib/python3.11/site-packages/snowflake/connector/cursor.py:1000, in SnowflakeCursor.execute(self, command, params, _bind_stage, timeout, _exec_async, _no_retry, _do_reset, _put_callback, _put_azure_callback, _put_callback_output_stream, _get_callback, _get_azure_callback, _get_callback_output_stream, _show_progress_bar, _statement_params, _is_internal, _describe_only, _no_results, _is_put_get, _raise_put_get_error, _force_put_overwrite, _skip_upload_on_content_match, file_stream, num_statements)
    999 if self._connection.is_pyformat:
-> 1000     query = self._preprocess_pyformat_query(command, params)
   1001 else:
   1002     # qmark and numeric paramstyle

File /usr/local/lib/python3.11/site-packages/snowflake/connector/cursor.py:832, in SnowflakeCursor._preprocess_pyformat_query(self, command, params)
    825 if (
    826     self.connection._interpolate_empty_sequences
    827     and processed_params is not None
   (...)
    830     and len(processed_params) > 0
    831 ):
--> 832     query = command % processed_params
    833 else:

TypeError: not all arguments converted during string formatting

The above exception was the direct cause of the following exception:

DatabaseError                             Traceback (most recent call last)
Cell In[16], line 6
      4 engine = create_engine(snowflake_sql_engine_string)
      5 with engine.connect() as conn:
----> 6     cpi_df.to_sql(
      7         name='cpi_data',
      8         schema='cpi_data',
      9         con=conn.connection,
     10         if_exists='replace',
     11         index=None,
     12     )

File /usr/local/lib/python3.11/site-packages/pandas/util/_decorators.py:333, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    327 if len(args) > num_allow_args:
    328     warnings.warn(
    329         msg.format(arguments=_format_argument_list(allow_args)),
    330         FutureWarning,
    331         stacklevel=find_stack_level(),
    332     )
--> 333 return func(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/pandas/core/generic.py:3008, in NDFrame.to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
   2813 """
   2814 Write records stored in a DataFrame to a SQL database.
   2815 
   (...)
   3004 [(1,), (None,), (2,)]
   3005 """  # noqa: E501
   3006 from pandas.io import sql
-> 3008 return sql.to_sql(
   3009     self,
   3010     name,
   3011     con,
   3012     schema=schema,
   3013     if_exists=if_exists,
   3014     index=index,
   3015     index_label=index_label,
   3016     chunksize=chunksize,
   3017     dtype=dtype,
   3018     method=method,
   3019 )

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:788, in to_sql(frame, name, con, schema, if_exists, index, index_label, chunksize, dtype, method, engine, **engine_kwargs)
    783     raise NotImplementedError(
    784         "'frame' argument should be either a Series or a DataFrame"
    785     )
    787 with pandasSQL_builder(con, schema=schema, need_transaction=True) as pandas_sql:
--> 788     return pandas_sql.to_sql(
    789         frame,
    790         name,
    791         if_exists=if_exists,
    792         index=index,
    793         index_label=index_label,
    794         schema=schema,
    795         chunksize=chunksize,
    796         dtype=dtype,
    797         method=method,
    798         engine=engine,
    799         **engine_kwargs,
    800     )

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:2438, in SQLiteDatabase.to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype, method, engine, **engine_kwargs)
   2427             raise ValueError(f"{col} ({my_type}) not a string")
   2429 table = SQLiteTable(
   2430     name,
   2431     self,
   (...)
   2436     dtype=dtype,
   2437 )
-> 2438 table.create()
   2439 return table.insert(chunksize, method)

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:925, in SQLTable.create(self)
    924 def create(self) -> None:
--> 925     if self.exists():
    926         if self.if_exists == "fail":
    927             raise ValueError(f"Table '{self.name}' already exists.")

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:911, in SQLTable.exists(self)
    910 def exists(self):
--> 911     return self.pd_sql.has_table(self.name, self.schema)

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:2453, in SQLiteDatabase.has_table(self, name, schema)
   2442 wld = "?"
   2443 query = f"""
   2444 SELECT
   2445     name
   (...)
   2450     AND name={wld};
   2451 """
-> 2453 return len(self.execute(query, [name]).fetchall()) > 0

File /usr/local/lib/python3.11/site-packages/pandas/io/sql.py:2274, in SQLiteDatabase.execute(self, sql, params)
   2271     raise ex from inner_exc
   2273 ex = DatabaseError(f"Execution failed on sql '{sql}': {exc}")
-> 2274 raise ex from exc

DatabaseError: Execution failed on sql '
        SELECT
            name
        FROM
            sqlite_master
        WHERE
            type IN ('table', 'view')
            AND name=?;
        ': not all arguments converted during string formatting
sfc-gh-dszmolka commented 2 months ago

hi , thanks for raising this issue. Could you please try setting up the connection + running the query as described in https://docs.snowflake.com/en/developer-guide/python-connector/sqlalchemy and see if it helps you ?

doma-rydevera3 commented 2 months ago

hi , thanks for raising this issue. Could you please try setting up the connection + running the query as described in https://docs.snowflake.com/en/developer-guide/python-connector/sqlalchemy and see if it helps you ?

Hey @sfc-gh-dszmolka I was able to get the copy method to work but I still cannot get the pandas.to_sql method to work. I tried using the string as above and the URL. I can try some additional triaging in the coming days.

sfc-gh-dszmolka commented 2 months ago

hi @doma-rydevera3 glad to hear you made copy work - do you need any further help here ? i see the SQL error message quoted in the initial issue submission references an unbound variable, at least i don't see where you bind the value for ?

let us know please if you need further help, and if possible please specify the exact repro you're having problems with. thank you in advance!

sfc-gh-dszmolka commented 4 weeks ago

it's been a month now - i'm going to close this issue but if any further assistance is required, do comment please and i can reopen.