aws / amazon-redshift-python-driver

Redshift Python Connector. It supports Python Database API Specification v2.0.
Apache License 2.0
203 stars 75 forks source link

Problem running statement with bind params #194

Closed Riqardos closed 7 months ago

Riqardos commented 11 months ago

Driver version

Redshift version

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.58814\x00

Client Operating System

MacOS 14.1 (23B74)

Python version

3.8

Table schema

Whatever statement with bind_params

Problem description

Have problem running SQL statement with bind_params according this docs

E.g. this statement

conn = redshift_connector.connect(...)
cursor = conn.cursor()
cursor.execute('CREATE SCHEMA IF NOT EXISTS %s;',("example")) 

results in

KeyError                                  Traceback (most recent call last)
File [~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/core.py:1767](https://file+.vscode-resource.vscode-cdn.netriqardos/etljobs/redshift2redshift_job/~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/core.py:1767), in Connection.execute(self, cursor, operation, vals)
   1766 try:
-> 1767     ps = cache["ps"][key]
   1768     _logger.debug("Using cached prepared statement")

KeyError: ('CREATE SCHEMA IF NOT EXISTS %s;', ((<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>)))

During handling of the above exception, another exception occurred:
...
ProgrammingError: {'S': 'ERROR', 'C': '42601', 'M': 'syntax error at or near "$1"', 'P': '29', 'F': '/home/ec2-user/padb/src/pg/src/backend/parser/parser_scan.l', 'L': '840', 'R': 'yyerror'}

but running it like this works

conn = redshift_connector.connect(...)
conn.autocommit = True
cursor = conn.cursor()
cursor.execute(f'CREATE SCHEMA IF NOT EXISTS {"example"};') 

Python Driver trace logs

KeyError                                  Traceback (most recent call last)
File [~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/core.py:1767](https://file+.vscode-resource.vscode-cdn.net/Users/riqardos/etl/jobs/redshift2redshift_job/~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/core.py:1767), in Connection.execute(self, cursor, operation, vals)
   1766 try:
-> 1767     ps = cache["ps"][key]
   1768     _logger.debug("Using cached prepared statement")

KeyError: ('CREATE SCHEMA IF NOT EXISTS %s;', ((<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>), (<RedshiftOID.UNKNOWN: 705>, 0, <function text_out at 0x12f0e4940>)))

During handling of the above exception, another exception occurred:

ProgrammingError                          Traceback (most recent call last)
[/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb](https://file+.vscode-resource.vscode-cdn.net/Users/riqardos/etl//jobs/redshift2redshift_job/main.ipynb) Cell 2 line 8
     [81](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=80) logging.basicConfig(level=logging.INFO)
     [83](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=82) redshift_etl = RedshiftETL()
---> [85](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=84) redshift_etl.cursor.execute(f'CREATE SCHEMA IF NOT EXISTS %s;',("example"))
     [87](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=86) # conn = redshift_connector.connect(...)
     [88](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=87) # cursor = conn.cursor()
     [89](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=88) # cursor.execute('CREATE SCHEMA IF NOT EXISTS %s;',("example")) 
   (...)
     [99](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=98) #     append=append
    [100](vscode-notebook-cell:/Users/riqardos/etl/jobs/redshift2redshift_job/main.ipynb#W1sZmlsZQ%3D%3D?line=99) # )

File [~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/cursor.py:248](https://file+.vscode-resource.vscode-cdn.net/Users/riqardos/etl/jobs/redshift2redshift_job/~/.local/share/virtualenvs/etl_jobs-NiSdy2yR/lib/python3.8/site-packages/redshift_connector/cursor.py:248), in Cursor.execute(self, operation, args, stream, merge_socket_read)
    246     except:
...
   2137     self.message_types[code](self._read(data_len - 4), cursor)
   2139 if self.error is not None:
-> 2140     raise self.error

ProgrammingError: {'S': 'ERROR', 'C': '42601', 'M': 'syntax error at or near "$1"', 'P': '29', 'F': '/home/ec2-user/padb/src/pg/src/backend/parser/parser_scan.l', 'L': '840', 'R': 'yyerror'}

Reproduction code

import redshift_connector
conn = redshift_connector.connect(...)
conn.autocommit = True
cursor = conn.cursor()
cursor.execute('CREATE SCHEMA IF NOT EXISTS %s;',("example")) 
Brooke-white commented 11 months ago

Hi @Riqardos,

thanks for reaching out. we have some documentation improvements coming in our next release that address this. there's a slight modification needed to your code

please change cursor.execute('CREATE SCHEMA IF NOT EXISTS %s;',("example")) to this cursor.execute('CREATE SCHEMA IF NOT EXISTS %s;',("example",))

note the trailing comma -- the second argument passed to execute needs to be either a tuple/list or dictionary based on which paramstyle is used.

Riqardos commented 11 months ago

Hi @Brooke-white

thanks for reply, I tried to add the trailing comma as you mentioned, but still getting the same issue :/

I have installed redshift-connector 2.0.915

Brooke-white commented 11 months ago

Hey @Riqardos , I am able to reproduce this on my end. The root cause is that Redshift server does not support use of bind parameters for CREATE SCHEMA statements.

The best suggestion I can offer at this time is to use the approach you found to work above as well as ensuring the variable storing the name of the schema comes from a trusted source/has been sanitized, as this approach is vulnerable to SQL injection. Please refer to best practices for avoiding SQL injection for more specific recommendations.

In parallel, I will work to open a feature request for bind parameter support for CREATE SCHEMA with the Redshift server team and provide tracking information for that request in this GitHub issue.

Brooke-white commented 11 months ago

reference: RedshiftDP-56518

Brooke-white commented 7 months ago

Hi folks, to request prioritization for this issue I recommend to reach out to AWS Support referencing RedshiftDP-56518