Snowflake-Labs / schemachange

A Database Change Management tool for Snowflake
Apache License 2.0

Invisible UTF-8 BOM char (`\ufeff`) at beginning of script causing error when communicating with Snowflake #250

Open dp-rp opened 2 months ago

dp-rp commented 2 months ago

Describe the bug

When running a schemachange change script saved with UTF-8 with BOM encoding, the BOM character at the start of the file causes a SQL compilation error: syntax error line 1 at position 0 unexpected '\ufeff-'

To Reproduce

Steps to reproduce the behavior:

  1. Save a change script with UTF-8 with BOM encoding (e.g. you can set the encoding in Visual Studio Code and save the file to add the invisible char)
  2. Try running the script
  3. See error
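The steps above can also be reproduced without an editor. The following is a minimal sketch, assuming Python's standard `utf-8-sig` codec is an acceptable stand-in for Visual Studio Code's "UTF-8 with BOM" option; the file name and the helper `write_script_with_bom` are hypothetical. It shows that a script read back as plain UTF-8 starts with the same two characters reported in the error message.

```python
import tempfile
from pathlib import Path


def write_script_with_bom(directory: Path) -> Path:
    """Write a small SQL script encoded as UTF-8 with BOM (hypothetical helper)."""
    path = directory / "V1.0.0__bom_demo.sql"  # illustrative file name
    # The "utf-8-sig" codec prepends the bytes EF BB BF when encoding.
    path.write_text("-- comment\nSELECT 1;\n", encoding="utf-8-sig")
    return path


with tempfile.TemporaryDirectory() as tmp:
    script = write_script_with_bom(Path(tmp))
    raw = script.read_bytes()
    # Decoding as plain "utf-8" keeps the BOM as the character U+FEFF.
    text = script.read_text(encoding="utf-8")

has_bom_bytes = raw.startswith(b"\xef\xbb\xbf")
first_two = text[:2]
print(has_bom_bytes, repr(first_two))  # True '\ufeff-'
```

If the text is passed to Snowflake in this form, the first token of the statement begins with U+FEFF, which is consistent with the `unexpected '\ufeff-'` message in the report.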

Expected behavior

Schemachange should ignore the zero-width no-break space character (the BOM) during SQL compilation.

Alternatively, if UTF-8 without BOM is a strict requirement, schemachange should raise an error whose message explicitly states that only UTF-8 without BOM is supported.
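Both options described above could look roughly like the sketch below. This is not schemachange's actual code: the function `read_sql_script` and its `strict` parameter are hypothetical names used for illustration only. The sketch relies on Python's `utf-8-sig` codec, which drops a single leading BOM when decoding and otherwise behaves like `utf-8`.

```python
import codecs
import tempfile
from pathlib import Path


def read_sql_script(path, strict=False):
    """Read a SQL script, tolerating or rejecting a leading UTF-8 BOM.

    Hypothetical helper, not part of schemachange.
    """
    raw = Path(path).read_bytes()
    if strict and raw.startswith(codecs.BOM_UTF8):
        # Option 2: fail early with an explicit message.
        raise ValueError(
            f"{path}: UTF-8 with BOM is not supported; "
            "save the file as UTF-8 without BOM"
        )
    # Option 1: "utf-8-sig" silently removes one leading BOM if present.
    return raw.decode("utf-8-sig")


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "V1.0.0__bom_demo.sql"  # illustrative file name
    script.write_text("-- demo\nSELECT 1;\n", encoding="utf-8-sig")

    cleaned = read_sql_script(script)
    try:
        read_sql_script(script, strict=True)
        strict_error = None
    except ValueError as exc:
        strict_error = str(exc)

print(repr(cleaned[:2]))
print(strict_error is not None)
```

Reading bytes first and decoding once keeps the BOM check and the decoding in one place, so files without a BOM are handled exactly as before.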

Schemachange (please complete the following information):

Additional context

A provisioning application runs in our pipeline, calls schemachange, and passes through its stdout/stderr. Here are the logs from our pipeline (with potentially sensitive information redacted):

2024-04-29T05:34:27.6745790Z       SchemaChange command and arguments: schemachange  -f D:\a\1\a/#REDACTED# -a #REDACTED# -u #REDACTED# -r #REDACTED# -w #REDACTED# -d #REDACTED# -c #REDACTED#.#REDACTED#.CHANGE_HISTORY --config-folder D:\a\1\a --create-change-history-table
2024-04-29T05:34:27.6765806Z       Checking env vars used by SchemaChange...
2024-04-29T05:34:27.6791928Z       [ WARNING! ]: Env var '#REDACTED#' hasn't been set - this may cause templating issues!
2024-04-29T05:34:27.6809225Z       Starting SchemaChange process...
2024-04-29T05:34:34.6945752Z       schemachange version: 3.6.1
2024-04-29T05:34:34.6954833Z Using config file: D:\a\1\a\schemachange-config.yml
2024-04-29T05:34:34.6965562Z Using root folder D:\a\1\a\#REDACTED#
2024-04-29T05:34:34.6974900Z Using variables:
2024-04-29T05:34:34.6984616Z   #REDACTED#
2024-04-29T05:34:34.7122057Z 
2024-04-29T05:34:34.7131787Z Using Snowflake account #REDACTED#
2024-04-29T05:34:34.7140633Z Using default role #REDACTED#
2024-04-29T05:34:34.7150643Z Using default warehouse #REDACTED#
2024-04-29T05:34:34.7160010Z Using default database #REDACTED#schema None
2024-04-29T05:34:34.7175506Z Using change history table #REDACTED#.#REDACTED#.CHANGE_HISTORY (last altered 2024-04-11 23:03:45.095000-07:00)
2024-04-29T05:34:34.7184072Z Max applied change script version: 2.0.5
2024-04-29T05:34:34.7193314Z Applying change script V2.0.6__fix_item_views_to_latest.sql
2024-04-29T05:34:34.7198819Z 
2024-04-29T05:34:34.7207798Z fail: #REDACTED#[0]
2024-04-29T05:34:34.7217115Z       Traceback (most recent call last):
2024-04-29T05:34:34.7226246Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\runpy.py", line 197, in _run_module_as_main
2024-04-29T05:34:34.7235446Z     return _run_code(code, main_globals, None,
2024-04-29T05:34:34.7246241Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\runpy.py", line 87, in _run_code
2024-04-29T05:34:34.7255130Z     exec(code, run_globals)
2024-04-29T05:34:34.7265857Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts\schemachange.exe\__main__.py", line 7, in <module>
2024-04-29T05:34:34.7273977Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 896, in main
2024-04-29T05:34:34.7282715Z     deploy_command(config)
2024-04-29T05:34:34.7292136Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 577, in deploy_command
2024-04-29T05:34:34.7301169Z     session.apply_change_script(script, content, change_history_table)
2024-04-29T05:34:34.7310517Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 462, in apply_change_script
2024-04-29T05:34:34.7319558Z     self.execute_snowflake_query(script_content)
2024-04-29T05:34:34.7336931Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 367, in execute_snowflake_query
2024-04-29T05:34:34.7346539Z     raise e
2024-04-29T05:34:34.7355724Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 360, in execute_snowflake_query
2024-04-29T05:34:34.7364960Z     res = self.con.execute_string(query)
2024-04-29T05:34:34.7374077Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\connection.py", line 861, in execute_string
2024-04-29T05:34:34.7383196Z     ret = list(stream_generator)
2024-04-29T05:34:34.7392363Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\connection.py", line 879, in execute_stream
2024-04-29T05:34:34.7401523Z     cur.execute(sql, _is_put_get=is_put_or_get, **kwargs)
2024-04-29T05:34:34.7410897Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\cursor.py", line 1080, in execute
2024-04-29T05:34:34.7419680Z     Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
2024-04-29T05:34:34.7429705Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
2024-04-29T05:34:34.7438247Z     handed_over = Error.hand_to_other_handler(
2024-04-29T05:34:34.7448153Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
2024-04-29T05:34:34.7457166Z     cursor.errorhandler(connection, cursor, error_class, error_value)
2024-04-29T05:34:34.7466966Z   File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
2024-04-29T05:34:34.7475864Z     raise error_class(
2024-04-29T05:34:34.7484951Z snowflake.connector.errors.ProgrammingError: 001003 (42000): SQL compilation error:
2024-04-29T05:34:34.7495847Z syntax error line 1 at position 0 unexpected '\ufeff-'.
2024-04-29T05:34:34.7501343Z 
2024-04-29T05:34:34.7528544Z fail: #REDACTED#[0]
2024-04-29T05:34:34.7537519Z       SchemaChange failed with exit code 1.
2024-04-29T05:34:34.8940991Z fail: #REDACTED#[0]
2024-04-29T05:34:34.8950908Z       Failed to provision the tenant database.

sfc-gh-tmathew commented 1 month ago

Thank you for reporting the issue. We currently rely on the Snowflake Python connector rather than adding our own checks for various encodings. We will table this for a future release.