Snowflake-Labs / schemachange

A Database Change Management tool for Snowflake
Apache License 2.0
502 stars 224 forks source link

Getting "unexpected '<EOF>'" error when trying to deploy migrations on gitlab #212

Open carley-citian opened 10 months ago

carley-citian commented 10 months ago

Describe the bug Hi! I am getting the following error everytime I try to run my Gitlab CI/CD pipeline with schemachange on a new SQL migrations file:

Traceback (most recent call last):
  File "/usr/local/bin/schemachange", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/schemachange/cli.py", line 896, in main
    deploy_command(config)
  File "/usr/local/lib/python3.9/site-packages/schemachange/cli.py", line 577, in deploy_command
    session.apply_change_script(script, content, change_history_table)
  File "/usr/local/lib/python3.9/site-packages/schemachange/cli.py", line 462, in apply_change_script
    self.execute_snowflake_query(script_content)
  File "/usr/local/lib/python3.9/site-packages/schemachange/cli.py", line 367, in execute_snowflake_query
    raise e
  File "/usr/local/lib/python3.9/site-packages/schemachange/cli.py", line 360, in execute_snowflake_query
    res = self.con.execute_string(query)
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/connection.py", line 792, in execute_string
    ret = list(stream_generator)
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/connection.py", line 810, in execute_stream
    cur.execute(sql, _is_put_get=is_put_or_get, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/cursor.py", line 920, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/errors.py", line 290, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/errors.py", line 345, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/local/lib/python3.9/site-packages/snowflake/connector/errors.py", line 221, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 001003 (42000): SQL compilation error:
syntax error line 9 at position 17 unexpected '<EOF>'.

I have tested the same exact code in a worksheet on Snowflake and can confirm it works. I also have run migration files that worked on my pipeline before, but recently this error is popping up for every file I try to run on the pipeline.

Here are the final few lines of my file for context -- there should be no issue with the code:

-- ops_admin
GRANT ALL ON ALL TABLES IN SCHEMA skeleton_basedata TO role ops_admin;
GRANT all ON ALL SEQUENCES IN SCHEMA skeleton_basedata TO role ops_admin;
GRANT all on FUNCTION skeleton_basedata.st_union_agg(ARRAY) TO role ops_admin;
GRANT all ON PROCEDURE skeleton_basedata.get_nearby_basedata(double precision, double precision, double precision) TO role ops_admin;
GRANT all ON ALL DYNAMIC TABLES IN SCHEMA skeleton_basedata TO role ops_admin;
GRANT all ON ALL VIEWS IN SCHEMA skeleton_basedata TO role ops_admin;

To Reproduce Steps to reproduce the behavior:

  1. Commit the above sql code as a migration
  2. Run pipeline
  3. Scroll down to log where schemachange is attempting to execute code
  4. See error

Expected behavior Expected behavior would be that the code changes are run on snowflake this step, and committed.

Schemachange (please complete the following information):

Additional context Here is the query tag given to my run: SQL query: ALTER SESSION SET QUERY_TAG = 'schemachange 3.6.1;V1.0.3__skeleton_basedata_schema.sql'

Also here is a screenshot of the error I am getting in the log, seems like the trailing semicolon is getting cut off for some reason if that is helpful?: Screenshot 2023-11-17 at 5 26 19 PM

I have been committing the changes manually on Snowflake worksheets for now, but would love to get my pipeline up and working again soon if anyone can help!

sfc-gh-tmathew commented 10 months ago

@carley-citian

Could you try the following and confirm?

  1. Does your pipeline run using schemachange 3.6.0?
  2. What is the version of snowflake-python-connector ?
  3. Also, Can you confirm if the LF and CRLF end of line characters are passed into the CI/CD container correctly?
carley-citian commented 10 months ago

HI @sfc-gh-tmathew !

  1. Yes, it runs using schemachange version 3.6.1
  2. It uses version 3.5.0 of snowflake-python-connector, which is installed as a dependency of schemachange
  3. I am not quite sure what you are asking here. Where would I find this?

If it's helpful I am using Gitlab CI/CD and am using the Python 3.9.18 docker image: https://hub.docker.com/layers/library/python/3.9.18/images/sha256-753586e289a94965eb389ae5397233d32c3cff0f40f81c633dcc00d277012634?context=explore

And here is the script I am using for this pipeline in my .gitlab-ci.yml file:

# install and run schemachange on changes to migrations
  script:
    - echo "Checking python version"
    - python --version
    - echo "Installing schemachange"
    - pip install schemachange --progress-bar off
    - echo "Running schemachange"
    - ls $CI_PROJECT_DIR/migrations/
    - schemachange -f $CI_PROJECT_DIR/migrations/ -a $SF_ACCOUNT -u $SF_USERNAME -r $SF_ROLE -w $SF_WAREHOUSE -d $SF_DATABASE -c $SF_DATABASE.SCHEMACHANGE.CHANGE_HISTORY --create-change-history-table --verbose
sfc-gh-tmathew commented 10 months ago

Thank you @carley-citian for responding.

Regarding the schemachange version, you reported the issue in version 3.6.1. I was asking if you could run the pipelines using the previous release of schemachange (say 3.6.0, 3.5.4 etc). 3.6.1 is the latest release. 3.6.0 was the previous release. Trying to rule out the issue between 3.6.0 and 3.6.1. Could you run your pipeline on a prior version of schemachange?

How are you pushing your changes to Gitlab Project ? through the gitlab web ui or pushed locally through an IDE like Visual Studio Code or some text editor. If using text editor, you can tell whether the text file is using LF (line feed aka unix style new line character) or CRLF(Carriage Return - Line Feed aka Windows style new line character. Also, as a test, could you test with an additional new line after the last line in your SQL and report back the results of your test?

Please share your test results to help narrow down the root cause.

MACKAT05 commented 9 months ago

Another thing that you could check is the exact failure as recorded in the query history.

marczellick-belonghealth commented 5 months ago

I'm experiencing the same issue. Was on schemachange 3.6.0 for a while without any problem, and now this has popped up, without any changes to my venv. Problem is the same on 3.6.0 and also 3.6.1 after upgrading. Tried with snowflake-connector-python versions 3.5.0 and also when upgraded to 3.8.1.

Each line ends with CR/LF.

Several commands in the script work fine, then it fails when trying to create a SQL Language Stored Procedure that starts with a TRUNCATE TABLE statement. It is failing on the semicolon that terminates that statement. I double-checked by moving the semicolon to a separate line, same issue. The create procedure works fine in other contexts, only fails in SCHEMACHANGE.

This is an extremely common code pattern for deployment, I have a standard template for these statements and stored procedures, and they have all deployed fine in the past.

Typically when I've seen errors like this in Schemachange, it's because the entire deployment script ends in a comment, but that's not the case here. I removed all comments from the script and the problem persists.

sfc-gh-tmathew commented 4 months ago

I am unable to reproduce the error. @carley-citian Could you share the gitlab yaml file ?

Can you pin the schemachange version to the last working version in your runner aka to a time before you started facing this error ?

sfc-gh-tmathew commented 4 months ago

The error can be associated with a few scenarios but I have not been able to reproduce it. But found this blog talking about EOF errors related to snowflake but not schemachange. https://www.getcensus.com/blog/how-to-solve-the-unexpected-eof-syntax-error-in-snowflake

https://www.getcensus.com/blog/how-to-solve-the-unexpected-eof-syntax-error-in-snowflake

According to the snowflake python connector docs has method execute_string and execute_stream that have the "remove_comments" parameter. In order to address the comment in the last line, we will have the ability to remove comments from the SQL statement.

Will discuss and see when we can include this.

marczellick-belonghealth commented 4 months ago

I found a cause and a workaround. When creating a SQL Stored Procedure, the outer BEGIN/END commands must be encapsulated. This can be done with $$ or with single quotes (but then all single quotes in the procedure itself have to be doubled).

I had found that in VSCode using the Snowflake plugin, I no longer needed this encapsulation to successfully execute a CREATE PROCEDURE script. But the same script does require it when deploying via schemachange. This is unfortunate, using single quotes to encapsulate means you must modify the code itself. Using $$ is better, but it throws off the syntax coloring in VSCode.

sfc-gh-tmathew commented 4 months ago

Thank you @marczellick-belonghealth for identifying the root cause. Regarding formatting of the stored proc in Visual Studio Code, the latest version of snowflake extension does seem to format the stored procedure code. See screenshot here.

image