snowflakedb / snowpark-python

Snowflake Snowpark Python API
Apache License 2.0
269 stars 111 forks source link

SNOW-1565789: `Column.isin()` fails with empty list #1999

Open NickFishr opened 3 months ago

NickFishr commented 3 months ago
  1. What version of Python are you using?

    Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]

  2. What operating system and processor architecture are you using?

    • Linux-5.19.0-46-generic-x86_64-with-glibc2.35
    • Web Console
  3. What are the component versions in the environment (pip freeze)?

    ...
    pandas==2.2.2
    pyarrow==17.0.0
    pytest==7.4.4
    requests==2.32.3
    snowflake-connector-python==3.11.0
    snowflake-snowpark-python==1.20.0
    ...
  4. What did you do?

    Create a dataframe and a apply filter() with an isin() subquery (whose result was empty).

    import snowflake.snowpark as snowpark
    
    def main(session: snowpark.Session): 
       df1 = session.create_dataframe([1, 2, 3, 4]).to_df("a")
       df2 = session.create_dataframe([1, 2, 3, 4]).to_df("b")
    
       # Lazy evaluation of a subquery with empty result set --> OK
       df3 = df1.filter(df1["a"].isin(df2[df2["b"] > 4]))
       df3.collect()
    
       # None --> OK
       df4 = df1.filter(df1["a"].isin(None))
       df4.collect()
    
       # Empty list --> EXCEPTION
       df5 = df1.filter(df1["a"].isin([]))
       df5.collect()
    
       # Collecting the subquery which yields an empty result is essentially the same
       #df5 = df1.filter(df1["a"].isin((df2[df2["b"] > 4]).collect()))
       #df5.collect()

    This problem does not occurr in local testing mode.

  5. What did you expect to see?

    An empty dataframe (since the intersection of any set with an empty set is again the empty set).

  6. Can you set logging to DEBUG and collect the logs?

    This is the output from the Python worksheet run in the web console (without the additional debug logs):

  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 123, in wrap
    raise ex
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 117, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 417, in run_query
    raise ex
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 402, in run_query
    results_cursor = self.execute_and_notify_query_listener(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 354, in execute_and_notify_query_listener
    results_cursor = self._cursor.execute(query, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/cursor.py", line 1016, in execute
    Error.errorhandler_wrapper(
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLException: (1304): <REDACTED>: 001003 (42000): SQL compilation error:
syntax error line 1 at position 138 unexpected ')'.

The failing generated SQL query was:

SELECT  *  FROM ( SELECT "_1" AS "A" FROM ( SELECT $1 AS "_1" FROM  VALUES (1 :: INT), (2 :: INT), (3 :: INT), (4 :: INT))) WHERE "A" IN () LIMIT 10

With the empty brackets after the IN at the end being the problem.

sfc-gh-sghosh commented 3 months ago

Hello @NickFishr ,

Thanks for raising the issue, we confirm the issue is with regular session and its throwing exception will work on eliminating it.

SnowparkSQLException: (1304): 01b60cf5-080a-ecc4-0000-164925ab06da: 001003 (42000): SQL compilation error: syntax error line 1 at position 138 unexpected ')'.

Regards, Sujan