snowflakedb / snowpark-python

Snowflake Snowpark Python API
Apache License 2.0

SNOW-1665955: Semicolon breaks DataFrame.count() #2299

Open Tim-Kracht opened 1 month ago

Tim-Kracht commented 1 month ago

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

Python 3.11.10 (main, Sep 7 2024, 01:03:31) [Clang 15.0.0 (clang-1500.3.9.4)]

  2. What operating system and processor architecture are you using?

macOS-14.6.1-arm64-arm-64bit

  3. What are the component versions in the environment (pip freeze)?

asn1crypto==1.5.1 certifi==2024.8.30 cffi==1.17.1 charset-normalizer==3.3.2 cloudpickle==2.2.1 cryptography==43.0.1 filelock==3.16.0 idna==3.8 packaging==24.1 platformdirs==4.3.2 pycparser==2.22 PyJWT==2.9.0 pyOpenSSL==24.2.1 python-dotenv==1.0.1 pytz==2024.2 PyYAML==6.0.2 requests==2.32.3 snowflake-connector-python==3.12.1 snowflake-snowpark-python==1.21.1 sortedcontainers==2.4.0 tomlkit==0.13.2 typing_extensions==4.12.2 urllib3==2.2.2

  4. What did you do?

Called Session.sql() with a query that ends in a semicolon, then called DataFrame.collect() and DataFrame.count() on the resulting DataFrame.

  5. What did you expect to see?

I expected to see the same behavior from a SQL compilation perspective. DataFrame.collect() worked as expected, but DataFrame.count() raised a SQL compilation error. The same query without the semicolon runs fine in both cases.

from dotenv import load_dotenv
import logging
import os
import snowflake.snowpark as sp

def main():
    load_dotenv()
    args = {
        "account": os.getenv("SNOWFLAKE_ACCOUNT"),
        "user": os.getenv("SNOWFLAKE_USER"),
        "authenticator": "externalbrowser",
    }
    queries = (
        "select query_start_time from snowflake.account_usage.access_history limit 1",
        "select query_start_time from snowflake.account_usage.access_history limit 1;",
    )
    session = sp.Session.builder.configs(args).create()

    for query in queries:
        print()
        df = None
        result = None
        count = None

        print(f"{query=}")
        df = session.sql(query=query)

        try:
            result = df.collect()
            print(f"{result=}")
        except Exception as ex:
            print(f"{ex=}")

        try:
            count = df.count()
            print(f"{count=}")
        except Exception as ex:
            print(f"{ex=}")

if __name__ == "__main__":
    for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
        logger = logging.getLogger(logger_name)
        logger.setLevel(logging.DEBUG)
        ch = logging.StreamHandler()
        ch.setLevel(logging.DEBUG)
        ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
        logger.addHandler(ch)
    main()
  6. Can you set logging to DEBUG and collect the logs?
query='select query_start_time from snowflake.account_usage.access_history limit 1'
2024-09-16 15:56:22,571 - MainThread cursor.py:916 - execute() - DEBUG - executing SQL/command
2024-09-16 15:56:22,571 - MainThread cursor.py:931 - execute() - DEBUG - query: [select query_start_time from snowflake.account_usage.access_history limit 1]
2024-09-16 15:56:22,571 - MainThread connection.py:1651 - _next_sequence_counter() - DEBUG - sequence counter: 1
2024-09-16 15:56:22,571 - MainThread cursor.py:641 - _execute_helper() - DEBUG - Request id: x
2024-09-16 15:56:22,571 - MainThread cursor.py:643 - _execute_helper() - DEBUG - running query [select query_start_time from snowflake.account_usage.access_history limit 1]
2024-09-16 15:56:22,571 - MainThread cursor.py:650 - _execute_helper() - DEBUG - is_file_transfer: True
2024-09-16 15:56:22,571 - MainThread connection.py:1312 - cmd_query() - DEBUG - _cmd_query
2024-09-16 15:56:22,571 - MainThread _query_context_cache.py:155 - serialize_to_dict() - DEBUG - serialize_to_dict() called
2024-09-16 15:56:22,571 - MainThread connection.py:1341 - cmd_query() - DEBUG - sql=[select query_start_time from snowflake.account_usage.access_history limit 1], sequence_id=[1], is_file_transfer=[False]
2024-09-16 15:56:22,572 - MainThread network.py:487 - request() - DEBUG - Opentelemtry otel injection failed because of: No module named 'opentelemetry'
2024-09-16 15:56:22,572 - MainThread network.py:1187 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-09-16 15:56:22,572 - MainThread network.py:886 - _request_exec_wrapper() - DEBUG - remaining request timeout: N/A ms, retry cnt: 1
2024-09-16 15:56:22,572 - MainThread network.py:868 - add_request_guid() - DEBUG - Request guid: x
2024-09-16 15:56:22,572 - MainThread network.py:1046 - _request_exec() - DEBUG - socket timeout: 60
2024-09-16 15:56:25,086 - MainThread connectionpool.py:474 - _make_request() - DEBUG - https://x.snowflakecomputing.com:443 "POST /queries/v1/query-request?requestId=x&request_guid=x HTTP/1.1" 200 None
2024-09-16 15:56:25,087 - MainThread network.py:1073 - _request_exec() - DEBUG - SUCCESS
2024-09-16 15:56:25,088 - MainThread network.py:1192 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 0/1 active sessions
2024-09-16 15:56:25,088 - MainThread network.py:750 - _post_request() - DEBUG - ret[code] = None, after post request
2024-09-16 15:56:25,088 - MainThread network.py:776 - _post_request() - DEBUG - Query id: x
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:191 - deserialize_json_dict() - DEBUG - deserialize_json_dict() called: data from server: {'entries': [{'id': 0, 'timestamp': 1726516585060215, 'priority': 0}]}
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:232 - deserialize_json_dict() - DEBUG - deserialize {'id': 0, 'timestamp': 1726516585060215, 'priority': 0}
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:101 - _sync_priority_map() - DEBUG - sync_priority_map called priority_map size = 0, new_priority_map size = 1
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:127 - trim_cache() - DEBUG - trim_cache() called. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:136 - trim_cache() - DEBUG - trim_cache() returns. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:271 - deserialize_json_dict() - DEBUG - deserialize_json_dict() returns
2024-09-16 15:56:25,088 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516585060215, 0)
2024-09-16 15:56:25,088 - MainThread cursor.py:990 - execute() - DEBUG - sfqid: x
2024-09-16 15:56:25,088 - MainThread cursor.py:996 - execute() - DEBUG - query execution done
2024-09-16 15:56:25,088 - MainThread cursor.py:1010 - execute() - DEBUG - SUCCESS
2024-09-16 15:56:25,088 - MainThread cursor.py:1029 - execute() - DEBUG - PUT OR GET: False
2024-09-16 15:56:25,088 - MainThread cursor.py:1142 - _init_result_and_meta() - DEBUG - Query result format: arrow
2024-09-16 15:56:25,089 - MainThread cursor.py:1156 - _init_result_and_meta() - INFO - Number of results in first chunk: 1
2024-09-16 15:56:25,089 - MainThread server_connection.py:421 - run_query() - DEBUG - Execute query [queryID: x] select query_start_time from snowflake.account_usage.access_history limit 1
2024-09-16 15:56:25,089 - MainThread result_batch.py:68 - _create_nanoarrow_iterator() - DEBUG - Using nanoarrow as the arrow data converter
2024-09-16 15:56:25,089 - MainThread CArrowIterator.cpp:120 - CArrowIterator() - DEBUG - Arrow BatchSize: 1
2024-09-16 15:56:25,089 - MainThread CArrowChunkIterator.cpp:46 - CArrowChunkIterator() - DEBUG - Arrow chunk info: batchCount 1, columnCount 1, use_numpy: 0
2024-09-16 15:56:25,089 - MainThread nanoarrow_arrow_iterator.cpython-311-darwin.so:0 - __cinit__() - DEBUG - Batches read: 0
2024-09-16 15:56:25,089 - MainThread result_set.py:87 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2024-09-16 15:56:25,089 - MainThread CArrowChunkIterator.cpp:70 - next() - DEBUG - Current batch index: 0, rows in current batch: 1
result=[Row(QUERY_START_TIME=datetime.datetime(2023, 9, 1, 6, 27, 7, 973000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>))]
2024-09-16 15:56:26,071 - MainThread cursor.py:916 - execute() - DEBUG - executing SQL/command
2024-09-16 15:56:26,071 - MainThread cursor.py:931 - execute() - DEBUG - query: [SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...]
2024-09-16 15:56:26,071 - MainThread connection.py:1651 - _next_sequence_counter() - DEBUG - sequence counter: 2
2024-09-16 15:56:26,071 - MainThread cursor.py:641 - _execute_helper() - DEBUG - Request id: x
2024-09-16 15:56:26,071 - MainThread cursor.py:643 - _execute_helper() - DEBUG - running query [SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...]
2024-09-16 15:56:26,071 - MainThread cursor.py:650 - _execute_helper() - DEBUG - is_file_transfer: True
2024-09-16 15:56:26,071 - MainThread connection.py:1312 - cmd_query() - DEBUG - _cmd_query
2024-09-16 15:56:26,071 - MainThread _query_context_cache.py:155 - serialize_to_dict() - DEBUG - serialize_to_dict() called
2024-09-16 15:56:26,071 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516585060215, 0)
2024-09-16 15:56:26,071 - MainThread _query_context_cache.py:180 - serialize_to_dict() - DEBUG - serialize_to_dict(): data to send to server {'entries': [{'id': 0, 'timestamp': 1726516585060215, 'priority': 0, 'context': {}}]}
2024-09-16 15:56:26,071 - MainThread connection.py:1341 - cmd_query() - DEBUG - sql=[SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...], sequence_id=[2], is_file_transfer=[False]
2024-09-16 15:56:26,072 - MainThread network.py:487 - request() - DEBUG - Opentelemtry otel injection failed because of: No module named 'opentelemetry'
2024-09-16 15:56:26,072 - MainThread network.py:1187 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-09-16 15:56:26,072 - MainThread network.py:886 - _request_exec_wrapper() - DEBUG - remaining request timeout: N/A ms, retry cnt: 1
2024-09-16 15:56:26,072 - MainThread network.py:868 - add_request_guid() - DEBUG - Request guid: x
2024-09-16 15:56:26,072 - MainThread network.py:1046 - _request_exec() - DEBUG - socket timeout: 60
2024-09-16 15:56:28,213 - MainThread connectionpool.py:474 - _make_request() - DEBUG - https://x.snowflakecomputing.com:443 "POST /queries/v1/query-request?requestId=x&request_guid=x HTTP/1.1" 200 None
2024-09-16 15:56:28,214 - MainThread network.py:1073 - _request_exec() - DEBUG - SUCCESS
2024-09-16 15:56:28,215 - MainThread network.py:1192 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 0/1 active sessions
2024-09-16 15:56:28,215 - MainThread network.py:750 - _post_request() - DEBUG - ret[code] = None, after post request
2024-09-16 15:56:28,215 - MainThread network.py:776 - _post_request() - DEBUG - Query id: x
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:191 - deserialize_json_dict() - DEBUG - deserialize_json_dict() called: data from server: {'entries': [{'id': 0, 'timestamp': 1726516588190659, 'priority': 0}]}
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516585060215, 0)
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:232 - deserialize_json_dict() - DEBUG - deserialize {'id': 0, 'timestamp': 1726516588190659, 'priority': 0}
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:101 - _sync_priority_map() - DEBUG - sync_priority_map called priority_map size = 0, new_priority_map size = 1
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:127 - trim_cache() - DEBUG - trim_cache() called. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:136 - trim_cache() - DEBUG - trim_cache() returns. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:271 - deserialize_json_dict() - DEBUG - deserialize_json_dict() returns
2024-09-16 15:56:28,215 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516588190659, 0)
2024-09-16 15:56:28,216 - MainThread cursor.py:990 - execute() - DEBUG - sfqid: x
2024-09-16 15:56:28,216 - MainThread cursor.py:996 - execute() - DEBUG - query execution done
2024-09-16 15:56:28,216 - MainThread cursor.py:1010 - execute() - DEBUG - SUCCESS
2024-09-16 15:56:28,216 - MainThread cursor.py:1029 - execute() - DEBUG - PUT OR GET: False
2024-09-16 15:56:28,216 - MainThread cursor.py:1142 - _init_result_and_meta() - DEBUG - Query result format: arrow
2024-09-16 15:56:28,216 - MainThread cursor.py:1156 - _init_result_and_meta() - INFO - Number of results in first chunk: 1
2024-09-16 15:56:28,216 - MainThread server_connection.py:421 - run_query() - DEBUG - Execute query [queryID: x]  SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowflake.account_usage.access_history limit 1) LIMIT 1
2024-09-16 15:56:28,216 - MainThread result_batch.py:68 - _create_nanoarrow_iterator() - DEBUG - Using nanoarrow as the arrow data converter
2024-09-16 15:56:28,216 - MainThread CArrowIterator.cpp:120 - CArrowIterator() - DEBUG - Arrow BatchSize: 1
2024-09-16 15:56:28,216 - MainThread CArrowChunkIterator.cpp:46 - CArrowChunkIterator() - DEBUG - Arrow chunk info: batchCount 1, columnCount 1, use_numpy: 0
2024-09-16 15:56:28,216 - MainThread nanoarrow_arrow_iterator.cpython-311-darwin.so:0 - __cinit__() - DEBUG - Batches read: 0
2024-09-16 15:56:28,216 - MainThread result_set.py:87 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2024-09-16 15:56:28,216 - MainThread CArrowChunkIterator.cpp:70 - next() - DEBUG - Current batch index: 0, rows in current batch: 1
count=1

query='select query_start_time from snowflake.account_usage.access_history limit 1;'
2024-09-16 15:56:28,218 - MainThread cursor.py:916 - execute() - DEBUG - executing SQL/command
2024-09-16 15:56:28,218 - MainThread cursor.py:931 - execute() - DEBUG - query: [select query_start_time from snowflake.account_usage.access_history limit 1;]
2024-09-16 15:56:28,218 - MainThread connection.py:1651 - _next_sequence_counter() - DEBUG - sequence counter: 3
2024-09-16 15:56:28,218 - MainThread cursor.py:641 - _execute_helper() - DEBUG - Request id: x
2024-09-16 15:56:28,218 - MainThread cursor.py:643 - _execute_helper() - DEBUG - running query [select query_start_time from snowflake.account_usage.access_history limit 1;]
2024-09-16 15:56:28,218 - MainThread cursor.py:650 - _execute_helper() - DEBUG - is_file_transfer: True
2024-09-16 15:56:28,218 - MainThread connection.py:1312 - cmd_query() - DEBUG - _cmd_query
2024-09-16 15:56:28,218 - MainThread _query_context_cache.py:155 - serialize_to_dict() - DEBUG - serialize_to_dict() called
2024-09-16 15:56:28,218 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516588190659, 0)
2024-09-16 15:56:28,219 - MainThread _query_context_cache.py:180 - serialize_to_dict() - DEBUG - serialize_to_dict(): data to send to server {'entries': [{'id': 0, 'timestamp': 1726516588190659, 'priority': 0, 'context': {}}]}
2024-09-16 15:56:28,219 - MainThread connection.py:1341 - cmd_query() - DEBUG - sql=[select query_start_time from snowflake.account_usage.access_history limit 1;], sequence_id=[3], is_file_transfer=[False]
2024-09-16 15:56:28,220 - MainThread network.py:487 - request() - DEBUG - Opentelemtry otel injection failed because of: No module named 'opentelemetry'
2024-09-16 15:56:28,220 - MainThread network.py:1187 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-09-16 15:56:28,220 - MainThread network.py:886 - _request_exec_wrapper() - DEBUG - remaining request timeout: N/A ms, retry cnt: 1
2024-09-16 15:56:28,220 - MainThread network.py:868 - add_request_guid() - DEBUG - Request guid: x
2024-09-16 15:56:28,220 - MainThread network.py:1046 - _request_exec() - DEBUG - socket timeout: 60
2024-09-16 15:56:30,058 - MainThread connectionpool.py:474 - _make_request() - DEBUG - https://x.snowflakecomputing.com:443 "POST /queries/v1/query-request?requestId=x&request_guid=x HTTP/1.1" 200 None
2024-09-16 15:56:30,060 - MainThread network.py:1073 - _request_exec() - DEBUG - SUCCESS
2024-09-16 15:56:30,060 - MainThread network.py:1192 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 0/1 active sessions
2024-09-16 15:56:30,060 - MainThread network.py:750 - _post_request() - DEBUG - ret[code] = None, after post request
2024-09-16 15:56:30,060 - MainThread network.py:776 - _post_request() - DEBUG - Query id: x
2024-09-16 15:56:30,060 - MainThread _query_context_cache.py:191 - deserialize_json_dict() - DEBUG - deserialize_json_dict() called: data from server: {'entries': [{'id': 0, 'timestamp': 1726516590034491, 'priority': 0}]}
2024-09-16 15:56:30,060 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516588190659, 0)
2024-09-16 15:56:30,060 - MainThread _query_context_cache.py:232 - deserialize_json_dict() - DEBUG - deserialize {'id': 0, 'timestamp': 1726516590034491, 'priority': 0}
2024-09-16 15:56:30,061 - MainThread _query_context_cache.py:101 - _sync_priority_map() - DEBUG - sync_priority_map called priority_map size = 0, new_priority_map size = 1
2024-09-16 15:56:30,061 - MainThread _query_context_cache.py:127 - trim_cache() - DEBUG - trim_cache() called. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:30,061 - MainThread _query_context_cache.py:136 - trim_cache() - DEBUG - trim_cache() returns. treeSet size is 1 and cache capacity is 5
2024-09-16 15:56:30,061 - MainThread _query_context_cache.py:271 - deserialize_json_dict() - DEBUG - deserialize_json_dict() returns
2024-09-16 15:56:30,061 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516590034491, 0)
2024-09-16 15:56:30,061 - MainThread cursor.py:990 - execute() - DEBUG - sfqid: x
2024-09-16 15:56:30,061 - MainThread cursor.py:996 - execute() - DEBUG - query execution done
2024-09-16 15:56:30,061 - MainThread cursor.py:1010 - execute() - DEBUG - SUCCESS
2024-09-16 15:56:30,061 - MainThread cursor.py:1029 - execute() - DEBUG - PUT OR GET: False
2024-09-16 15:56:30,061 - MainThread cursor.py:1142 - _init_result_and_meta() - DEBUG - Query result format: arrow
2024-09-16 15:56:30,061 - MainThread cursor.py:1156 - _init_result_and_meta() - INFO - Number of results in first chunk: 1
2024-09-16 15:56:30,062 - MainThread server_connection.py:421 - run_query() - DEBUG - Execute query [queryID: x] select query_start_time from snowflake.account_usage.access_history limit 1;
2024-09-16 15:56:30,062 - MainThread result_batch.py:68 - _create_nanoarrow_iterator() - DEBUG - Using nanoarrow as the arrow data converter
2024-09-16 15:56:30,062 - MainThread CArrowIterator.cpp:120 - CArrowIterator() - DEBUG - Arrow BatchSize: 1
2024-09-16 15:56:30,062 - MainThread CArrowChunkIterator.cpp:46 - CArrowChunkIterator() - DEBUG - Arrow chunk info: batchCount 1, columnCount 1, use_numpy: 0
2024-09-16 15:56:30,062 - MainThread nanoarrow_arrow_iterator.cpython-311-darwin.so:0 - __cinit__() - DEBUG - Batches read: 0
2024-09-16 15:56:30,062 - MainThread result_set.py:87 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2024-09-16 15:56:30,062 - MainThread CArrowChunkIterator.cpp:70 - next() - DEBUG - Current batch index: 0, rows in current batch: 1
result=[Row(QUERY_START_TIME=datetime.datetime(2023, 9, 9, 10, 2, 48, 666000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>))]
2024-09-16 15:56:30,064 - MainThread cursor.py:916 - execute() - DEBUG - executing SQL/command
2024-09-16 15:56:30,064 - MainThread cursor.py:931 - execute() - DEBUG - query: [SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...]
2024-09-16 15:56:30,064 - MainThread connection.py:1651 - _next_sequence_counter() - DEBUG - sequence counter: 4
2024-09-16 15:56:30,064 - MainThread cursor.py:641 - _execute_helper() - DEBUG - Request id: x
2024-09-16 15:56:30,064 - MainThread cursor.py:643 - _execute_helper() - DEBUG - running query [SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...]
2024-09-16 15:56:30,064 - MainThread cursor.py:650 - _execute_helper() - DEBUG - is_file_transfer: True
2024-09-16 15:56:30,064 - MainThread connection.py:1312 - cmd_query() - DEBUG - _cmd_query
2024-09-16 15:56:30,064 - MainThread _query_context_cache.py:155 - serialize_to_dict() - DEBUG - serialize_to_dict() called
2024-09-16 15:56:30,064 - MainThread _query_context_cache.py:276 - log_cache_entries() - DEBUG - Cache Entry: (0, 1726516590034491, 0)
2024-09-16 15:56:30,065 - MainThread _query_context_cache.py:180 - serialize_to_dict() - DEBUG - serialize_to_dict(): data to send to server {'entries': [{'id': 0, 'timestamp': 1726516590034491, 'priority': 0, 'context': {}}]}
2024-09-16 15:56:30,065 - MainThread connection.py:1341 - cmd_query() - DEBUG - sql=[SELECT count(1) AS "COUNT(LITERAL())" FROM (select query_start_time from snowfla...], sequence_id=[4], is_file_transfer=[False]
2024-09-16 15:56:30,065 - MainThread network.py:487 - request() - DEBUG - Opentelemtry otel injection failed because of: No module named 'opentelemetry'
2024-09-16 15:56:30,066 - MainThread network.py:1187 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-09-16 15:56:30,066 - MainThread network.py:886 - _request_exec_wrapper() - DEBUG - remaining request timeout: N/A ms, retry cnt: 1
2024-09-16 15:56:30,066 - MainThread network.py:868 - add_request_guid() - DEBUG - Request guid: x
2024-09-16 15:56:30,066 - MainThread network.py:1046 - _request_exec() - DEBUG - socket timeout: 60
2024-09-16 15:56:30,205 - MainThread connectionpool.py:474 - _make_request() - DEBUG - https://x.snowflakecomputing.com:443 "POST /queries/v1/query-request?requestId=x&request_guid=x HTTP/1.1" 200 None
2024-09-16 15:56:30,207 - MainThread network.py:1073 - _request_exec() - DEBUG - SUCCESS
2024-09-16 15:56:30,207 - MainThread network.py:1192 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 0/1 active sessions
2024-09-16 15:56:30,207 - MainThread network.py:750 - _post_request() - DEBUG - ret[code] = 001003, after post request
2024-09-16 15:56:30,207 - MainThread network.py:776 - _post_request() - DEBUG - Query id: x
2024-09-16 15:56:30,207 - MainThread cursor.py:990 - execute() - DEBUG - sfqid: x
2024-09-16 15:56:30,207 - MainThread cursor.py:996 - execute() - DEBUG - query execution done
2024-09-16 15:56:30,207 - MainThread cursor.py:1071 - execute() - DEBUG - {'data': {'internalError': False, 'unredactedFromSecureObject': False, 'errorCode': '001003', 'age': 0, 'sqlState': '42000', 'queryId': 'x', 'line': -1, 'pos': -1, 'type': 'COMPILATION'}, 'code': '001003', 'message': "SQL compilation error:\nsyntax error line 1 at position 119 unexpected ';'.", 'success': False, 'headers': None}
ex=SnowparkSQLException("001003 (42000): x: SQL compilation error:\nsyntax error line 1 at position 119 unexpected ';'.", '1304', 'x')
2024-09-16 15:56:30,257 - MainThread connection.py:788 - close() - INFO - closed
2024-09-16 15:56:30,257 - MainThread telemetry.py:211 - close() - DEBUG - Closing telemetry client.
2024-09-16 15:56:30,257 - MainThread connection.py:794 - close() - INFO - No async queries seem to be running, deleting session
2024-09-16 15:56:30,257 - MainThread network.py:1187 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-09-16 15:56:30,257 - MainThread network.py:886 - _request_exec_wrapper() - DEBUG - remaining request timeout: 5000 ms, retry cnt: 1
2024-09-16 15:56:30,257 - MainThread network.py:868 - add_request_guid() - DEBUG - Request guid: x
2024-09-16 15:56:30,257 - MainThread network.py:1046 - _request_exec() - DEBUG - socket timeout: 60
2024-09-16 15:56:30,431 - MainThread connectionpool.py:474 - _make_request() - DEBUG - https://x.snowflakecomputing.com:443 "POST /session?delete=true&request_guid=x HTTP/1.1" 200 None
2024-09-16 15:56:30,433 - MainThread network.py:1073 - _request_exec() - DEBUG - SUCCESS
2024-09-16 15:56:30,433 - MainThread network.py:1192 - _use_requests_session() - DEBUG - Session status for SessionPool 'x.snowflakecomputing.com', SessionPool 0/1 active sessions
2024-09-16 15:56:30,433 - MainThread network.py:750 - _post_request() - DEBUG - ret[code] = None, after post request
2024-09-16 15:56:30,437 - MainThread _query_context_cache.py:141 - clear_cache() - DEBUG - clear_cache() called
2024-09-16 15:56:30,437 - MainThread connection.py:807 - close() - DEBUG - Session is closed
2024-09-16 15:56:30,438 - MainThread session.py:594 - close() - DEBUG - No-op because session x had been previously closed.
2024-09-16 15:56:30,438 - MainThread connection.py:779 - close() - DEBUG - Rest object has been destroyed, cannot close session
2024-09-16 15:56:30,438 - MainThread session.py:607 - close() - INFO - Closed session: x
sfc-gh-sghosh commented 1 month ago

Hello @Tim-Kracht ,

Thanks for raising the issue, we are looking into it, will update.

Regards, Sujan

sfc-gh-sghosh commented 1 month ago

Hello @Tim-Kracht ,

The semicolon (;) is typically used to terminate SQL statements in interactive environments like SnowSQL, Snowflake Worksheets, or UI-based tools, where multiple queries can be executed in sequence.

However, when executing SQL queries programmatically (such as through the Snowpark API, Python connectors, or other programmatic interfaces), each call carries a single statement, so the semicolon is not required, and including it can cause syntax errors when the statement is embedded in a generated query.
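A minimal sketch of why collect() succeeds while count() fails (the wrapper template below is inferred from the debug log in this issue, not taken from Snowpark internals):

```python
# The original query, with a trailing semicolon, as submitted via Session.sql().
original = (
    "select query_start_time from snowflake.account_usage.access_history limit 1;"
)

# DataFrame.collect() sends the text as-is, so the server accepts it.
# DataFrame.count() wraps the text in a subquery (template inferred from the
# "Execute query" line in the debug log):
wrapped = f'SELECT count(1) AS "COUNT(LITERAL())" FROM ({original}) LIMIT 1'

# The ';' now sits inside the parentheses of the subquery, which the server
# rejects with error 001003: syntax error ... unexpected ';'.
print(wrapped)
```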

To fix the issue, remove the semicolon from the query, or strip it programmatically with `clean_query = query.rstrip(';')`:

queries = (
    "select query_start_time from snowflake.account_usage.access_history limit 1",
    "select query_start_time from snowflake.account_usage.access_history limit 1;"
)

for query in queries:
    print()
    df = None
    result = None
    count = None

    # Remove trailing semicolon if it exists
    clean_query = query.rstrip(';')

    print(f"{clean_query=}")
    df = session.sql(clean_query)

    try:
        result = df.collect()
        print(f"{result=}")
    except Exception as ex:
        print(f"{ex=}")

    try:
        count = df.count()
        print(f"{count=}")
    except Exception as ex:
        print(f"{ex=}")
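If the input may also carry trailing whitespace or repeated terminators, a slightly more defensive strip helps; `strip_trailing_semicolon` below is a hypothetical helper, not part of the Snowpark API:

```python
def strip_trailing_semicolon(sql: str) -> str:
    """Remove trailing semicolons and surrounding whitespace from a single
    SQL statement before passing it to Session.sql().
    Illustrative helper, not part of the Snowpark API."""
    return sql.strip().rstrip(";").rstrip()
```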

Regards, Sujan