Closed scottfleming closed 4 months ago
Hello @scottfleming ,
Thanks for raising the issue, we are looking into it, will update.
Regards, Sujan
Hello @scottfleming ,
I tried to reproduce the issue with Python connector 3.6.0 and Python 3.11, but it did not throw any collision with stage names.
Could you share the code snippet to reproduce the issue.
`import random import snowflake.connector import logging
for logger_name in ['snowflake.connector', 'botocore', 'boto3']: logger = logging.getLogger(logger_name) logger.setLevel(logging.DEBUG) ch = logging.FileHandler('/Users/sghosh/IdeaProjects/Python_prj/tmp/seed.log') ch.setLevel(logging.DEBUG) ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s')) logger.addHandler(ch)
def reset_seed(): random.seed(42)
def another_function(): reset_seed() # Reset seed before calling write_pandas
def simulate_data_processing(connection): cursor = connection.cursor() try: cursor.execute("CREATE or replace table my_table (column1 INT, column2 STRING)") data = [(1, 'a'), (2, 'b'), (3, 'c')]
for row in data:
cursor.execute("INSERT INTO my_table VALUES (%s, %s)", row)
connection.commit() # Commit the transaction
# Fetch values from the table
cursor.execute("SELECT * FROM my_table")
rows = cursor.fetchall()
print("Values from the table:")
for row in rows:
print(row)
finally:
cursor.close()
def main():
random.seed(123)
# Connect to Snowflake
conn = snowflake.connector.connect(
user='xxx',
password='xxxx',
account='xxxx',
warehouse='SUJAN_WH',
database='SAMPLEDATABASE',
schema='TEST'
)
for i in range(5): # Repeat the process multiple times
another_function() # Call another function that resets the seed
simulate_data_processing(conn) # Simulate data processing
conn.close() # Close the Snowflake connection
if name == "main": main() `
Regards, Sujan
Hello @scottfleming ,
Closing the issue as we did not receive any response, and the issue is not getting reproduced. You can reopen when able to reproduce and provide the required artifacts.
Regards, Sujan
Python version
Python 3.11.6 (main, Oct 8 2023, 05:06:43) [GCC 13.2.0]
Operating system and processor architecture
Linux-6.6.16-linuxkit-aarch64-with-glibc2.38
Installed packages
What did you do?
I call
write_pandas
in a loop, but there's another function from a different package in that same loop that resets the global seed to a specific state (I recognize this is poor form and am working on a fix for that separately).The problem is that
snowflake-connector-python
's_create_temp_stage
function callsrandom_string
to generate temporary stage names, over which the user does not have control:https://github.com/snowflakedb/snowflake-connector-python/blob/8cdca67faf3fa2a6d31a545d78326b76c31c4c64/src/snowflake/connector/pandas_tools.py#L95
random_string
in turn relies on therandom
module:https://github.com/snowflakedb/snowflake-connector-python/blob/8cdca67faf3fa2a6d31a545d78326b76c31c4c64/test/randomize.py#L30
so if the user sets the seed deterministically outside the scope of calling
write_pandas
then this will result in stage name collisions throwing an opaque error such as the following:which triggers on line 81 in
_do_create_temp_stage
: https://github.com/snowflakedb/snowflake-connector-python/blob/8cdca67faf3fa2a6d31a545d78326b76c31c4c64/src/snowflake/connector/pandas_tools.py#L81What did you expect to see?
What I wish would have happened instead is that there would be one of the following:
stage_name
Can you set logging to DEBUG and collect the logs?
No response