Closed stibu81 closed 2 weeks ago
Hello @stibu81 ,
Thanks for raising the request. Could you clarify the requirement?
At present we know that when write_pandas() is called, the function does not write the custom index to the Snowflake table. Instead, it ignores the index entirely because it is not a standard RangeIndex. In both cases, whether you have a custom index or a standard RangeIndex, the write_pandas() function does not save the index to the Snowflake table.
In case if custom index, it will throw a warning and index will not be written. Example:
`pandas_df = pd.DataFrame([(1, "Steve"), (2, "Bob")], columns=["id", "name"]) pandas_df.index = ['row1', 'row2'] # Custom index
session.write_pandas(pandas_df, "WRITE_PANDAS_TABLE2", auto_create_table=True, overwrite=True)
result_df=session.sql("SELECT * FROM WRITE_PANDAS_TABLE2").to_pandas() print(result_df)
output: UserWarning: Pandas Dataframe has non-standard index of type <class 'pandas.core.indexes.base.Index'> which will not be written. Consider changing the index to pd.RangeIndex(start=0,...,step=1) or call reset_index() to keep index as column(s) id name 0 1 Steve 1 2 Bob`
In case if RangeIndex, there will be no warning but index will be ignored. `pandas_df = pd.DataFrame([(1, "Steve"), (2, "Bob")], columns=["id", "name"]) session.write_pandas(pandas_df, "WRITE_PANDAS_TABLE2", auto_create_table=True, overwrite=True)
result_df=session.sql("SELECT * FROM WRITE_PANDAS_TABLE2").to_pandas() print(result_df)
output: id name 0 1 Steve 1 2 Bob`
Now whats your requirement with new option write_index=False ? you want an option to explicitly disable writing the index, even if it is a RangeIndex or if the DataFrame has no custom indexes?
Regards, Sujan
Hi Sujan
Thanks for you reply and the clarification. Indeed, using your examples I realised that no index is written. I made a simple mistake by forgetting that reset_index()
not only resets the index, but also adds it as a column to the DataFrame. And then, of course, that column is written to snowflake.
So, in the end, write_pandas()
is doing exactly what I would expect it to do and I was simply not careful enough. I'm sorry for the unnecessary confusion that I caused.
Best regards, Stefan
What is the current behavior?
The function
write_pandas()
writes the index of the pandas DataFrame into a column "INDEX", if the index has the expected format. (Otherwise, there is a warning and the index is not written.)What is the desired behavior?
It should be possible to set an argument (e.g., write_index=False) such that the index is never written.
How would this improve
snowflake-connector-python
?This would allow the user to not write the index, if it is not needed.
References and other background
No response