mauriciojost opened this issue 11 months ago (Open)
Hi @mauriciojost, did you solve this issue? I encountered the same error a few weeks ago and could not find a solution.
For now everything is working perfectly with this streaming option:

```
...
.option("params.isExecuteQueryWithSyncMode", "true")
...
```

It completely bypasses the async call-status wait section of the code, which is where the original issue occurs (see the stack trace).
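In case it helps, this is roughly where we set it in our writer (`batchDF`, `sfOptions`, and the table name below are placeholders, not our real names):

```scala
// Placeholder writer; sfOptions holds the usual Snowflake connection options
// (sfURL, sfUser, sfPassword, sfDatabase, sfSchema, sfWarehouse, ...).
batchDF.write
  .format("snowflake")
  .options(sfOptions)
  .option("dbtable", "TARGET_TABLE")                    // placeholder table name
  .option("params.isExecuteQueryWithSyncMode", "true")  // forces the synchronous execution path
  .mode("append")
  .save()
```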
@mauriciojost thanks for the update. I also noticed that the root cause of this issue appears to be in the Snowflake JDBC driver, and it was confirmed as something that needs to be improved.
We're using this library, in version v2.12.0-spark_3.4. We're running a Spark 3.4 streaming query with foreachBatch(...) in append mode that writes into Snowflake. The function called by foreachBatch(...) looks something like this:
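(Our exact function isn't reproduced here; this is a minimal sketch with placeholder connection options and table name.)

```scala
import org.apache.spark.sql.DataFrame

// Placeholder connection options; in reality these come from our configuration.
val sfOptions: Map[String, String] = Map(
  "sfURL"       -> "<account>.snowflakecomputing.com",
  "sfUser"      -> "<user>",
  "sfPassword"  -> "<password>",
  "sfDatabase"  -> "<database>",
  "sfSchema"    -> "<schema>",
  "sfWarehouse" -> "<warehouse>"
)

// Invoked by foreachBatch(...): appends each micro-batch into a Snowflake table.
def writeBatch(batchDF: DataFrame, batchId: Long): Unit = {
  batchDF.write
    .format("snowflake")
    .options(sfOptions)
    .option("dbtable", "TARGET_TABLE") // placeholder table name
    .mode("append")
    .save()
}

// Wired into the streaming query roughly like:
//   streamingDF.writeStream.foreachBatch(writeBatch _).start()
```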
While running the query, we observe failures with this stacktrace:
Interestingly, we observe that despite the failures, everything is correctly pushed to Snowflake. The failures occur randomly; we could not correlate them with variables like the number of rows transferred, the duration of the write, etc.
We understand that StageWriter.executeCopyIntoTable(...) (https://github.com/snowflakedb/spark-snowflake/blob/v2.12.0-spark_3.4/src/main/scala/net/snowflake/spark/snowflake/io/StageWriter.scala#L489) can be run in two modes, depending on the value of params.isExecuteQueryWithSyncMode.
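To make the question concrete, this is roughly what the two execution paths look like when driving the Snowflake JDBC driver directly; it is only an illustration based on the driver's public sync/async API, not the connector's actual code:

```scala
import java.sql.Connection
import net.snowflake.client.core.QueryStatus
import net.snowflake.client.jdbc.{SnowflakeResultSet, SnowflakeStatement}

// Illustration only: the sync path just blocks on execute(), while the async path
// submits the query and then polls its status until it stops running. The polling
// side is the kind of "wait for query status" code our stack trace points at.
def runCopyInto(conn: Connection, copyIntoSql: String, syncMode: Boolean): Unit = {
  val stmt = conn.createStatement()
  if (syncMode) {
    stmt.execute(copyIntoSql) // blocks until COPY INTO completes
  } else {
    val rs = stmt.unwrap(classOf[SnowflakeStatement]).executeAsyncQuery(copyIntoSql)
    var status = rs.unwrap(classOf[SnowflakeResultSet]).getStatus
    while (QueryStatus.isStillRunning(status)) {
      Thread.sleep(1000)
      status = rs.unwrap(classOf[SnowflakeResultSet]).getStatus
    }
    if (QueryStatus.isAnError(status)) {
      throw new RuntimeException(s"COPY INTO failed: ${status.getErrorMessage}")
    }
  }
}
```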
Is it safe to set params.isExecuteQueryWithSyncMode to bypass that section of the code? If not, why?

Thanks for your help,
Mauricio & @danjok