spark-redshift-community / spark-redshift

Performant Redshift data source for Apache Spark
Apache License 2.0
136 stars 62 forks

Print unloadSql and copySql statement #162

Closed melin closed 1 month ago

melin commented 1 month ago

Sometimes I need to analyze task performance and see the pushdown conditions in the generated SQL. These statements are not easy to find in the Redshift console.

bsharifi commented 1 month ago

@melin, the connector does not log this information because it could contain sensitive customer information. However, you can enable JDBC driver logging in Spark, which will show the generated SQL statements (and other JDBC information) with sensitive information masked:

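// rsOptions: your existing connector options (connection URL, tempdir, credentials, ...)
val df = sqlContext.read
  .format("io.github.spark_redshift_community.spark.redshift")
  .options(rsOptions)
  // Options prefixed with "jdbc." are passed through to the underlying JDBC driver
  .option("jdbc.LogLevel", "6")
  // Write the driver log to standard output so it appears in the Spark logs
  .option("jdbc.LogPath", "/dev/stdout")
  .option("dbtable", "MyTable")
  .load()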