Hello,
Is it possible to use the Apache Spark Connector for SQL Server with Metorikku?
https://github.com/microsoft/sql-spark-connector
I'm trying to create a custom UDF that I can pass a dataframe to, which then loads that dataframe into a SQL Server table. This is what I have so far. I created custom code with a run function that will live in the custom jar:
object SomeObject {
  def run(ss: org.apache.spark.sql.SparkSession,
          metricName: String,
          dataFrameName: String,
          params: Option[Map[String, String]]): Unit = {
    val serverName = "jdbc:sqlserver://{SERVER_ADDR}"
    val databaseName = "database_name"
    val url = s"$serverName;databaseName=$databaseName;"
    val tableName = "table_name"
    val username = "username"
    val password = "password"

    // df_name_here is the part I'm stuck on: I don't know how to get
    // the actual DataFrame into this function.
    df_name_here.write
      .format("com.microsoft.sqlserver.jdbc.spark")
      .mode("overwrite")
      .option("url", url)
      .option("dbtable", tableName)
      .option("user", username)
      .option("password", password)
      .save()
  }
}
Note: the reason I'm using custom code is so that I can use the "com.microsoft.sqlserver.jdbc.spark" format.
Can you please help me figure out how to pass the dataframe to the function, so I can replace the df_name_here placeholder with it?
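For context, one idea I had — purely a sketch under my own assumptions: if Metorikku registers each step's dataframe as a temporary view on the shared SparkSession (which seems to be how its SQL steps reference each other), I could look the dataframe up by name inside run. The "inputTable" param key and "input_step" name below are hypothetical, not anything Metorikku itself defines:

import org.apache.spark.sql.{DataFrame, SparkSession}

object SomeObject {
  def run(ss: SparkSession, metricName: String, dataFrameName: String,
          params: Option[Map[String, String]]): Unit = {
    // Assumption: an earlier step's output is registered as a temp view,
    // so it can be looked up by name on the shared session.
    // "inputTable" / "input_step" are hypothetical placeholders.
    val inputName = params.flatMap(_.get("inputTable")).getOrElse("input_step")
    val df: DataFrame = ss.table(inputName)
    // ...then write df with format("com.microsoft.sqlserver.jdbc.spark") as above
  }
}

If that assumption about temp views is wrong, that's exactly what I'm hoping someone can confirm.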
Alternatively, can I use the standard JDBC output and specify format("com.microsoft.sqlserver.jdbc.spark")?
Not sure if that's possible, since it takes the value from the driver: https://github.com/YotpoLtd/metorikku/blob/master/src/main/scala/com/yotpo/metorikku/output/writers/jdbc/JDBCOutputWriter.scala#L42
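For reference, here is my understanding of how the two write paths differ — just a sketch, not the actual Metorikku source; as far as I can tell, only the format string (plus the explicit driver option) changes:

// Plain Spark JDBC sink -- what I believe Metorikku's JDBC output builds:
df.write
  .format("jdbc")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", url)
  .option("dbtable", tableName)
  .option("user", username)
  .option("password", password)
  .save()

// Microsoft connector sink -- what I'd like to end up with:
df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .option("url", url)
  .option("dbtable", tableName)
  .option("user", username)
  .option("password", password)
  .save()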
Thank you