Closed — ramyerrabotu closed this issue 3 years ago
It's possible to share an external Hive metastore between HDInsight or on-prem Hive and Databricks / Synapse.
Here are the steps:
```scala
// Run in a Databricks notebook: creates a cluster init script that points
// the cluster at the external Hive metastore.
// Replace <server-name>, <database-name>, <user>, and <password> with the
// values for your Azure SQL metastore database.
dbutils.fs.mkdirs("dbfs:/databricks/init/")
dbutils.fs.put(
  "/databricks/init/external-metastore.sh",
  """#!/bin/sh
    |# Loads environment variables to determine the correct JDBC driver to use.
    |source /etc/environment
    |# Quoting the label (i.e. EOF) with single quotes disables variable interpolation.
    |cat << 'EOF' > /databricks/driver/conf/00-custom-spark.conf
    |[driver] {
    |  # Hive-specific configuration options for metastores in local mode.
    |  "spark.hadoop.javax.jdo.option.ConnectionURL" = "jdbc:sqlserver://<server-name>.database.windows.net:1433;database=<database-name>;encrypt=true;trustServerCertificate=true;create=false;loginTimeout=300"
    |  "spark.hadoop.javax.jdo.option.ConnectionUserName" = "<user>"
    |  "spark.hadoop.javax.jdo.option.ConnectionPassword" = "<password>"
    |  "hive.metastore.schema.verification.record.version" = "true"
    |  "spark.sql.hive.metastore.jars" = "maven"
    |  "hive.metastore.schema.verification" = "true"
    |  "spark.sql.hive.metastore.version" = "2.1.1"
    |EOF
    |# Append the JDBC driver separately, since variable expansion may be needed
    |# to choose the correct driver version.
    |cat << EOF >> /databricks/driver/conf/00-custom-spark.conf
    |  "spark.hadoop.javax.jdo.option.ConnectionDriverName" = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    |}
    |EOF
    |""".stripMargin,
  overwrite = true
)
```
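The two heredoc labels in the init script behave differently on purpose: a quoted label (`'EOF'`) disables variable interpolation so the config text is written literally, while an unquoted label (`EOF`) lets the shell expand variables. A minimal sketch of that difference (the `DRIVER` variable and `/tmp` paths here are illustrative, not part of the actual script):

```shell
#!/bin/sh
# Hypothetical variable standing in for anything the init script might expand.
DRIVER="com.microsoft.sqlserver.jdbc.SQLServerDriver"

# Quoted label: $DRIVER is written literally, no expansion.
cat << 'EOF' > /tmp/quoted.conf
"ConnectionDriverName" = "$DRIVER"
EOF

# Unquoted label: $DRIVER is expanded to its value.
cat << EOF > /tmp/unquoted.conf
"ConnectionDriverName" = "$DRIVER"
EOF

cat /tmp/quoted.conf    # "ConnectionDriverName" = "$DRIVER"
cat /tmp/unquoted.conf  # "ConnectionDriverName" = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
```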
Customers expect a common metastore database shared by Hive and Databricks, since they want to run some workloads on Hive and others on Databricks.