Azure-Samples / Synapse

Samples for Azure Synapse Analytics
MIT License

Movefromsnowflake #113

Closed mlevin19 closed 3 years ago

mlevin19 commented 3 years ago

Added a Scala notebook that migrates a Snowflake schema to a Synapse dedicated SQL pool. ADF, as well as a Synapse pipeline, requires multiple steps to get data from a Snowflake table into a table in a Synapse dedicated SQL pool: you first copy the data to Blob storage and then load the files from Blob storage into the Synapse table. This notebook takes a simpler approach, reading each Snowflake table in a given schema into a Spark DataFrame and then writing that DataFrame into a Synapse SQL dedicated pool table. It takes the schema name as a parameter, providing the ability to move all tables from a given schema to the SQL dedicated pool schema. The stored procedure that creates the required database objects is provided in the comments.
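The read-then-write approach described above can be sketched as follows. This is an illustrative outline, not the notebook's exact code: the connection option values, table names, and the `sfschema` parameter name are placeholders, and the `synapsesql` writer assumes the Azure Synapse Spark connector available in Synapse Spark pools (plus the spark-snowflake and snowflake-jdbc jars mentioned in the review below).

```scala
// Sketch of the notebook's approach (assumed names/options, for illustration).
// Requires spark-snowflake_2.12-2.9.0-spark_3.1.jar and snowflake-jdbc-3.13.6.jar
// on the Spark pool, and runs inside a Synapse Spark session where `spark`
// and the synapsesql writer are available.
import org.apache.spark.sql.DataFrame

val sfschema = "PUBLIC" // Snowflake schema to migrate (assumed parameter name)

// Snowflake connection options for the Spark connector.
// Placeholders only -- credentials should come from Azure Key Vault, not literals.
val sfOptions = Map(
  "sfURL"       -> "<account>.snowflakecomputing.com",
  "sfUser"      -> "<user>",
  "sfPassword"  -> "<password>",
  "sfDatabase"  -> "<database>",
  "sfSchema"    -> sfschema,
  "sfWarehouse" -> "<warehouse>"
)

// Read one Snowflake table from the given schema into a Spark DataFrame.
val df: DataFrame = spark.read
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("dbtable", "MY_TABLE")
  .load()

// Write the DataFrame directly into a dedicated SQL pool table,
// skipping the intermediate Blob-storage copy an ADF pipeline would need.
df.write.synapsesql("mydedicatedpool.dbo.MY_TABLE")
```

In the actual notebook this read/write pair would run in a loop over the tables of `sfschema`, which is what makes the whole-schema migration a single parameterized operation.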

ghost commented 3 years ago

CLA assistant check
All CLA requirements met.

ruixinxu commented 3 years ago

Hi @mlevin19, thanks for sending the PR. Here are some suggestions to improve it.

  1. cell 1: Please add links to download spark-snowflake_2.12-2.9.0-spark_3.1.jar and snowflake-jdbc-3.13.6.jar.
  2. cell 1: Please add a link to the instructions for adding custom jars to cluster/session packages.
  3. cell 2: Please change the schema value to a parameter the user needs to specify, e.g. val sfschema = "".
  4. cell 3: Please add a link to the instructions for configuring Azure Key Vault.
  5. cell 3: Please change the getSecret inputs to parameters the user needs to specify, e.g. mssparkutils.credentials.getSecret("azure key vault name","secret name","linked service name").
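Taken together, suggestions 3 and 5 turn the hard-coded values into user-supplied parameters, something like the following sketch (the variable names are assumptions; the getSecret call is the one quoted in the review, with its three placeholder arguments left for the user to fill in):

```scala
// Cell 2: schema to migrate, left empty for the user to specify.
val sfschema = ""

// Cell 3: fetch the Snowflake credential from Azure Key Vault via the
// Synapse linked service, rather than embedding it in the notebook.
// All three arguments are placeholders the user replaces with their own values.
val sfPassword = mssparkutils.credentials.getSecret(
  "azure key vault name",
  "secret name",
  "linked service name")
```

This keeps secrets out of the notebook source and lets the same notebook be reused against any schema without editing the cells' logic.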
mlevin19 commented 3 years ago

Hi @ruixinxu, I added all your recommendations. Please review and let me know if anything needs to be changed/corrected. Thanks!