Mimetis / ProjectY

Project Y is a straightforward Landing Zones automated deployment tool dedicated to data processing.

Synapse Option #7

Mimetis opened this issue 3 years ago (status: open)

Mimetis commented 3 years ago

Idea

Add the option to deploy an engine using Synapse instead of Databricks / ADF.

Today

For now, we only have the option to deploy an engine using Databricks:

[screenshot: engine deployment options, currently showing Databricks as the only choice]

Expectation

Have the same level of integration as with Databricks, but using Synapse.

mariekekortsmit commented 3 years ago

I spent some time looking into a manual deployment of Synapse in this context. Here are a few findings that might be useful for porting the notebooks used in Databricks (main.ipynb and common.ipynb) over to Synapse:

Keyvault connection:

In Databricks/common.ipynb you fetch the secret for the service principal with `client_secret = dbutils.secrets.get(keyvault, "clientsecret")`. In Synapse, you first need to add your Key Vault as a linked service; afterwards, in Synapse/common.ipynb, you can do the same with `client_secret = TokenLibrary.getSecret("kvengzxq4fl", "clientsecret")`.
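For reference, a minimal side-by-side sketch of the two secret lookups. The Key Vault name `kvengzxq4fl` is the one from the example above, and `mssparkutils.credentials.getSecret` is shown as a PySpark-friendly alternative to TokenLibrary (assuming the Key Vault is reachable through the linked service / workspace identity):

```python
# Databricks (common.ipynb): the secret comes from a secret scope
# backed by the Key Vault.
client_secret = dbutils.secrets.get(keyvault, "clientsecret")

# Synapse (common.ipynb): the Key Vault must be registered as a linked
# service first; the secret can then be read with TokenLibrary or,
# equivalently, with mssparkutils ("kvengzxq4fl" is the vault name from
# this example deployment).
from notebookutils import mssparkutils
client_secret = mssparkutils.credentials.getSecret("kvengzxq4fl", "clientsecret")
```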

Connecting to the ADLS:

The following lines in Databricks/common.ipynb should be obsolete in Synapse/common.ipynb, because in the Synapse case you want to use the ADLS account that is the Synapse workspace's default storage.

```python
accountName = engine["storageName"]            # from engine.storageName
accountKey = "dsLake-" + engine["storageName"] # name of the secret holding the account key

# Get the secret value
accountKeyValue = dbutils.secrets.get(keyvault, accountKey)

# Set the token for accessing the input and output paths
spark.conf.set("fs.azure.account.key." + accountName + ".dfs.core.windows.net", accountKeyValue)
```
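By contrast, a minimal sketch of what reading from the workspace default storage in Synapse could look like: no account key is set in the Spark conf, since access to the primary ADLS Gen2 account is resolved through the workspace itself (the container, account, and path below are placeholders):

```python
# Synapse resolves access to the workspace's default ADLS Gen2 account
# through its linked service / managed identity, so data can be addressed
# directly by path (placeholders: <container>, <storageaccount>).
df = spark.read.parquet(
    "abfss://<container>@<storageaccount>.dfs.core.windows.net/input/data.parquet"
)
```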

Running Synapse/common.ipynb from Synapse/main.ipynb

In Databricks/main.ipynb you run the common notebook with `%run "./common"`. According to the documentation (https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks?tabs=preview#notebook-reference), you should be able to use `%run` in Synapse as well; however, the documentation also shows a caveat: [screenshot of a documentation note on %run limitations], which leads me to believe this will not work when the notebook is used in a pipeline (which is desired here).

Running a notebook from another notebook in Synapse does work when you use `mssparkutils.notebook.run("common")`. To check that it ran, you could use the following in Synapse/main.ipynb:

```python
# Returns the value passed to mssparkutils.notebook.exit() in common.ipynb
exitVal = mssparkutils.notebook.run("common")
print(exitVal)
```

When you add something like `mssparkutils.notebook.exit("Execution of common notebook is finished")` to the last cell of Synapse/common.ipynb, you can confirm that the notebook was executed. However, the functions defined in Synapse/common.ipynb are not available from Synapse/main.ipynb afterwards, so it seems we don't get the context of that notebook back; see the sketch below.
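A minimal sketch of the behavior described above (the function name `helper` is hypothetical, used only to illustrate the point):

```python
# --- last cell of Synapse/common.ipynb ---
def helper():
    return 42

mssparkutils.notebook.exit("Execution of common notebook is finished")

# --- in Synapse/main.ipynb ---
exitVal = mssparkutils.notebook.run("common")
print(exitVal)   # prints the exit message, so common did run
helper()         # NameError: definitions from common are not available here
```

Presumably this is because `mssparkutils.notebook.run` executes the referenced notebook as a separate run rather than inlining its cells into the calling notebook the way `%run` does.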

Proposed workaround: