Unsupported Type: Apache Spark pools

Azure-Player / azure.synapse.tools

PowerShell module to deploy Synapse workspace (and more) in Microsoft Azure.

MIT License


Unsupported Type: Apache Spark pools #11

Closed DaveRiddell closed 2 years ago

DaveRiddell commented 2 years ago

An error occurs when deploying notebooks that have a Spark pool defined. Sample notebook JSON:

{
    "name": "my_notebook",
    "properties": {
        "folder": {
            "name": "my_folder"
        },
        "nbformat": 4,
        "nbformat_minor": 2,
        "bigDataPool": {
            "referenceName": "sspklbdp01",
            "type": "BigDataPoolReference"
        },
            ....

Returns error:

VERBOSE: Analyzing notebook dependencies...
VERBOSE: Folder: D:\a\1\b\synapse_deploy\notebook
VERBOSE: - my_notebook.json
##[debug]Error record:
##[debug]
##[debug]Exception: C:\Users\VssAdministrator\Documents\PowerShell\Modules\azure.synapse.tools\0.18.0\private\Import-SynapseObjects.ps1:20
##[debug]Line |
##[debug]  20 |      Get-ChildItem "$folder" -Filter "*.json" | Where-Object { !$_.Nam …
##[debug]     |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##[debug]     | ASWT0029: Unknown object type: BigDataPool.

azure.synapse.tools Version 0.18.000

NowinskiK commented 2 years ago

BigDataPool = Apache Spark pools. README doc updated.

NowinskiK commented 2 years ago

For some reason, an Apache Spark pool created in Azure Synapse:

IS NOT available in the repository, but notebooks have references to it
IS available in the ARM Template file generated in the "workspace_publish" branch when publishing

I sent an enquiry to the Microsoft Product Group.

lidroz commented 2 years ago

Same problem here. I assume the error message is thrown by !SynapseObject.class.ps1 because BigDataPool is not listed in $AllowedTypes. Can we include BigDataPool in $AllowedTypes, even though I know Spark pools are not deployed? As long as we can pass this check, I can ignore the broken dependency with IgnoreLackOfReferencedObject = $true.

P.S. You're the best, Kamil! You saved me days with azure.datafactory.tools, and azure.synapse.tools is right on time again :)

lidroz commented 2 years ago

My local workaround, which lets me deploy notebooks with a default Spark pool when a pool with the same name already exists in the destination workspace, is:

adding 'BigDataPool' to $AllowedTypes in !SynapseObject.class.ps1
adding the following snippet to the switch statement in Get-SynapseObjectByName.ps1:

    'BigDataPool' {
        $r = New-Object -TypeName SynapseObject
        $r.Name = $name
        $r.Type = "BigDataPool"
        $r.Deployed = $true
    }

The notebook gets deployed to the destination workspace, but the default Spark pool is not set, because the id and endpoint in a365ComputeOptions in the notebook still have to be changed to point to the Spark pool in the destination workspace. That could be fixed with an environment config file.
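
Until that is handled by a config file, the id and endpoint change could also be scripted as a step before deployment. Below is a minimal sketch, not part of azure.synapse.tools: the function name Set-NotebookSparkPool and the placeholder values are hypothetical, and it assumes a365ComputeOptions sits under properties.metadata in the notebook JSON and already has id and endpoint properties.

```powershell
# Hypothetical helper (not part of azure.synapse.tools): repoint a notebook's
# Spark pool settings to the pool that exists in the destination workspace.
function Set-NotebookSparkPool {
    param(
        [Parameter(Mandatory)] [string] $NotebookJson,  # raw notebook JSON text
        [Parameter(Mandatory)] [string] $PoolName,
        [Parameter(Mandatory)] [string] $PoolId,
        [Parameter(Mandatory)] [string] $Endpoint
    )

    $nb = $NotebookJson | ConvertFrom-Json

    # Pool reference as shown in the issue: properties.bigDataPool.referenceName
    $nb.properties.bigDataPool.referenceName = $PoolName

    # ASSUMPTION: a365ComputeOptions lives under properties.metadata and
    # already contains 'id' and 'endpoint'. Skipped when the node is absent.
    $opts = $nb.properties.metadata.a365ComputeOptions
    if ($null -ne $opts) {
        $opts.id = $PoolId
        $opts.endpoint = $Endpoint
    }

    # ConvertTo-Json only serializes 2 levels by default, so raise -Depth
    # to keep the nested notebook structure intact.
    return ($nb | ConvertTo-Json -Depth 100)
}

# Usage with a trimmed-down notebook; all values below are placeholders.
$sample = @'
{
  "name": "my_notebook",
  "properties": {
    "bigDataPool": { "referenceName": "sspklbdp01", "type": "BigDataPoolReference" },
    "metadata": {
      "a365ComputeOptions": {
        "id": "<source-pool-resource-id>",
        "endpoint": "<source-pool-endpoint>"
      }
    }
  }
}
'@

$patched = Set-NotebookSparkPool -NotebookJson $sample `
    -PoolName 'target_pool' `
    -PoolId '<target-pool-resource-id>' `
    -Endpoint '<target-pool-endpoint>'
```

For real files, the same function could be fed with Get-Content -Raw and its result written back with Set-Content before running the deployment. Rewriting the JSON up front keeps the module itself untouched, at the cost of maintaining the per-environment values outside its config mechanism.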