liprec / vsts-publish-adf

This extension adds Azure Data Factory release tasks to Azure Pipelines.

##[error]Error deploying 'adla_retaillink_ls.json' ('Value' cannot be null.) #15

patpicos closed this issue 4 years ago

patpicos commented 6 years ago

Hi liprec

When deploying ADF artifacts from a GitHub repo, I get the following error:

2018-10-02T01:19:59.7270721Z ##[section]Starting: ADF Deploy JSON files to $(datafactory)
2018-10-02T01:19:59.7275505Z ==============================================================================
2018-10-02T01:19:59.7275634Z Task         : Azure Data Factory Deployment
2018-10-02T01:19:59.7275757Z Description  : Deploy Azure Data Factory Datasets, Pipelines and/or Linked Services using JSON files
2018-10-02T01:19:59.7275848Z Version      : 1.1.11
2018-10-02T01:19:59.7275912Z Author       : Jan Pieter Posthuma
2018-10-02T01:19:59.7276000Z Help         : [More Information](https://github.com/liprec/vsts-publish-adf)
2018-10-02T01:19:59.7276081Z ==============================================================================
2018-10-02T01:20:05.6474402Z ##[command]Import-Module -Name C:\Program Files\WindowsPowerShell\Modules\AzureRM\2.1.0\AzureRM.psd1 -Global
2018-10-02T01:20:10.1393756Z ##[warning]The names of some imported commands from the module 'AzureRM.Websites' include unapproved verbs that might make them less discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the Verbose parameter. For a list of approved verbs, type Get-Verb.
2018-10-02T01:20:10.1670681Z ##[warning]The names of some imported commands from the module 'AzureRM' include unapproved verbs that might make them less discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the Verbose parameter. For a list of approved verbs, type Get-Verb.
2018-10-02T01:20:10.2353547Z ##[command]Import-Module -Name C:\Program Files\WindowsPowerShell\Modules\AzureRM.profile\2.1.0\AzureRM.profile.psd1 -Global
2018-10-02T01:20:10.3036133Z ##[command]Add-AzureRMAccount -ServicePrincipal -Tenant *** -Credential System.Management.Automation.PSCredential -Environment AzureCloud
2018-10-02T01:20:12.4787770Z ##[command]Select-AzureRMSubscription -SubscriptionId 1824bc1e-b99a-4dab-9a84-b0d5f05f83c7 -TenantId ***
2018-10-02T01:20:16.3696856Z Cleared all existing Trigger
2018-10-02T01:20:17.8295335Z Cleared all existing Pipeline
2018-10-02T01:20:19.4243194Z Cleared all existing Dataset
2018-10-02T01:20:20.8037971Z Cleared all existing Linked Service
2018-10-02T01:20:20.8195900Z Start deploying Linked Service
2018-10-02T01:20:20.8222896Z Found 8 Linked Service files
2018-10-02T01:20:21.0838768Z ##[error]Error deploying 'adla_retaillink_ls.json' ('Value' cannot be null.)
2018-10-02T01:20:21.1218503Z ##[section]Finishing: ADF Deploy JSON files to $(datafactory)

Looking at the contents of the ADLA linked service, I don't see anything wrong. Should there be a more informative error to assist validation/research? (Keep in mind I do replace some of the values using a JSON Patch in my build pipeline.) I've masked some of the sensitive values with **** (prefix only):

{
    "name": "adla_retaillink_ls",
    "properties": {
        "type": "AzureDataLakeAnalytics",
        "typeProperties": {
            "accountName": "azueus2devadlaretaillink",
            "servicePrincipalId": "*****4-c442-4e4b-b0ba-6c31be2d657d",
            "servicePrincipalKey": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "kvlt_dataplatform_ls",
                    "type": "LinkedServiceReference"
                },
                "secretName": "azueus2-adla-adladatalake-spnkey"
            },
            "tenant": "*******-d7f2-403e-b764-0dbdcf0505f6",
            "subscriptionId": "*******-b99a-4dab-9a84-b0d5f05f83c7",
            "resourceGroupName": "azu-eus2-dev-rg-IngestRetailLink"
        }
    }
}

Is the deploy doing a check to see if the keys exist in Key Vault?
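To rule out a missing secret on my side, something like the sketch below could be run before the release. The vault name is a placeholder (the referenceName in the JSON points at the Key Vault linked service, not the vault itself), and nothing suggests the task itself performs this check.

```powershell
# Hypothetical pre-flight check using the AzureRM.KeyVault module:
# verify the secret referenced by the linked service actually exists.
# 'kvlt-dataplatform' is a made-up vault name for illustration.
$secret = Get-AzureKeyVaultSecret -VaultName 'kvlt-dataplatform' `
    -Name 'azueus2-adla-adladatalake-spnkey' -ErrorAction SilentlyContinue
if (-not $secret) {
    Write-Warning "Referenced Key Vault secret not found; deployment may fail."
}
```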

patpicos commented 6 years ago

Further details on the issue: when deploying linked services, there is an order in which they must be deployed. Since the plugin deploys linked services in alphabetical order, the ADLA linked service is deployed before the Key Vault one.

However, the ADLA linked service references the Key Vault linked service and therefore fails to deploy.

This order succeeds:

Set-AzureRmDataFactoryV2LinkedService -ResourceGroupName "azu-eus2-dev-rg-IngestRetailLink" -DataFactoryName "azu-eus2-dev-df-IngestRetailLink-dev" -Name "kvlt_dataplatform_ls" -File kvlt_dataplatform_ls.json
Set-AzureRmDataFactoryV2LinkedService -ResourceGroupName "azu-eus2-dev-rg-IngestRetailLink" -DataFactoryName "azu-eus2-dev-df-IngestRetailLink-dev" -Name "adla_retaillink_ls" -File adla_retaillink_ls.json

This order fails:
Set-AzureRmDataFactoryV2LinkedService -ResourceGroupName "azu-eus2-dev-rg-IngestRetailLink" -DataFactoryName "azu-eus2-dev-df-IngestRetailLink-dev" -Name "adla_retaillink_ls" -File adla_retaillink_ls.json
Set-AzureRmDataFactoryV2LinkedService -ResourceGroupName "azu-eus2-dev-rg-IngestRetailLink" -DataFactoryName "azu-eus2-dev-df-IngestRetailLink-dev" -Name "kvlt_dataplatform_ls" -File kvlt_dataplatform_ls.json

Recommendation: parse the list of linked services and find any objects referenced via a referenceName property; those referenced services need to be deployed first. A generic sketch of that idea follows below.

I assume similar gymnastics are required for other object types.
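Sketching the idea more generically than the hard-coded property path in my script further down: recursively scan each definition for anything shaped like a LinkedServiceReference (assuming dependencies always appear as a referenceName next to type = LinkedServiceReference) and deploy zero-reference services first. This gives a coarse two-wave ordering, not a full topological sort.

```powershell
# Sketch: recursively collect every LinkedServiceReference in a parsed definition.
function Get-LinkedServiceReferences {
    param($Node)
    $refs = @()
    if ($Node -is [System.Collections.IEnumerable] -and $Node -isnot [string]) {
        # Arrays: recurse into each element.
        foreach ($item in $Node) { $refs += Get-LinkedServiceReferences $item }
    } elseif ($Node -is [System.Management.Automation.PSCustomObject]) {
        # Objects: record a reference if present, then recurse into all properties.
        if ($Node.type -eq 'LinkedServiceReference' -and $Node.referenceName) {
            $refs += $Node.referenceName
        }
        foreach ($prop in $Node.PSObject.Properties) {
            $refs += Get-LinkedServiceReferences $prop.Value
        }
    }
    return $refs
}

# Order files so that services with no references come first.
$files = Get-ChildItem .\linkedService -Filter *.json
$files | Sort-Object {
    (Get-LinkedServiceReferences (Get-Content $_.FullName -Raw | ConvertFrom-Json)).Count
} | Select-Object -ExpandProperty FullName
```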

liprec commented 6 years ago

At this moment the task does not look for any dependencies in the JSON files to determine the order. The easiest way to work around this is to alter the file names, e.g. by adding numeric prefixes, or to use multiple tasks. You can also try the new V2 version of the task: it uses the name inside the JSON (fix for #12) instead of the filename, so you can force a fixed order by renaming the files. Also set the parallel option to 1 to make sure the order stays fixed.
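A minimal sketch of that renaming workaround (the order list is just an assumed example; since V2 reads the object name from inside the JSON, renaming the files does not change the deployed names):

```powershell
# Sketch: prefix files so file-name sorting yields the desired deployment order.
# The list is an assumed example: dependencies (Key Vault) go first.
$deployOrder = @('kvlt_dataplatform_ls.json', 'adla_retaillink_ls.json')
$i = 1
foreach ($name in $deployOrder) {
    $path = Join-Path '.\linkedService' $name
    if (Test-Path $path) {
        # e.g. kvlt_dataplatform_ls.json -> 01_kvlt_dataplatform_ls.json
        Rename-Item -Path $path -NewName ('{0:D2}_{1}' -f $i, $name)
        $i++
    }
}
```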

patpicos commented 6 years ago

Tried V2 and it's a step forward...

Now I have one pipeline giving me grief, which I'll need to investigate. FYI: I did not need to rename the files!

2018-10-02T11:18:08.8252137Z ==============================================================================
2018-10-02T11:18:08.8252273Z Task         : Azure Data Factory Deployment
2018-10-02T11:18:08.8252350Z Description  : Deploy Azure Data Factory Datasets, Pipelines and/or Linked Services using JSON files
2018-10-02T11:18:08.8252440Z Version      : 2.0.3
2018-10-02T11:18:08.8252503Z Author       : Jan Pieter Posthuma
2018-10-02T11:18:08.8252763Z Help         : More Information
2018-10-02T11:18:08.8252829Z ==============================================================================
2018-10-02T11:18:10.9790266Z Found 8 linked service(s) definitions.
2018-10-02T11:18:10.9824838Z Deploy linked service 'adla_retaillink_ls'.
2018-10-02T11:18:10.9838056Z Deploy linked service 'adls_datalake_ls'.
2018-10-02T11:18:10.9846390Z Deploy linked service 'azuredatabricks_retaillink_ls'.
2018-10-02T11:18:10.9854257Z Deploy linked service 'kvlt_dataplatform_ls'.
2018-10-02T11:18:10.9901424Z Deploy linked service 'sqldb_batch_control_ls'.
2018-10-02T11:18:12.3879762Z Deploy linked service 'wasb_adlsrccode_ls'.
2018-10-02T11:18:12.3918458Z Deploy linked service 'wasb_retaillink_ls'.
2018-10-02T11:18:12.4152019Z Deploy linked service 'wasb_walmart_syn_blob_storage_ls'.
2018-10-02T11:18:13.7374600Z Found 9 dataset(s) definitions.
2018-10-02T11:18:13.7388008Z Deploy dataset 'adls_dynamic_retail_unzip_loc_sink_ds'.
2018-10-02T11:18:13.7396949Z Deploy dataset 'adls_file_level_stat_src_ds'.
2018-10-02T11:18:13.7404143Z Deploy dataset 'adls_walmart_catchup_batchid_list'.
2018-10-02T11:18:13.7409232Z Deploy dataset 'adls_walmart_historical_client_list'.
2018-10-02T11:18:13.7418580Z Deploy dataset 'azsqldb_batch_ctrl_src_ds'.
2018-10-02T11:18:14.7970628Z Deploy dataset 'azsqldb_file_level_stat_sink_ds'.
2018-10-02T11:18:14.9951797Z Deploy dataset 'wasb_common_epos_paramfile_src_ds'.
2018-10-02T11:18:15.0107940Z Deploy dataset 'wasb_dynamic_retail_hist_zip_src_ds'.
2018-10-02T11:18:15.0735187Z Deploy dataset 'wasb_dynamic_retail_zip_src_ds'.
2018-10-02T11:18:16.2906131Z Found 14 pipeline(s) definitions.
2018-10-02T11:18:16.2935140Z Deploy pipeline 'data_vault_hist_loop_pl'.
2018-10-02T11:18:16.2941519Z Deploy pipeline 'data_vault_hist_pl'.
2018-10-02T11:18:16.2946820Z Deploy pipeline 'data_vault_pl'.
2018-10-02T11:18:16.2953591Z Deploy pipeline 'raw_dynamic_retail_catchup_batches_copy_loop_pl'.
2018-10-02T11:18:16.2957888Z Deploy pipeline 'raw_dynamic_retail_catchup_copy_zip_pl'.
2018-10-02T11:18:17.3564356Z Deploy pipeline 'raw_dynamic_retail_catchup_metadata_copy_loop_pl'.
2018-10-02T11:18:17.4029182Z Deploy pipeline 'raw_dynamic_retail_catchup_usql_zip_loop_pl'.
2018-10-02T11:18:17.4051572Z ##[error]Error deploying 'raw_dynamic_retail_catchup_batches_copy_loop_pl' pipeline : {"error":{"code":"BadRequest","message":"The document creation or update failed because of invalid reference 'raw_dynamic_retail_catchup_metadata_copy_loop_pl'.","target":"/subscriptions/*****raw_dynamic_retail_catchup_batches_copy_loop_pl","details":null}}
2018-10-02T11:18:17.4103423Z Deploy pipeline 'raw_dynamic_retail_catchup_usql_zip_pl'.
2018-10-02T11:18:17.4379382Z Deploy pipeline 'raw_dynamic_retail_catchup_zip_bckup_pl'.
2018-10-02T11:18:18.4035946Z Deploy pipeline 'raw_dynamic_retail_hist_zip_loop_pl'.
2018-10-02T11:18:18.4448196Z Deploy pipeline 'raw_dynamic_retail_hist_zip_pl'.
2018-10-02T11:18:18.4555307Z Deploy pipeline 'raw_dynamic_retail_zip_pl'.
2018-10-02T11:18:18.4846894Z Deploy pipeline 'retail_link_catchup_master_pl'.
2018-10-02T11:18:18.7988490Z Deploy pipeline 'retail_link_master_pl'.
2018-10-02T11:18:19.8832548Z ##[section]Finishing: ADF Deploy JSON files to $(datafactory)

patpicos commented 6 years ago

I think I may have something that will help sequence object deployments for both linked services and pipelines. Basically, I parse the list of objects to see if they have dependencies; if they do, I push them later in the deployment (i.e. everything without dependencies is deployed first).

See the snippet below and output:

Script:

#LinkedServices
Write-Host "LinkedServices Deployment Order" -ForegroundColor Red
$files = Get-ChildItem .\linkedService -Filter *.json
$first = @()
$second = @()

foreach ($f in $files) {
    $content = Get-Content $f.FullName | ConvertFrom-Json
    if ($content | Where-Object { $_.properties.typeProperties.servicePrincipalKey.store.referenceName }) {
        # Has a Key Vault reference: deploy in the second wave
        $second += $f.FullName
    } else {
        # No dependency: deploy in the first wave
        $first += $f.FullName
    }
}
Write-Host "Deploy First:" -ForegroundColor Yellow
$first
Write-Host "Deploy Second:" -ForegroundColor Yellow
$second

#Pipelines
Write-Host ""
Write-Host "Pipeline Deployment Order" -ForegroundColor Red
$files = Get-ChildItem .\pipeline -Filter *.json

$first = @()
$second = @()
$third = @()
foreach ($f in $files) {
    $content = Get-Content $f.FullName | ConvertFrom-Json
    if ($content.properties.activities | Where-Object { $_.type -eq "ExecutePipeline" }) {
        # Executes another pipeline directly: deploy in the second wave
        $second += $f.FullName
    } elseif ($content.properties.activities | Where-Object { $_.type -eq "ForEach" } | Where-Object { $_.typeProperties.activities.type -eq "ExecutePipeline" }) {
        # Executes another pipeline inside a ForEach: deploy in the third wave
        Write-Host "Found a Foreach with pipeline in $f"
        $third += $f.FullName
    } else {
        # No pipeline dependency: deploy in the first wave
        $first += $f.FullName
    }
}
Write-Host "Deploy First:" -ForegroundColor Yellow
$first
Write-Host "Deploy Second:" -ForegroundColor Yellow
$second
Write-Host "Deploy Third:" -ForegroundColor Yellow
$third

Output:

LinkedServices Deployment Order
Deploy First:
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\azuredatabricks_retaillink_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\kvlt_application_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\kvlt_dataplatform_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\sqldb_batch_control_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\wasb_adlsrccode_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\wasb_retaillink_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\wasb_walmart_syn_blob_storage_ls.json
Deploy Second:
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\adla_retaillink_ls.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\linkedService\adls_datalake_ls.json

Pipeline Deployment Order
Found a Foreach with pipeline in data_vault_hist_loop_pl.json
Found a Foreach with pipeline in raw_dynamic_retail_catchup_batches_copy_loop_pl.json
Found a Foreach with pipeline in raw_dynamic_retail_catchup_metadata_copy_loop_pl.json
Found a Foreach with pipeline in raw_dynamic_retail_catchup_usql_zip_loop_pl.json
Found a Foreach with pipeline in raw_dynamic_retail_hist_zip_loop_pl.json
Deploy First:
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\create_data_vault_ddl_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\data_vault_hist_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\data_vault_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_copy_zip_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_usql_zip_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_zip_bckup_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_hist_zip_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_zip_pl.json
Deploy Second:
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\retail_link_catchup_master_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\retail_link_master_pl.json
Deploy Third:
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\data_vault_hist_loop_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_batches_copy_loop_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_metadata_copy_loop_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_catchup_usql_zip_loop_pl.json
C:\Users\Patrick.Picard\Documents\VSCode\ADF_Debug\pipeline\raw_dynamic_retail_hist_zip_loop_pl.json
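To close the loop, the computed waves could be fed straight into the deployment cmdlet I used earlier. A sketch, reusing the $first/$second lists from the linked service section and the resource group and factory names from my commands above:

```powershell
# Sketch: deploy linked services wave by wave in the computed order.
foreach ($file in @($first) + @($second)) {
    # The ADF object name is taken from the file name here.
    $name = [System.IO.Path]::GetFileNameWithoutExtension($file)
    Set-AzureRmDataFactoryV2LinkedService `
        -ResourceGroupName "azu-eus2-dev-rg-IngestRetailLink" `
        -DataFactoryName "azu-eus2-dev-df-IngestRetailLink-dev" `
        -Name $name -File $file -Force
}
```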
liprec commented 6 years ago

My idea behind the new version (V2) of this task is to minimize the dependency of the task code on new ADF features. My trigger was the introduction of folders in ADF, which the PowerShell cmdlets didn't support yet. So the new version only uses the ADF API endpoints and pushes the definitions to them; in short, the tasks are 'stupid' and don't (need to) know the content of the definitions. If I add support for detecting dependencies, I am back at the point where every new kind of dependency introduced in ADF forces me to update the tasks to support it.

So I opt to keep it this way and let the developer determine the release sequence, e.g. by adding 01_, 02_ prefixes to the files to force a release order.

rvvincelli commented 5 years ago

Stumbled upon this as well. Conceptually, being forced to rename a unit to enforce a specific deploy-time behavior is incorrect. But such issues are mitigated by the fact that if the missing reference gets deployed during the first, failing release, then the second release finds it and works.

liprec commented 5 years ago

@rvvincelli just for some extra info: you only have to alter the filenames of the definitions; you can keep the ADF object names the same. The tasks use file system sorting to determine the order.
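For example, you can preview the order the task will use, under the assumption that its enumeration matches a plain name sort:

```powershell
# Preview the deployment order implied by file-name sorting.
Get-ChildItem .\linkedService -Filter *.json |
    Sort-Object Name |
    Select-Object -ExpandProperty Name
```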

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.