Azure / bicep-types-az

Bicep type definitions for ARM resources

Second HCI deployment breaks the VM network interface #2246

Open TiTi opened 1 month ago

TiTi commented 1 month ago

Bicep version
Bicep CLI version 0.29.47 (132ade51bc)

Describe the bug
I'm deploying an Azure Stack HCI VM through Bicep. The VM creation works: the VM is running and reachable (e.g. RDP works). If I re-execute the deployment (az deployment group) with the same code and parameters, the VM networking is broken.

If I check the VM properties in Hyper-V, the network card has disappeared. However, I still see the NIC object in the Azure portal, and it is still listed under "connected devices" of the logical network.

Note that the VM has only one network card and we use static IP assignment.

I think this is more an ARM issue than a Bicep issue in itself, but I didn't know where to report it.

Azure Stack HCI version : 23H2 (brand new cluster)

To Reproduce
Steps to reproduce the behavior:

vm.bicep:

metadata description = 'Creates a single hci virtual machine'

// ------------------------------
// Essentials

@description('The subscription id where the HCI cluster is')
param hciSubscriptionId string

@description('The resource group name where the HCI cluster is')
param hciResourceGroupName string

@description('The azure region for all resources')
param location string = resourceGroup().location

@description('The name of the custom location to use for the deployment. This name is specified during the deployment of the Azure Stack HCI cluster and can be found on the Azure Stack HCI cluster resource Overview in the Azure portal.')
param customLocationName string

@description('Tags to add')
param tags object = {}

// ------------------------------
// VM

@description('The VM name')
param name string

@description('The number of vCPUs')
param vCPUCount int

@description('The RAM size in MB')
param memoryMB int

@description('Security Type for the virtual machines')
@allowed([
  'Standard'
  'TrustedLaunch'
])
param securityType string = 'TrustedLaunch'

@description('OS type')
@allowed([
  'Windows'
  'Linux'
])
param osType string = 'Windows'

// ------------------------------
// Credentials

@description('User that will be local admin')
param adminUserName string = 'azadmin'

@description('Password of the local admin')
@secure()
param adminPassword string

// ------------------------------
// Networking

type nicType = {
  @description('The private ip to set, empty = dynamic assignment')
  privateIp: string?
  @description('The hci logical network name')
  lnetName: string
}

@description('Network interfaces properties')
@minLength(1)
param nics nicType[]

// ------------------------------
// Domain join

@description('Domain to join')
param domain string

@description('OU group where virtual machines will be placed in Active Directory')
@allowed([
  'Europe'
  'Middle East Russia Africa'
  'South America'
  'North America'
  'Asia Pacific'
  'DC'
  ''
])
param ouGroup string

@description('Login of user')
param domainAdminUserName string

@description('Password of user')
@secure()
param domainAdminPassword string

// ------------------------------
// Disks

type dataDiskType = {
  size: int // in GB
  dynamic: bool?
}

@description('The data disks properties')
param dataDisks dataDiskType[] = []

// ------------------------------
// Image

@description('Image to use')
param imageName string

@description('Image from marketplace?')
param imageFromMarketPlace bool = true

// ------------------------------
// Extensions

@description('Does this machine contain a SQL Server instance?')
param hasSql bool = false

@description('Install Windows Admin Center extension')
param windowsAdminCenter bool = true

// ------------------------------
// Variables

var securityProfileJson = (securityType == 'TrustedLaunch') ? {
  enableTPM: true
  securityType: 'TrustedLaunch'
  uefiSettings: {
    secureBootEnabled: true
  }
} : null

var customLocationId = resourceId(hciSubscriptionId, hciResourceGroupName, 'Microsoft.ExtendedLocation/customLocations', customLocationName)
var subtype = imageFromMarketPlace ? 'marketplaceGalleryImages' : 'galleryImages'
var imageId = resourceId(hciSubscriptionId, hciResourceGroupName, 'Microsoft.AzureStackHCI/${subtype}' , imageName)

var ouPath = (ouGroup == 'DC' || empty(ouGroup)) ? '' : 'OU=Servers,OU=Resources,OU=${ouGroup}, .... snipp .... ,DC=snip'

// ------------------------------
// Resources

// Precreate an Arc Connected Machine with an identity--used for zero-touch onboarding of the Arc VM during deployment
resource hybridvm 'Microsoft.HybridCompute/machines@2023-10-03-preview' = {
  name: name
  location: location
  kind: 'HCI'
  identity: {
    type: 'SystemAssigned'
  }
  tags: tags
}

var lnetsAll = map(nics, nic => nic.lnetName)
var lnetNames = union(lnetsAll, lnetsAll) // unique

resource lnets 'Microsoft.AzureStackHCI/logicalNetworks@2023-09-01-preview' existing = [for lnetName in lnetNames: {
  name: lnetName
  scope: resourceGroup(hciSubscriptionId, hciResourceGroupName)
}]

resource networkInterfaces 'Microsoft.AzureStackHCI/networkInterfaces@2023-09-01-preview' = [for (nic, i) in nics: {
  name: '${name}-nic${length(nics) > 1 ? (i+1) : ''}'
  location: location
  extendedLocation: {
    type: 'CustomLocation'
    name: customLocationId
  }
  properties: {
    // No need to specify dns servers on the nic itself
    // dnsSettings: {
    //   dnsServers: lnets[indexOf(lnetNames, nic.lnetName)].properties.dhcpOptions.dnsServers
    // }
    ipConfigurations: [
      {
        // name: 'ipconfig'
        properties: {
          privateIPAddress: nic.?privateIp
          subnet: {
            // id: resourceId(hciSubscriptionId, hciResourceGroupName, 'Microsoft.AzureStackHCI/logicalnetworks', nic.lnetName)
            id: lnets[indexOf(lnetNames, nic.lnetName)].id
          }
        }
      }
    ]
  }
}]

resource disks 'Microsoft.AzureStackHCI/virtualHardDisks@2023-09-01-preview' = [for (disk, i) in dataDisks: {
  name: '${name}-data-${i + 1}'
  location: location
  extendedLocation: {
    type: 'CustomLocation'
    name: customLocationId
  }
  properties: {
    diskSizeGB: disk.size
    dynamic: disk.?dynamic
    // containerId: uncomment if you want to target a specific CSV/storage path in your HCI cluster
  }
}]

resource vm 'Microsoft.AzureStackHCI/virtualMachineInstances@2023-09-01-preview' = {
  scope: hybridvm
  name: 'default' // value must be 'default' per 2023-09-01-preview
  extendedLocation: {
    type: 'CustomLocation'
    name: customLocationId
  }
  properties: {
    hardwareProfile: {
      vmSize: 'Custom'
      processors: vCPUCount
      memoryMB: memoryMB
      // ### uncomment to use dynamic memory ###
      // dynamicMemoryConfig: {
      //   maximumMemoryMB: memoryMB
      //   minimumMemoryMB: 512
      //   targetMemoryBuffer: 20
      // }
    }
    osProfile: {
      adminUsername: adminUserName
      adminPassword: adminPassword
      computerName: name
      windowsConfiguration: osType == 'Windows' ? {
        provisionVMAgent: true // mocguestagent
        provisionVMConfigAgent: true // azure arc connected machine agent
      } : null
      linuxConfiguration: osType == 'Linux' ? {
        provisionVMAgent: true // mocguestagent
        provisionVMConfigAgent: true // azure arc connected machine agent
      } : null
    }
    storageProfile: {
      imageReference: {
        id: imageId
      }
      dataDisks: [for (dataDisk, i) in dataDisks: {
        id: resourceId('Microsoft.AzureStackHCI/virtualHardDisks', '${name}-data-${i+1}')
      }]
    }
    networkProfile: {
      networkInterfaces: [for (nic, i) in nics: {
        id: networkInterfaces[i].id
      }]
    }
    securityProfile: securityProfileJson
  }
  dependsOn: [
    disks
  ]
}

// Default value of 3 is a combination of NETSETUP_JOIN_DOMAIN (0x00000001) & NETSETUP_ACCT_CREATE (0x00000002)
// i.e. will join the domain and create the account on the domain.
// For more information see https://msdn.microsoft.com/en-us/library/aa392154(v=vs.85).aspx
var joindomain_options = 3

resource joindomain 'Microsoft.HybridCompute/machines/extensions@2023-10-03-preview' = if (!empty(domain) && osType == 'Windows') {
  parent: hybridvm
  name: 'joindomain'
  location: location
  properties: {
    publisher: 'Microsoft.Compute'
    type: 'JsonADDomainExtension'
    typeHandlerVersion: '1.3'
    autoUpgradeMinorVersion: true
    settings: {
      Name: domain
      OUPath: ouPath
      User: domainAdminUserName
      Restart: true
      Options: joindomain_options
    }
    protectedSettings: {
      password: domainAdminPassword
    }
  }
  dependsOn: [vm]
}

// Set connection profile to Private so that winrm (5985) is allowed in firewall
resource netProfile 'Microsoft.HybridCompute/machines/extensions@2023-10-03-preview' = if (empty(domain) && osType == 'Windows') {
  parent: hybridvm
  name: 'setNetProfile'
  location: location
  properties: {
    publisher: 'Microsoft.Compute'
    type: 'CustomScriptExtension'
    typeHandlerVersion: '1.10'
    autoUpgradeMinorVersion: true
    settings: {
      commandToExecute: 'powershell.exe "Set-NetConnectionProfile -NetworkCategory Private -InterfaceAlias Ethernet"'
    }
  }
  dependsOn: [vm]
}

resource wac 'Microsoft.HybridCompute/machines/extensions@2023-10-03-preview' = if (windowsAdminCenter && osType == 'Windows') {
  parent: hybridvm
  name: 'AdminCenter'
  location: location
  properties: {
    publisher: 'Microsoft.AdminCenter'
    type: 'AdminCenter'
    typeHandlerVersion: '0.0'
    autoUpgradeMinorVersion: true
    settings: {
      port: 6516
      salt: guid(resourceGroup().id, name)
    }
  }
  dependsOn: [vm]
}

resource sqlVm 'Microsoft.SqlVirtualMachine/sqlVirtualMachines@2022-08-01-preview' = if (hasSql) {
  name: name
  location: location
  properties: {
    virtualMachineResourceId: vm.id
    // sqlManagement: 'Full'
    sqlServerLicenseType: 'PAYG'
  }
}

// ------------------------------
// Outputs

output vm object = vm

S80807502TEST1.bicepparam:

using './vm.bicep'

param hciSubscriptionId = readEnvironmentVariable('HCISUBSCRIPTIONID')
param hciResourceGroupName = readEnvironmentVariable('HCIRESOURCEGROUPNAME')

param location = 'WestEurope'
param customLocationName = 'CO-Test1'
param tags = {
  App: '0000'
  Project: 'Test'
}
param name = 'S80807502TEST1'
param vCPUCount = 2
param memoryMB = 2048
param dataDisks = []
param nics = [
  {privateIp: '10.7.168.105', lnetName: 'hci-primary-prd-vlan-lnet'}
]
param securityType = 'Standard'
param osType = 'Windows'
param imageName = 'hci-win-datacenter-2022-azure-edition-gui-img'
param imageFromMarketPlace = true
param adminPassword = readEnvironmentVariable('ADMINPASSWORD')
param windowsAdminCenter = false
param domain = '' // do not join domain
param ouGroup = ''
param domainAdminUserName = ''
param domainAdminPassword = ''

Deployment script (PowerShell):

$subscriptionIdOrName = "<subscription id where the arc object are created>"
$rgName = "<rg name where the arc object are created>"
$deploymentName = "S80807502TEST1"
$bicepparam = "S80807502TEST1.bicepparam"

az account set -s $subscriptionIdOrName

# creation
az deployment group what-if --name $deploymentName --resource-group $rgName --parameters $bicepparam
az deployment group create --name $deploymentName --resource-group $rgName --parameters $bicepparam

# second deployment
az deployment group what-if --name $deploymentName --resource-group $rgName --parameters $bicepparam
az deployment group create --name $deploymentName --resource-group $rgName --parameters $bicepparam

What is happening
- creation: OK
- RDP test: OK
- second deployment:
  - what-if: no change to the NIC (= Microsoft.AzureStackHCI/networkInterfaces/S80807502TEST1-nic) and no change to networkProfile inside the VM properties
  - deploy: RDP drops after the NIC resource switches from Created to OK status:
    - RDP OK: [image]
    - RDP dropped about here: [image]
    - this happens even before the 'default' resource (the Stack HCI VM) appears in the deployment
  - any subsequent task (e.g. a VM extension) fails because the network is broken
  - manual RDP to the VM fails
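
One way to pin down when the NIC is touched during the second deployment is to list the deployment operations with their timestamps. This is only a sketch: it reuses $deploymentName and $rgName from the script above, and the JMESPath projection assumes the usual shape of deployment operation objects (drop --query to inspect the raw JSON).

```powershell
# List every operation of the deployment with its target resource, state and timestamp.
# The property paths in --query are an assumption about the output shape.
az deployment operation group list --name $deploymentName --resource-group $rgName --query "[].{type: properties.targetResource.resourceType, name: properties.targetResource.resourceName, state: properties.provisioningState, time: properties.timestamp}" --output table
```

Comparing the timestamp of the networkInterfaces operation with the moment RDP drops would show whether the NIC update alone is enough to trigger the problem.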

What should happen
- the second deployment should in fact do nothing
- the VM network should not be broken
- the VM should still have a network interface after the deployment
- the VM should be reachable
- [what-if should show no change]
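
These expectations can be checked from both sides after each deployment. The following is a rough sketch, not a verified procedure: the --query path is an assumption about the JSON returned by az stack-hci-vm show, and the Hyper-V check assumes the VM carries the same name on the cluster as in Azure.

```powershell
# Azure-side view of the VM instance.
# The JMESPath path is an assumption; drop --query to see the full JSON.
az stack-hci-vm show --name S80807502TEST1 -g rg-demo --query "properties.networkProfile"

# Hyper-V-side view, run on the cluster node that currently hosts the VM.
# An empty result here while Azure still lists the NIC matches the symptom reported above.
Get-VMNetworkAdapter -VMName S80807502TEST1 | Format-List Name, SwitchName, MacAddress, IPAddresses, Status
```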

Additional context
Detail of the second what-if (or any subsequent what-if):

Note: The result may contain false positive predictions (noise).
You can help us improve the accuracy of the result by opening an issue here: https://aka.ms/WhatIfIssues

Resource and property changes are indicated with these symbols:
  - Delete
  ~ Modify
  = Nochange
  * Ignore

The deployment will update the following scope:

Scope: /subscriptions/<silenced>/resourceGroups/rg-demo

  ~ Microsoft.HybridCompute/machines/S80807502TEST1 [2023-10-03-preview]
    - properties.agentUpgrade:

        enableAutomaticUpgrade: false

    - properties.clientPublicKey: "MIIBCgKCAQEApMWUEOouutaVEZ5cnmCV8pQ7/KOE3G2JOAoi1g1uBVPx1mpBtqh4NJeFHqg+Lz9yQ9Q8OBWfCUeieCCuN1yrpR4O2L94WQ7cqhHE9XcFFHG7a2ohmlpwmutGfQUIVYJvqDrM/CzsBoWN0EK5S/RuBNO4cpfThbYRReMXg0WGU0sn94OiaAEZXjN5gLg6QRn4upFwg0RSp9POfXRnl4CJn5jfeHWZRwmo/kw7/T73BuUrvT2giZsVph69iByDxvJI5K2vVkOKOhLZJnmEPNXAl2f9rtcGr8VMcK/fuYMzURtPHOS8xDd2K3OqURNLDFrs2WoshJF7MZ7Fi/2kmz5iewIDAQAB"
    - properties.licenseProfile:

        esuProfile.licenseAssignmentState: "NotAssigned"

    - properties.mssqlDiscovered: "false"
    - properties.osInstallDate:   "2024-07-31T09:33:39Z"
    - properties.osType:          "windows"
    - properties.serviceStatuses:

        extensionService.startupType:          "automatic"
        extensionService.status:               "running"
        guestConfigurationService.startupType: "automatic"
        guestConfigurationService.status:      "running"

  ~ Microsoft.HybridCompute/machines/S80807502TEST1/extensions/setNetProfile [2023-10-03-preview]
    - properties.enableAutomaticUpgrade:  true
    - properties.instanceView:

        name:               "setNetProfile"
        status.code:        "0"
        status.level:       "Information"
        type:               "CustomScriptExtension"
        typeHandlerVersion: "1.10.17"

    ~ properties.autoUpgradeMinorVersion: false => true
    ~ properties.typeHandlerVersion:      "1.10.17" => "1.10"

  ~ Microsoft.HybridCompute/machines/S80807502TEST1/providers/Microsoft.AzureStackHCI/virtualMachineInstances/default [2023-09-01-preview]
    - properties.storageProfile.osDisk:

        osType: "Windows"

    - properties.storageProfile.vmConfigStoragePathId:           "/subscriptions/<silenced>/resourceGroups/rg-hci/providers/Microsoft.AzureStackHCI/storagecontainers/UserStorage2-dec290398b024f1f94bbdceb9a8352b2"
    ~ properties.securityProfile.uefiSettings.secureBootEnabled: true => false

  = Microsoft.AzureStackHCI/networkInterfaces/S80807502TEST1-nic [2023-09-01-preview]
  * Microsoft.AzureStackHCI/networkInterfaces/S80805504BK3-nic
  * Microsoft.AzureStackHCI/networkInterfaces/S80805504BK4-nic
  * Microsoft.AzureStackHCI/networkInterfaces/S80807502BK1-nic
  * Microsoft.AzureStackHCI/networkInterfaces/S80807502BK2-nic
  * Microsoft.AzureStackHCI/virtualHardDisks/S80805504BK3-data-1
  * Microsoft.AzureStackHCI/virtualHardDisks/S80807502BK1-data-1
  * Microsoft.HybridCompute/machines/S80805504BK3
  * Microsoft.HybridCompute/machines/S80805504BK4
  * Microsoft.HybridCompute/machines/S80807502BK1
  * Microsoft.HybridCompute/machines/S80807502BK2

Resource changes: 3 to modify, 1 no change, 10 to ignore.
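
The Ignore entries above belong to other VMs in the same resource group and are not part of this deployment. As a sketch, assuming a reasonably recent Azure CLI, they can be filtered out so that only the resources this template touches remain:

```powershell
# Hide resources that the deployment ignores; Modify and NoChange entries stay visible.
az deployment group what-if --name $deploymentName --resource-group $rgName --parameters $bicepparam --exclude-change-types Ignore
```

With this filter the NIC line (= Nochange) is still shown, which is the relevant evidence here.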

Note: the secureBootEnabled: true => false change is a bit misleading but in fact has no impact (the security profile is set to null in the template). After the second deployment, we can check with az stack-hci-vm show --name S80807502TEST1 -g rg-demo, and the value stays:

    "securityProfile": {
      "enableTpm": null,
      "securityType": null,
      "uefiSettings": {
        "secureBootEnabled": true
      }
    },

stephaniezyen commented 1 month ago

This is unfortunately an RP (resource provider) issue. Please open a support ticket and we will try to route it to the right team on our end.