Azure / azure-powershell

Microsoft Azure PowerShell
Other
4.26k stars 3.86k forks source link

"Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials" when passing Azure context to Start-Job #9448

Closed arpparker closed 4 years ago

arpparker commented 5 years ago

Description

This issue is very similar to several previous issues here, here, and here. When passing the current Azure context to the Start-Job command, the first job that completes will often fail with the error message, "Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials". Subsequent commands complete successfully.

Steps to reproduce

Finding a way to consistently reproduce this has nearly drove me mad. I fully realize the steps below may seem oddly specific, but what I've outlined is the only way I've been able to reliably and consistently reproduce the issue. There may very well be a better way to reproduce it (or a way with fewer steps), but this method will work for me every time.

  1. Pass the Azure context being used to a Start-Job command that executes an Az command (a Start-Job similar to what's below for example).
  2. Wait 24 hours.
  3. Restart computer where running PowerShell commands.
  4. Open PowerShell session and clear Azure context (Clear-AzContext). Close PowerShell session.
  5. Open PowerShell session, login to Azure, and set context (Add-AzAccount, Set-AzSubscription).
  6. Run the below script.
$Global:ErrorActionPreference = 'Stop'
$DebugPreference = 'Continue'

$AzContext = Get-AzContext

Start-Job -ArgumentList $AzContext -ScriptBlock {
    param($AzContext)

    $DebugPreference = 'Continue'

    Get-AzVm -AzContext $AzContext
}

Start-Job -ArgumentList $AzContext -ScriptBlock {
    param($AzContext)

    $DebugPreference = 'Continue'

    Get-AzVm -AzContext $AzContext
}
  1. Wait for both jobs to complete. Receive both jobs and observe that one job will fail (whichever ends up getting submitted first) and the other job will succeed.
###############
####RESULTS####
###############

PS C:\Scripts> Get-Job

Id     Name            PSJobTypeName   State         HasMoreData     Location             Command
--     ----            -------------   -----         -----------     --------             -------
1      Job1            BackgroundJob   Completed     True            localhost            ...
3      Job3            BackgroundJob   Completed     True            localhost            ...

Receive-Job 1
#Normal Get-AzVm data is returned

Receive-Job 3
Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials.
At line:1 char:1
+ Receive-Job 3
+ ~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Set-AzVMExtension], ArgumentException
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Compute.SetAzureVMExtensionCommand
    + PSComputerName        : localhost

Environment data

Name                           Value
----                           -----
PSVersion                      5.1.14393.2969
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.14393.2969
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1

Module versions

    Directory: C:\Program Files\WindowsPowerShell\Modules

ModuleType Version    Name                                ExportedCommands
---------- -------    ----                                ----------------
Script     1.5.2      Az.Accounts                         {Disable-AzDataCollection, Disable-AzContextAutosave, Enable-AzDataCollection, Enable-AzContextAutosave...}
Script     1.0.1      Az.Aks                              {Get-AzAks, New-AzAks, Remove-AzAks, Import-AzAksCredential...}
Script     1.1.0      Az.AnalysisServices                 {Resume-AzAnalysisServicesServer, Suspend-AzAnalysisServicesServer, Get-AzAnalysisServicesServer, Remove-AzAnaly...
Script     1.1.0      Az.ApiManagement                    {Add-AzApiManagementApiToProduct, Add-AzApiManagementProductToGroup, Add-AzApiManagementRegion, Add-AzApiManagem...
Script     1.0.0      Az.ApplicationInsights              {Get-AzApplicationInsights, New-AzApplicationInsights, Remove-AzApplicationInsights, Set-AzApplicationInsightsPr...
Script     1.2.2      Az.Automation                       {Get-AzAutomationHybridWorkerGroup, Remove-AzAutomationHybridWorkerGroup, Get-AzAutomationJobOutputRecord, Impor...
Script     1.1.0      Az.Batch                            {Remove-AzBatchAccount, Get-AzBatchAccount, Get-AzBatchAccountKey, New-AzBatchAccount...}
Script     1.0.0      Az.Billing                          {Get-AzBillingInvoice, Get-AzBillingPeriod, Get-AzEnrollmentAccount, Get-AzConsumptionBudget...}
Script     1.3.0      Az.Cdn                              {Get-AzCdnProfile, Get-AzCdnProfileSsoUrl, New-AzCdnProfile, Remove-AzCdnProfile...}
Script     1.1.1      Az.CognitiveServices                {Get-AzCognitiveServicesAccount, Get-AzCognitiveServicesAccountKey, Get-AzCognitiveServicesAccountSku, Get-AzCog...
Script     2.2.0      Az.Compute                          {Remove-AzAvailabilitySet, Get-AzAvailabilitySet, New-AzAvailabilitySet, Update-AzAvailabilitySet...}
Script     1.0.1      Az.ContainerInstance                {New-AzContainerGroup, Get-AzContainerGroup, Remove-AzContainerGroup, Get-AzContainerInstanceLog}
Script     1.0.1      Az.ContainerRegistry                {New-AzContainerRegistry, Get-AzContainerRegistry, Update-AzContainerRegistry, Remove-AzContainerRegistry...}
Script     1.1.1      Az.DataFactory                      {Set-AzDataFactoryV2, Update-AzDataFactoryV2, Get-AzDataFactoryV2, Remove-AzDataFactoryV2...}
Script     1.0.0      Az.DataLakeAnalytics                {Get-AzDataLakeAnalyticsDataSource, New-AzDataLakeAnalyticsCatalogCredential, Remove-AzDataLakeAnalyticsCatalogC...
Script     1.2.1      Az.DataLakeStore                    {Get-AzDataLakeStoreTrustedIdProvider, Remove-AzDataLakeStoreTrustedIdProvider, Remove-AzDataLakeStoreFirewallRu...
Script     1.0.0      Az.DeploymentManager                {Get-AzDeploymentManagerArtifactSource, New-AzDeploymentManagerArtifactSource, Set-AzDeploymentManagerArtifactSo...
Script     1.0.0      Az.DevTestLabs                      {Get-AzDtlAllowedVMSizesPolicy, Get-AzDtlAutoShutdownPolicy, Get-AzDtlAutoStartPolicy, Get-AzDtlVMsPerLabPolicy...}
Script     1.1.0      Az.Dns                              {Get-AzDnsRecordSet, New-AzDnsRecordConfig, Remove-AzDnsRecordSet, Set-AzDnsRecordSet...}
Script     1.1.1      Az.EventGrid                        {New-AzEventGridTopic, Get-AzEventGridTopic, Set-AzEventGridTopic, New-AzEventGridTopicKey...}
Script     1.2.0      Az.EventHub                         {New-AzEventHubNamespace, Get-AzEventHubNamespace, Set-AzEventHubNamespace, Remove-AzEventHubNamespace...}
Script     1.0.0      Az.FrontDoor                        {New-AzFrontDoor, Get-AzFrontDoor, Set-AzFrontDoor, Remove-AzFrontDoor...}
Script     2.0.0      Az.HDInsight                        {Get-AzHDInsightJob, New-AzHDInsightSqoopJobDefinition, Wait-AzHDInsightJob, New-AzHDInsightStreamingMapReduceJo...
Script     1.1.0      Az.IotHub                           {Add-AzIotHubKey, Get-AzIotHubEventHubConsumerGroup, Get-AzIotHubConnectionString, Get-AzIotHubJob...}
Script     1.2.0      Az.KeyVault                         {Add-AzKeyVaultCertificate, Update-AzKeyVaultCertificate, Stop-AzKeyVaultCertificateOperation, Get-AzKeyVaultCer...
Script     1.2.1      Az.LogicApp                         {Get-AzIntegrationAccountAgreement, Get-AzIntegrationAccountAssembly, Get-AzIntegrationAccountBatchConfiguration...
Script     1.1.0      Az.MachineLearning                  {Move-AzMlCommitmentAssociation, Get-AzMlCommitmentAssociation, Get-AzMlCommitmentPlanUsageHistory, Remove-AzMlC...
Script     1.0.0      Az.MarketplaceOrdering              {Get-AzMarketplaceTerms, Set-AzMarketplaceTerms}
Script     1.1.0      Az.Media                            {Sync-AzMediaServiceStorageKey, Set-AzMediaServiceKey, Get-AzMediaServiceKey, Get-AzMediaServiceNameAvailability...
Script     1.2.1      Az.Monitor                          {Get-AzMetricDefinition, Get-AzMetric, Remove-AzLogProfile, Get-AzLogProfile...}
Script     1.9.0      Az.Network                          {Add-AzApplicationGatewayAuthenticationCertificate, Get-AzApplicationGatewayAuthenticationCertificate, New-AzApp...
Script     1.1.0      Az.NotificationHubs                 {Get-AzNotificationHub, Get-AzNotificationHubAuthorizationRule, Get-AzNotificationHubListKey, Get-AzNotification...
Script     1.2.0      Az.OperationalInsights              {New-AzOperationalInsightsAzureActivityLogDataSource, New-AzOperationalInsightsCustomLogDataSource, Disable-AzOp...
Script     1.1.1      Az.PolicyInsights                   {Get-AzPolicyEvent, Get-AzPolicyState, Get-AzPolicyStateSummary, Get-AzPolicyRemediation...}
Script     1.1.0      Az.PowerBIEmbedded                  {Remove-AzPowerBIWorkspaceCollection, Get-AzPowerBIWorkspaceCollection, Get-AzPowerBIWorkspaceCollectionAccessKe...
Script     1.4.1      Az.RecoveryServices                 {Get-AzRecoveryServicesBackupProperty, Get-AzRecoveryServicesVault, Get-AzRecoveryServicesVaultSettingsFile, New...
Script     1.1.0      Az.RedisCache                       {Remove-AzRedisCachePatchSchedule, New-AzRedisCacheScheduleEntry, Get-AzRedisCachePatchSchedule, New-AzRedisCach...
Script     1.0.1      Az.Relay                            {New-AzRelayNamespace, Get-AzRelayNamespace, Set-AzRelayNamespace, Remove-AzRelayNamespace...}
Script     1.4.0      Az.Resources                        {Get-AzProviderOperation, Remove-AzRoleAssignment, Get-AzRoleAssignment, New-AzRoleAssignment...}
Script     1.2.0      Az.ServiceBus                       {New-AzServiceBusNamespace, Get-AzServiceBusNamespace, Set-AzServiceBusNamespace, Remove-AzServiceBusNamespace...}
Script     1.1.0      Az.ServiceFabric                    {Add-AzServiceFabricApplicationCertificate, Add-AzServiceFabricClientCertificate, Add-AzServiceFabricClusterCert...
Script     1.0.2      Az.SignalR                          {New-AzSignalR, Get-AzSignalR, Get-AzSignalRKey, New-AzSignalRKey...}
Script     1.11.0     Az.Sql                              {Get-AzSqlDatabaseTransparentDataEncryption, Get-AzSqlDatabaseTransparentDataEncryptionActivity, Set-AzSqlDataba...
Script     1.3.0      Az.Storage                          {Get-AzStorageAccount, Get-AzStorageAccountKey, New-AzStorageAccount, New-AzStorageAccountKey...}
Script     1.0.0      Az.StreamAnalytics                  {Get-AzStreamAnalyticsFunction, Get-AzStreamAnalyticsDefaultFunctionDefinition, New-AzStreamAnalyticsFunction, R...
Script     1.0.1      Az.TrafficManager                   {Add-AzTrafficManagerCustomHeaderToEndpoint, Remove-AzTrafficManagerCustomHeaderFromEndpoint, Add-AzTrafficManag...
Script     1.2.2      Az.Websites                         {Get-AzAppServicePlan, Set-AzAppServicePlan, New-AzAppServicePlan, Remove-AzAppServicePlan...}
Script     1.0.1      Microsoft.PowerShell.Operation.V... {Get-OperationValidation, Invoke-OperationValidation}
Binary     1.0.0.1    PackageManagement                   {Find-Package, Get-Package, Get-PackageProvider, Get-PackageSource...}
Script     3.4.0      Pester                              {Describe, Context, It, Should...}
Script     1.0.0.1    PowerShellGet                       {Install-Module, Find-Module, Save-Module, Update-Module...}
Script     1.2        PSReadline                          {Get-PSReadlineKeyHandler, Set-PSReadlineKeyHandler, Remove-PSReadlineKeyHandler, Get-PSReadlineOption...}

    Directory: C:\Windows\system32\WindowsPowerShell\v1.0\Modules

ModuleType Version    Name                                ExportedCommands
---------- -------    ----                                ----------------
Manifest   1.0.0.0    AppBackgroundTask                   {Disable-AppBackgroundTaskDiagnosticLog, Enable-AppBackgroundTaskDiagnosticLog, Set-AppBackgroundTaskResourcePol...
Manifest   2.0.0.0    AppLocker                           {Get-AppLockerFileInformation, Get-AppLockerPolicy, New-AppLockerPolicy, Set-AppLockerPolicy...}
Manifest   1.0.0.0    AppvClient                          {Add-AppvClientConnectionGroup, Add-AppvClientPackage, Add-AppvPublishingServer, Disable-Appv...}
Manifest   2.0.0.0    Appx                                {Add-AppxPackage, Get-AppxPackage, Get-AppxPackageManifest, Remove-AppxPackage...}
Script     1.0.0.0    AssignedAccess                      {Clear-AssignedAccess, Get-AssignedAccess, Set-AssignedAccess}
Manifest   1.0        BestPractices                       {Get-BpaModel, Get-BpaResult, Invoke-BpaModel, Set-BpaResult}
Manifest   1.0.0.0    BitLocker                           {Unlock-BitLocker, Suspend-BitLocker, Resume-BitLocker, Remove-BitLockerKeyProtector...}
Manifest   2.0.0.0    BitsTransfer                        {Add-BitsFile, Complete-BitsTransfer, Get-BitsTransfer, Remove-BitsTransfer...}
Manifest   1.0.0.0    BranchCache                         {Add-BCDataCacheExtension, Clear-BCCache, Disable-BC, Disable-BCDowngrading...}
Manifest   1.0.0.0    CimCmdlets                          {Get-CimAssociatedInstance, Get-CimClass, Get-CimInstance, Get-CimSession...}
Manifest   1.0        ConfigCI                            {Get-SystemDriver, New-CIPolicyRule, New-CIPolicy, Get-CIPolicy...}
Manifest   1.0        Defender                            {Get-MpPreference, Set-MpPreference, Add-MpPreference, Remove-MpPreference...}
Binary     2.0.0.0    DFSR                                {New-DfsReplicationGroup, Get-DfsReplicationGroup, Set-DfsReplicationGroup, Remove-DfsReplicationGroup...}
Manifest   1.0.0.0    DirectAccessClientComponents        {Disable-DAManualEntryPointSelection, Enable-DAManualEntryPointSelection, Get-DAClientExperienceConfiguration, G...
Script     3.0        Dism                                {Add-AppxProvisionedPackage, Add-WindowsDriver, Add-WindowsCapability, Add-WindowsImage...}
Manifest   1.0.0.0    DnsClient                           {Resolve-DnsName, Clear-DnsClientCache, Get-DnsClient, Get-DnsClientCache...}
Manifest   1.0.0.0    EventTracingManagement              {New-EtwTraceSession, Get-EtwTraceSession, Set-EtwTraceSession, Send-EtwTraceSession...}
Manifest   2.0.0.0    International                       {Get-WinDefaultInputMethodOverride, Set-WinDefaultInputMethodOverride, Get-WinHomeLocation, Set-WinHomeLocation...}
Manifest   1.0.0.0    iSCSI                               {Get-IscsiTargetPortal, New-IscsiTargetPortal, Remove-IscsiTargetPortal, Update-IscsiTargetPortal...}
Manifest   2.0.0.0    IscsiTarget                         {Add-ClusteriSCSITargetServerRole, Add-IscsiVirtualDiskTargetMapping, Checkpoint-IscsiVirtualDisk, Convert-Iscsi...
Script     1.0.0.0    ISE                                 {New-IseSnippet, Import-IseSnippet, Get-IseSnippet}
Manifest   1.0.0.0    Kds                                 {Add-KdsRootKey, Get-KdsRootKey, Test-KdsRootKey, Set-KdsConfiguration...}
Manifest   1.0.1.0    Microsoft.PowerShell.Archive        {Compress-Archive, Expand-Archive}
Manifest   3.0.0.0    Microsoft.PowerShell.Diagnostics    {Get-WinEvent, Get-Counter, Import-Counter, Export-Counter...}
Manifest   3.0.0.0    Microsoft.PowerShell.Host           {Start-Transcript, Stop-Transcript}
Manifest   1.0.0.0    Microsoft.PowerShell.LocalAccounts  {Add-LocalGroupMember, Disable-LocalUser, Enable-LocalUser, Get-LocalGroup...}
Manifest   3.1.0.0    Microsoft.PowerShell.Management     {Add-Content, Clear-Content, Clear-ItemProperty, Join-Path...}
Script     1.0        Microsoft.PowerShell.ODataUtils     Export-ODataEndpointProxy
Manifest   3.0.0.0    Microsoft.PowerShell.Security       {Get-Acl, Set-Acl, Get-PfxCertificate, Get-Credential...}
Manifest   3.1.0.0    Microsoft.PowerShell.Utility        {Format-List, Format-Custom, Format-Table, Format-Wide...}
Manifest   3.0.0.0    Microsoft.WSMan.Management          {Disable-WSManCredSSP, Enable-WSManCredSSP, Get-WSManCredSSP, Set-WSManQuickConfig...}
Manifest   1.0        MMAgent                             {Disable-MMAgent, Enable-MMAgent, Set-MMAgent, Get-MMAgent...}
Manifest   1.0.0.0    MsDtc                               {New-DtcDiagnosticTransaction, Complete-DtcDiagnosticTransaction, Join-DtcDiagnosticResourceManager, Receive-Dtc...
Manifest   2.0.0.0    NetAdapter                          {Disable-NetAdapter, Disable-NetAdapterBinding, Disable-NetAdapterChecksumOffload, Disable-NetAdapterEncapsulate...
Manifest   1.0.0.0    NetConnection                       {Get-NetConnectionProfile, Set-NetConnectionProfile}
Manifest   1.0.0.0    NetEventPacketCapture               {New-NetEventSession, Remove-NetEventSession, Get-NetEventSession, Set-NetEventSession...}
Manifest   2.0.0.0    NetLbfo                             {Add-NetLbfoTeamMember, Add-NetLbfoTeamNic, Get-NetLbfoTeam, Get-NetLbfoTeamMember...}
Manifest   1.0.0.0    NetNat                              {Get-NetNat, Get-NetNatExternalAddress, Get-NetNatStaticMapping, Get-NetNatSession...}
Manifest   2.0.0.0    NetQos                              {Get-NetQosPolicy, Set-NetQosPolicy, Remove-NetQosPolicy, New-NetQosPolicy}
Manifest   2.0.0.0    NetSecurity                         {Get-DAPolicyChange, New-NetIPsecAuthProposal, New-NetIPsecMainModeCryptoProposal, New-NetIPsecQuickModeCryptoPr...
Manifest   1.0.0.0    NetSwitchTeam                       {New-NetSwitchTeam, Remove-NetSwitchTeam, Get-NetSwitchTeam, Rename-NetSwitchTeam...}
Manifest   1.0.0.0    NetTCPIP                            {Get-NetIPAddress, Get-NetIPInterface, Get-NetIPv4Protocol, Get-NetIPv6Protocol...}
Manifest   1.0.0.0    NetworkConnectivityStatus           {Get-DAConnectionStatus, Get-NCSIPolicyConfiguration, Reset-NCSIPolicyConfiguration, Set-NCSIPolicyConfiguration}
Manifest   1.0.0.0    NetworkSwitchManager                {Disable-NetworkSwitchEthernetPort, Enable-NetworkSwitchEthernetPort, Get-NetworkSwitchEthernetPort, Remove-Netw...
Manifest   1.0.0.0    NetworkTransition                   {Add-NetIPHttpsCertBinding, Disable-NetDnsTransitionConfiguration, Disable-NetIPHttpsProfile, Disable-NetNatTran...
Manifest   1.0        NFS                                 {Get-NfsMappedIdentity, Get-NfsNetgroup, Install-NfsMappingStore, New-NfsMappedIdentity...}
Manifest   1.0.0.0    PcsvDevice                          {Get-PcsvDevice, Start-PcsvDevice, Stop-PcsvDevice, Restart-PcsvDevice...}
Manifest   1.0.0.0    PKI                                 {Add-CertificateEnrollmentPolicyServer, Export-Certificate, Export-PfxCertificate, Get-CertificateAutoEnrollment...
Manifest   1.0.0.0    PlatformIdentifier                  Get-PlatformIdentifier
Manifest   1.0.0.0    PnpDevice                           {Get-PnpDevice, Get-PnpDeviceProperty, Enable-PnpDevice, Disable-PnpDevice}
Manifest   1.1        PrintManagement                     {Add-Printer, Add-PrinterDriver, Add-PrinterPort, Get-PrintConfiguration...}
Manifest   1.1        PSDesiredStateConfiguration         {Set-DscLocalConfigurationManager, Start-DscConfiguration, Test-DscConfiguration, Publish-DscConfiguration...}
Script     1.0.0.0    PSDiagnostics                       {Disable-PSTrace, Disable-PSWSManCombinedTrace, Disable-WSManTrace, Enable-PSTrace...}
Binary     1.1.0.0    PSScheduledJob                      {New-JobTrigger, Add-JobTrigger, Remove-JobTrigger, Get-JobTrigger...}
Manifest   2.0.0.0    PSWorkflow                          {New-PSWorkflowExecutionOption, New-PSWorkflowSession, nwsn}
Manifest   1.0.0.0    PSWorkflowUtility                   Invoke-AsWorkflow
Manifest   2.0.0.0    RemoteDesktop                       {Get-RDCertificate, Set-RDCertificate, New-RDCertificate, New-RDVirtualDesktopDeployment...}
Manifest   1.0.0.0    ScheduledTasks                      {Get-ScheduledTask, Set-ScheduledTask, Register-ScheduledTask, Unregister-ScheduledTask...}
Manifest   2.0.0.0    SecureBoot                          {Confirm-SecureBootUEFI, Set-SecureBootUEFI, Get-SecureBootUEFI, Format-SecureBootUEFI...}
Manifest   1.0.0.0    SecurityCmdlets                     {Backup-SecurityPolicy, Restore-SecurityPolicy, Backup-AuditPolicy, Restore-AuditPolicy}
Script     1.0.0.0    ServerCore                          {Get-DisplayResolution, Set-DisplayResolution}
Script     2.0.0.0    ServerManager                       {Get-WindowsFeature, Install-WindowsFeature, Uninstall-WindowsFeature, Enable-ServerManagerStandardUserRemoting...}
Cim        1.0.0.0    ServerManagerTasks                  {Get-SMCounterSample, Get-SMPerformanceCollector, Start-SMPerformanceCollector, Stop-SMPerformanceCollector...}
Manifest   2.0.0.0    SmbShare                            {Get-SmbShare, Remove-SmbShare, Set-SmbShare, Block-SmbShareAccess...}
Manifest   2.0.0.0    SmbWitness                          {Get-SmbWitnessClient, Move-SmbWitnessClient, gsmbw, msmbw...}
Manifest   2.0.0.0    SoftwareInventoryLogging            {Get-SilComputer, Get-SilComputerIdentity, Get-SilSoftware, Get-SilWindowsUpdate...}
Manifest   1.0.0.0    StartLayout                         {Export-StartLayout, Import-StartLayout, Get-StartApps}
Manifest   2.0.0.0    Storage                             {Add-InitiatorIdToMaskingSet, Add-PartitionAccessPath, Add-PhysicalDisk, Add-TargetPortToMaskingSet...}
Manifest   2.0.0.0    TLS                                 {New-TlsSessionTicketKey, Enable-TlsSessionTicketKey, Disable-TlsSessionTicketKey, Export-TlsSessionTicketKey...}
Manifest   1.0.0.0    TroubleshootingPack                 {Get-TroubleshootingPack, Invoke-TroubleshootingPack}
Manifest   2.0.0.0    TrustedPlatformModule               {Get-Tpm, Initialize-Tpm, Clear-Tpm, Unblock-Tpm...}
Binary     2.1.639.0  UEV                                 {Clear-UevConfiguration, Clear-UevAppxPackage, Restore-UevBackup, Set-UevTemplateProfile...}
Manifest   1.0.0.0    UserAccessLogging                   {Enable-Ual, Disable-Ual, Get-Ual, Get-UalDns...}
Manifest   2.0.0.0    VpnClient                           {Add-VpnConnection, Set-VpnConnection, Remove-VpnConnection, Get-VpnConnection...}
Manifest   1.0.0.0    Wdac                                {Get-OdbcDriver, Set-OdbcDriver, Get-OdbcDsn, Add-OdbcDsn...}
Manifest   2.0.0.0    Whea                                {Get-WheaMemoryPolicy, Set-WheaMemoryPolicy}
Manifest   1.0.0.0    WindowsDeveloperLicense             {Get-WindowsDeveloperLicense, Unregister-WindowsDeveloperLicense, Show-WindowsDeveloperLicenseRegistration}
Script     1.0        WindowsErrorReporting               {Enable-WindowsErrorReporting, Disable-WindowsErrorReporting, Get-WindowsErrorReporting}
Manifest   1.0.0.0    WindowsSearch                       {Get-WindowsSearchSetting, Set-WindowsSearchSetting}
Manifest   1.0.0.0    WindowsUpdate                       Get-WindowsUpdateLog

Debug output

NOTE Any potentially private information has been blanked out with 'xxx'. If any thing that was blanked out is needed, please contact me privately.

DEBUG: 8:27:06 AM - GetAzureVMCommand begin processing with ParameterSet 'DefaultParamSet'.
DEBUG: 8:27:06 AM - using account id 'xxx'...
DEBUG: [Common.Authentication]: Authenticating using Account: 'xxx', environment: 'AzureCloud', tenant: 'xxx'
DEBUG: [Common.Authentication]: Authenticating using configuration values: Domain: 'xxx', Endpoint:
'https://login.microsoftonline.com/', ClientId: 'xxx', ClientRedirect: 'urn:ietf:wg:oauth:2.0:oob', ResourceClientUri:
'https://management.core.windows.net/', ValidateAuthority: 'True'
DEBUG: [Common.Authentication]: Acquiring token using context with Authority 'https://login.microsoftonline.com/xxx/', CorrelationId:
'00000000-0000-0000-0000-000000000000', ValidateAuthority: 'True'
DEBUG: [Common.Authentication]: Acquiring token using AdalConfiguration with Domain: 'xxx', AdEndpoint:
'https://login.microsoftonline.com/', ClientId: 'xxx', ClientRedirectUri: urn:ietf:wg:oauth:2.0:oob
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.7061450Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: ADAL PCL.Desktop with assembly version '3.19.2.6005',
file version '3.19.50302.0130' and informational version '2a8bec6c4c76d0c1ef819b55bdc3cda2d2605056' is running...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.7061450Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: ADAL PCL.Desktop with assembly version '3.19.2.6005',
file version '3.19.50302.0130' and informational version '2a8bec6c4c76d0c1ef819b55bdc3cda2d2605056' is running...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.7061450Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: === Token Acquisition started:
 CacheType: null
 Authentication Target: User
 , Authority Host: login.microsoftonline.com
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.7061450Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: === Token Acquisition started:
 Authority: https://login.microsoftonline.com/xxx/
 Resource: https://management.core.windows.net/
 ClientId: xxx
 CacheType: null
 Authentication Target: User

DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.7998850Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Loading from cache.
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.7998850Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Loading from cache.
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Looking up cache for a token...
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Information: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No matching token was found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No token matching arguments found in the cache
DEBUG: [ADAL]: Verbose: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: No token matching arguments found in the cache
DEBUG: [ADAL]: Error: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs: Exception type:
Microsoft.IdentityModel.Clients.ActiveDirectory.AdalSilentTokenAcquisitionException, ErrorCode: failed_to_acquire_token_silently
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenSilentHandler.SendTokenRequestAsync()
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<CheckAndAcquireTokenUsingBrokerAsync>d__59.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<RunAsync>d__57.MoveNext()
DEBUG: [ADAL]: Error: 2019-06-18T12:27:06.8155456Z: 5a1f9fe1-e013-4740-ab7b-67d764c79c23 - LoggerBase.cs:
Microsoft.IdentityModel.Clients.ActiveDirectory.AdalSilentTokenAcquisitionException: Failed to acquire token silently as no token was found in the cache. Call method
AcquireToken
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenSilentHandler.SendTokenRequestAsync()
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<CheckAndAcquireTokenUsingBrokerAsync>d__59.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<RunAsync>d__57.MoveNext()
 ErrorCode: failed_to_acquire_token_silently
DEBUG: [Common.Authentication]: Received exception Failed to acquire token silently as no token was found in the cache. Call method AcquireToken, while authenticating.
Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials.
At line:1 char:1
+ Receive-Job 3
+ ~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Get-AzVM], ArgumentException
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Compute.GetAzureVMCommand
    + PSComputerName        : localhost

Error output


   HistoryId: -1

Message        : Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials.
StackTrace     :
Exception      : System.Management.Automation.RemoteException
InvocationInfo : {}
Line           : Receive-Job 3
Position       : At line:1 char:1
                 + Receive-Job 3
                 + ~~~~~~~~~~~~~
HistoryId      : -1
spaelling commented 5 years ago

I am unable to reproduce this, but I did experience this some time ago (maybe a year), but found a workaround, using -DefaultProfile, (I turned off some of the debug output, and changed it so that I only get an error back (if any)

This was mostly in Azure Function App that uses runspaces, so your mileage may vary.

I actually just yesterday wanted to write a blogpost on this and found that I did not need to use -DefaultProfile anymore. I did a loop of 50 iterations queuing 5 jobs in each, and got this error 2-5 times each.

I worked with runspaces the entire day and did not see this error once.

Try below out and see if that makes a difference.

$Global:ErrorActionPreference = 'Stop'
$DebugPreference = 'SilentlyContinue'

$AzContext = Get-AzContext

$Jobs = @()

$Jobs += Start-Job -ArgumentList $AzContext -ScriptBlock {
    param($AzContext)

    # $DebugPreference = 'Continue'

    try {
        $null = Get-AzVm -DefaultProfile $AzContext
    }
    catch {
        $_
    }
}

$Jobs += Start-Job -ArgumentList $AzContext -ScriptBlock {
    param($AzContext)

    # $DebugPreference = 'Continue'

    try {
        $null = Get-AzVm -DefaultProfile $AzContext
    }
    catch {
        $_
    }
}

$Jobs | Wait-Job | Receive-Job
arpparker commented 5 years ago

@spaelling I actually am using -DefaultProfile (technically -AzContext, but that is an alias for -DefaultProfile) in the actual script that I observed this issue on and it still fails in the same way. I can consistently reproduce this issue both in my actual production script and in my example script above. My steps for reproduction are just a simpler way to illustrate what I'm doing in the actual script (which is using Set-AzVmExtension to join a VM to a domain).

EDIT: Looking at this again, since I am using -DefaultProfile in my actual script, it was an oversight leaving it out in my reproduction steps above. So much so, it's pointless passing in $AzContext if I'm not referencing it anywhere! :) I'm going to try reproducing this again using -DefaultProfile, but I suspect the results will be the same since I can reproduce the same issue in my actual production script (which already uses -DefaultProfile).

EDIT2: I've added -AzContext to the Get-AzVm commands in the reproduction steps above.

spaelling commented 5 years ago

@spaelling I actually am using -DefaultProfile (technically -AzContext, but that is an alias for -DefaultProfile) in the actual script that I observed this issue on and it still fails in the same way. I can consistently reproduce this issue both in my actual production script and in my example script above. My steps for reproduction are just a simpler way to illustrate what I'm doing in the actual script (which is using Set-AzVmExtension to join a VM to a domain).

EDIT: Looking at this again, since I am using -DefaultProfile in my actual script, it was an oversight leaving it out in my reproduction steps above. So much so, it's pointless passing in $AzContext if I'm not referencing it anywhere! :) I'm going to try reproducing this again using -DefaultProfile, but I suspect the results will be the same since I can reproduce the same issue in my actual production script (which already uses -DefaultProfile).

EDIT2: I've added -AzContext to the Get-AzVm commands in the reproduction steps above.

I would assume that if -DefaultProfile is not explicitely specified in the call it will do the equivalent of Get-AzContext, which may be what then fails, ie. no context is available inside the job. But again, I am seeing the same as you, although rarely (<2% of calls) when using Start-Job or equivalent.

Maybe someone has an idea on how to troubleshoot this. Can you reproduce on a vanilla VM (some Windows image from Azure marketplace)

petehauge commented 5 years ago

I have a repro for this I think... Here's my code: $scriptBlock = { $jobs = @() for ($i = 0; $i -lt 10; $i++) { $jobs += Start-Job -ScriptBlock { $rg = $(Get-AzResourceGroup).Count if (-not $rg) { Write-Error "Hit an issue..." } else { Write-Output "No problem..." } } } if($jobs.Count -ne 0) { Write-Output "Waiting for $($jobs.Count) test runner jobs to complete" foreach ($job in $jobs){ $result = Receive-Job $job -Wait Write-Output $result } Remove-Job -Job $jobs } } $jobs = @() for ($i = 0; $i -lt 5; $i++) { $jobs += Start-Job -ScriptBlock $scriptBlock } if($jobs.Count -ne 0) { Write-Output "Waiting for $($jobs.Count) test runner jobs to complete" foreach ($job in $jobs){ $result = Receive-Job $job -Wait Write-Output $result } Remove-Job -Job $jobs }

When I run that code - I get the following results:

image

It basically means that there is some intermittent issue in retrieving the profile. I'm seeing this running Pester tests in parallel (with Start-Job) that rely on the Az module. Anytime I use "-AsJob" in a commandlet I see intermittent failures.

petehauge commented 5 years ago

Can't seem to get my code formatted correctly in the prior comment, so attaching it here... Start-JobAzIssue.ps1.txt

spaelling commented 5 years ago

Put the code within ```powershell code here ```

$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {
            $rg = $(Get-AzResourceGroup).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}

So what you are doing is basically this

(1..5 | % {Start-Job -ScriptBlock {
    (1..10 | % {Start-Job -ScriptBlock {
        $null = Get-AzResourceGroup -ErrorAction Stop
    }}) | Wait-Job | Receive-Job
}}) | Wait-Job | Receive-Job

That is nesting jobs in jobs. Above is testing the same 50 times, so that fits fairly well with my observed error rate of 2%, ie. 1 in 50 will fail.

spaelling commented 5 years ago

You can get a debug trace from when it fails like this

(1..5 | % {Start-Job -ScriptBlock {
    (1..10 | % {Start-Job -ScriptBlock {
        $DebugPreference = 'Continue'
        $Path = "$($env:TEMP)\20062019_$([guid]::NewGuid().Guid).txt"
        try {
            $null = Get-AzResourceGroup -ErrorAction Stop 5>&1 > $Path
            # Remove file if it did not fail
            Remove-Item $Path
        }
        catch {
            Write-Host "Failed, written debug to $Path. Error was $_"
            notepad $Path
        }
        $DebugPreference = 'SilentlyContinue'
    }}) | Wait-Job | Receive-Job
}}) | Wait-Job | Receive-Job

Maybe it is just me, but it seems to fail more often when done like this. The debug trace I am getting is this

7:48:28 AM - GetAzureResourceGroupCmdlet begin processing with ParameterSet 'GetByResourceGroupName'.
7:48:28 AM - using account id '*******************'...
[Common.Authentication]: Authenticating using Account: '*******************', environment: 'AzureCloud', tenant: '************************'
[Common.Authentication]: Authenticating using configuration values: Domain: '************************', Endpoint: 
'https://login.microsoftonline.com/', ClientId: '1950a258-227b-4e31-a9cf-717495945fc2', ClientRedirect: 'urn:ietf:wg:oauth:2.0:oob', ResourceClientUri: 
'https://management.core.windows.net/', ValidateAuthority: 'True'
[Common.Authentication]: Acquiring token using context with Authority 'https://login.microsoftonline.com/************************/', 
CorrelationId: '00000000-0000-0000-0000-000000000000', ValidateAuthority: 'True'
[Common.Authentication]: Acquiring token using AdalConfiguration with Domain: '************************', AdEndpoint: 
'https://login.microsoftonline.com/', ClientId: '1950a258-227b-4e31-a9cf-717495945fc2', ClientRedirectUri: urn:ietf:wg:oauth:2.0:oob
[ADAL]: Information: 2019-06-20T05:48:28.9565452Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: ADAL PCL.Desktop with assembly version '3.19.2.6005', 
file version '3.19.50302.0130' and informational version '2a8bec6c4c76d0c1ef819b55bdc3cda2d2605056' is running...9

[ADAL]: Information: 2019-06-20T05:48:28.9565452Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: ADAL PCL.Desktop with assembly version '3.19.2.6005', 
file version '3.19.50302.0130' and informational version '2a8bec6c4c76d0c1ef819b55bdc3cda2d2605056' is running...

[ADAL]: Information: 2019-06-20T05:48:28.9575461Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: === Token Acquisition started: 
    CacheType: null
    Authentication Target: User
    , Authority Host: login.microsoftonline.com

[ADAL]: Information: 2019-06-20T05:48:28.9575461Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: === Token Acquisition started:
    Authority: https://login.microsoftonline.com/************************/
    Resource: https://management.core.windows.net/
    ClientId: 1950a258-227b-4e31-a9cf-717495945fc2
    CacheType: null
    Authentication Target: User

[ADAL]: Verbose: 2019-06-20T05:48:30.6175446Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Loading from cache.

[ADAL]: Verbose: 2019-06-20T05:48:30.6175446Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Loading from cache.

[ADAL]: Verbose: 2019-06-20T05:48:30.6435470Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.6445464Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7415426Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7415426Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7415426Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.7415426Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7425438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7475447Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7485438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7485438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Verbose: 2019-06-20T05:48:30.7485438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Looking up cache for a token...

[ADAL]: Information: 2019-06-20T05:48:30.7485438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Information: 2019-06-20T05:48:30.7485438Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No matching token was found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7615453Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No token matching arguments found in the cache

[ADAL]: Verbose: 2019-06-20T05:48:30.7625477Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: No token matching arguments found in the cache

[ADAL]: Error: 2019-06-20T05:48:30.7725451Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: Exception type: 
Microsoft.IdentityModel.Clients.ActiveDirectory.AdalSilentTokenAcquisitionException, ErrorCode: failed_to_acquire_token_silently
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenSilentHandler.SendTokenRequestAsync()
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<CheckAndAcquireTokenUsingBrokerAsync>d__59.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<RunAsync>d__57.MoveNext()

[ADAL]: Error: 2019-06-20T05:48:30.7725451Z: 45b5b5a9-4b53-4036-94a8-d695213153bb - LoggerBase.cs: 
Microsoft.IdentityModel.Clients.ActiveDirectory.AdalSilentTokenAcquisitionException: Failed to acquire token silently as no token was found in the cache. Call 
method AcquireToken
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenSilentHandler.SendTokenRequestAsync()
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<CheckAndAcquireTokenUsingBrokerAsync>d__59.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.IdentityModel.Clients.ActiveDirectory.Internal.Flows.AcquireTokenHandlerBase.<RunAsync>d__57.MoveNext()
    ErrorCode: failed_to_acquire_token_silently

[Common.Authentication]: Received exception Failed to acquire token silently as no token was found in the cache. Call method AcquireToken, while 
authenticating.
arpparker commented 5 years ago

Can't seem to get my code formatted correctly in the prior comment, so attaching it here... Start-JobAzIssue.ps1.txt

This is fantastic, I can reproduce the issue with this consistently as well. I did modify it slightly to include passing the context as a parameter to the job, and then passing the context to the Get-AzResourceGroup command. This matches how I was encountering the issue originally, but I can get it to fail both ways, so it may not be important. I've included the full code below with my changes for reference.

But overall, this is absolutely perfect--this is definitely the best way to reproduce this issue at this point. Seems my suspicion it was time-related is unfounded. I do find it interesting though that I would generally encounter it on the first job of the day after having not run anything for 24-48 hours. Seems immaterial now, but I do find it odd.

Should I replace my reproduction steps in the original post with the code @petehauge has provided?

As referenced above, here's the full code with my small modifications:

$AzContext = Get-AzContext
$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ArgumentList $AzContext -ScriptBlock {
            param($AzContext)
            $rg = $(Get-AzResourceGroup -AzContext $AzContext).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}
spaelling commented 5 years ago

Can't seem to get my code formatted correctly in the prior comment, so attaching it here... Start-JobAzIssue.ps1.txt

This is fantastic, I can reproduce the issue with this consistently as well. I did modify it slightly to include passing the context as a parameter to the job, and then passing the context to the Get-AzResourceGroup command. This matches how I was encountering the issue originally, but I can get it to fail both ways, so it may not be important. I've included the full code below with my changes for reference.

But overall, this is absolutely perfect--this is definitely the best way to reproduce this issue at this point. Seems my suspicion it was time-related is unfounded. I do find it interesting though that I would generally encounter it on the first job of the day after having not run anything for 24-48 hours. Seems immaterial now, but I do find it odd.

Should I replace my reproduction steps in the original post with the code @petehauge has provided?

As referenced above, here's the full code with my small modifications:

$AzContext = Get-AzContext
$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ArgumentList $AzContext -ScriptBlock {
            param($AzContext)
            $rg = $(Get-AzResourceGroup -AzContext $AzContext).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}

The code I provided shows the debug trace, but they will read the entire thread and work from that. Maybe @markcowl or @cormacpayne can comment on this?

And just to repeat a comment I made earlier, this seems not to be a problem when using runspaces, but perhaps similar testing should be done before clearing runspaces entirely.

bingbing8 commented 5 years ago

I see the repro randomly in my code. We connect-AzAccount from powershell azure funcitons. Here are what I am doing in our ps azure function:

  1. Invoke Disable-AzContextAutosave -Scope Process before doing anything with Az. In Functions, profile.ps1.
  2. Invoke Connect-AzAccount with subscriptionID and save the result in a local variable in ps azure function.
  3. Pass the object returned from above step 2 to every further invocation of Az cmdlets as a DefaultProfile (alias AzureRmContext) parameter.

We randomly see below error at step 3. The error is gone when rerun the function.

Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials

Before it get fixed, is there any workaround for this error?

markcowl commented 5 years ago

@bingbing8 @spaelling @arpparker The likely culprit here is an issue with the type converter in the job. The type converter is used in this case because the type of the cmdlet parameter is IAzureContextContainer, rather than IAzureContext

To work around the issue, you can pass in an IAzureContextContainer. In a tpical azure environment, this would mean passint the results of running Connect-AzAccount rather than of running Get-AzContext

so

$context = Connect-AzAccount -Subscription "My Subscription" -Tenant xxxx-xxxxxx-xxxxxx-yyyyy
$job = Start-Job -ArgumentList $context -ScriptBlock {param($AzContext) Get-AzVm -AzContext $AzContext ...}
bingbing8 commented 5 years ago

@markcowl, below is the code we run. I didn't pass in the result of Get-AzContext. It fails randomly.

$appSecret = RetrieveSecretFromKV -SecretName $SPNId -KeyVaultName $KeyVaultName
$kvSecretBytes = [System.Convert]::FromBase64String($appSecret)
$certificate=[System.Security.Cryptography.X509Certificates.X509Certificate2]::new($kvSecretBytes, $null, [System.Security.Cryptography.X509Certificates.X509KeyStorageFlags]::MachineKeySet)
$thumbprint=$certificate.Thumbprint 
Install-Certificate -Certificate $certificate -StorePath "Cert:\CurrentUser\My"
Write-Host "Connect-AzAccount..."
$Script:AzContext = Connect-AzAccount -CertificateThumbprint $thumbprint -ApplicationId $SPNID -Tenant $TenantID -ServicePrincipal -Environment $AzureEnvironmentName -SubscriptionId $SubscriptionId
get-AzVM -DefaultProfile $Script:AzContext

Note that when run this concurrently in multiple instances of queue triggered azure ps function, it randomly failed. Most time, the first trigger fail (either after pushed new changes or did not run it for long time, like 24 hours), the second time after the first failures would work fine no matter how many instances are running.

spaelling commented 5 years ago

@bingbing8 @spaelling @arpparker The likely culprit here is an issue with the type converter in the job. The type converter is used in this case because the type of the cmdlet parameter is IAzureContextContainer, rather than IAzureContext

To work around the issue, you can pass in an IAzureContextContainer. In a tpical azure environment, this would mean passint the results of running Connect-AzAccount rather than of running Get-AzContext

so

$context = Connect-AzAccount -Subscription "My Subscription" -Tenant xxxx-xxxxxx-xxxxxx-yyyyy
$job = Start-Job -ArgumentList $context -ScriptBlock {param($AzContext) Get-AzVm -AzContext $AzContext ...}

I tried this with running 5 jobs, each with 10 nested jobs, and still got an error. I think even more than usual, but could just be coincidental.

$AzContext = Connect-AzAccount -Tenant 'TENANTID' -Subscription 'SUBSCRIPTIONID'
$scriptBlock = {
    param($AzContext)
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ArgumentList $AzContext -ScriptBlock {
            param($AzContext)
            # make sure this is not $null (will then grab from Get-AzContext)
            if($null -eq $AzContext)
            {
                throw "Azure context is '`$null'"
            }
            $rg = $(Get-AzResourceGroup -AzContext $AzContext).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs (NESTED) to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock -ArgumentList $AzContext
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}
arpparker commented 5 years ago

@bingbing8 @spaelling @arpparker The likely culprit here is an issue with the type converter in the job. The type converter is used in this case because the type of the cmdlet parameter is IAzureContextContainer, rather than IAzureContext To work around the issue, you can pass in an IAzureContextContainer. In a tpical azure environment, this would mean passint the results of running Connect-AzAccount rather than of running Get-AzContext so

$context = Connect-AzAccount -Subscription "My Subscription" -Tenant xxxx-xxxxxx-xxxxxx-yyyyy
$job = Start-Job -ArgumentList $context -ScriptBlock {param($AzContext) Get-AzVm -AzContext $AzContext ...}

I tried this with running 5 jobs, each with 10 nested jobs, and still got an error. I think even more than usual, but could just be coincidental.

Ha, you beat me to this by like 10 minutes! :) Was just coming to post the same, I'm getting the same results. I think I have found a way workaround though based on one that was posted in a similar issue that I linked in the original post. More to come in a few minutes...

arpparker commented 5 years ago

The following, based essentially completely from this post, has worked 100% of the time for me:

Save-AzContext -Path "C:\Temp\AzContext.json" -Force

$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {
            #Clear-AzContext -Force | Out-Null
            Disable-AzContextAutosave -Scope Process | Out-Null
            Import-AzContext -Path "C:\Temp\AzContext-Empty.json" | Out-Null
            Import-AzContext -Path "C:\Temp\AzContext.json" | Out-Null
            $AzContext = Get-AzContext

            $rg = $(Get-AzResourceGroup -AzContext $AzContext).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}

A couple notes:

I don't really understand why this works, but it does seem to work. What are the potential security implications of saving the context to disk?

spaelling commented 5 years ago
Disable-AzContextAutosave -Scope Process | Out-Null

The following, based essentially completely from this post, has worked 100% of the time for me:

Save-AzContext -Path "C:\Temp\AzContext.json" -Force

$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {
            #Clear-AzContext -Force | Out-Null
            Disable-AzContextAutosave -Scope Process | Out-Null
            Import-AzContext -Path "C:\Temp\AzContext-Empty.json" | Out-Null
            Import-AzContext -Path "C:\Temp\AzContext.json" | Out-Null
            $AzContext = Get-AzContext

            $rg = $(Get-AzResourceGroup -AzContext $AzContext).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 5; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}

A couple notes:

* The Clear-AzContext doesn't seem to be required (at least for this particular issue)

* If Disable-AzContextAutosave isn't present, the error "The process cannot access the file 'C:...\TokenCache.dat' because it is being used by another process." appears multiple times.

* An empty context file can be created by saving the context when there is no context (also mentioned in the post I linked above).

I don't really understand why this works, but it does seem to work. What are the potential security implications of saving the context to disk?

The implication is that you are committing the access-token to disk. That can be problematic as someone could potentially elevate their access in Azure by having access to this file.

I find it odd that you have to import an empty context. I have done the same as you, but passing the access-token and an accountid along. But this still fails, which is odd. It should be pretty equivalent of logging in as a service principal.


<#
.SYNOPSIS
Get cachec access token

.DESCRIPTION
Get cachec access token

.EXAMPLE
An example

.NOTES
This will fail if multiple accounts are logged in (to the same tenant?), check with Get-AzContext -ListAvailable, there should be only one listed
Remove accounts using Disconnect-AzAccount
#>

function Get-AzCachedAccessToken()
{
    $ErrorActionPreference = 'Stop'

    if(-not (Get-Module Az.Accounts)) {
        Import-Module Az.Accounts
    }
    $azProfile = [Microsoft.Azure.Commands.Common.Authentication.Abstractions.AzureRmProfileProvider]::Instance.Profile
    if(-not $azProfile.Accounts.Count) {
        Write-Error "Ensure you have logged in before calling this function."    
    }

    $currentAzureContext = Get-AzContext
    $profileClient = New-Object Microsoft.Azure.Commands.ResourceManager.Common.RMProfileClient($azProfile)
    Write-Debug ("Getting access token for tenant" + $currentAzureContext.Tenant.TenantId)
    $token = $profileClient.AcquireAccessToken($currentAzureContext.Tenant.TenantId)
    $token.AccessToken
}

$Token = Get-AzCachedAccessToken
$AccountId = (Get-AzContext).Account.Id
#Connect-AzAccount -AccessToken $Token -AccountId $AccountId
cls

$scriptBlock = {
    param($Token, $AccountId)
    $jobs = @()
    for ($i = 0; $i -lt 5; $i++) {
        $jobs += Start-Job -ArgumentList $Token, $AccountId -ScriptBlock {
            param($Token, $AccountId)
            Disable-AzContextAutosave -Scope Process | Out-Null
            $AzContext = Connect-AzAccount -AccessToken $Token -AccountId $AccountId -Scope Process -ErrorAction SilentlyContinue
            if($null -eq $Token -or $null -eq $AzContext)
            {
                throw "Azure Token/Context is '`$null'"
            }
            $rg = $(Get-AzResourceGroup -AzContext $AzContext -ErrorAction SilentlyContinue).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs (NESTED) to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 10; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock -ArgumentList $Token, $AccountId
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}
bingbing8 commented 5 years ago

@spaelling, it looks like your workaround works when the login in different jobs are same context so you can write/read the context from disk. For my case, we login to different tenant with different subscription in different function. It does not work that way.

iamshital commented 5 years ago

I tried all the workarounds mentioned here, but doesn't seem to be working anything.. I will try to revert to older version of Az powershell and will update here...

WP0 commented 5 years ago

I've been struggling with this for a week trying to upgrade our existing deployment scripts from AzureRM to Az. None of the workarounds posted here are working for us. Neither exporting/importing the context to a file nor passing the context to the scriptblock works. We make multiple calls to azure endpoints within the scriptblocks in parallel and we won't be able to finish migrating to Az until this works.

spaelling commented 5 years ago

I had hoped that runspaces would not have this issue, but alas, the same.

cls

<#
.SYNOPSIS
Get cachec access token

.DESCRIPTION
Get cachec access token

.EXAMPLE
An example

.NOTES
This will fail if multiple accounts are logged in (to the same tenant?), check with Get-AzContext -ListAvailable, there should be only one listed
Remove accounts using Disconnect-AzAccount
#>
function Get-AzCachedAccessToken()
{
    $ErrorActionPreference = 'Stop'

    if(-not (Get-Module Az.Accounts)) {
        Import-Module Az.Accounts
    }
    $azProfile = [Microsoft.Azure.Commands.Common.Authentication.Abstractions.AzureRmProfileProvider]::Instance.Profile
    if(-not $azProfile.Accounts.Count) {
        Write-Error "Ensure you have logged in before calling this function."    
    }

    $currentAzureContext = Get-AzContext
    $profileClient = New-Object Microsoft.Azure.Commands.ResourceManager.Common.RMProfileClient($azProfile)
    Write-Debug ("Getting access token for tenant" + $currentAzureContext.Tenant.TenantId)
    $token = $profileClient.AcquireAccessToken($currentAzureContext.Tenant.TenantId)
    $token.AccessToken
}

$Token = Get-AzCachedAccessToken
$AccountId = (Get-AzContext).Account.Id

$scriptBlock = {
    param($Token, $AccountId, $j, $Runs)

    # Write-Host "Inner: Setting up runspaces"
    $sessionstate = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()

    $sessionstate.Variables.Add(
        (New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry('Results', $Results, $null))
    )

    $runspacepool = [runspacefactory]::CreateRunspacePool(1, [int]$env:NUMBER_OF_PROCESSORS+1, $sessionstate, $Host)
    $runspacepool.Open()
    $runspaces = @()

    # Write-Host "Inner: Invoking $Runs runspaces"
    for ($i = 0; $i -lt $Runs; $i++) {
        $runspace = [powershell]::Create()
        $runspace.RunspacePool = $runspacepool
        $runspace.AddScript({
            param($Token, $AccountId, $i, $j)

            Disable-AzContextAutosave -Scope Process | Out-Null
            $null = Connect-AzAccount -AccessToken $Token -AccountId $AccountId -Scope Process -ErrorAction SilentlyContinue
            if($null -eq $Token)
            {
                $msg = "Azure Token is '`$null'"
                $Results["$j-$i"] = $msg
                throw $msg
            }

            $rg = $null
            try {
                $rg = (Get-AzResourceGroup -ErrorAction Stop).Count
            }
            catch {
                $Results["$j-$i"] = "$_"
            }

            if ($null -ne $rg) {
                # spammy!
                $Results["$j-$i"] = "Found $rg resource groups"
            }
        }).AddParameter('Token', $Token).AddParameter('AccountId', $AccountId).AddParameter('i', $i).AddParameter('j', $j) > $null
        $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
    }
    # Write-Host "Inner: Waiting for runspaces to complete"
    while($runspaces.Status.IsCompleted -contains $false){Start-Sleep -Milliseconds 10}
}

# controls how many jobs to run
$Runs = 10
$RunsNested = 5
Write-Host "Connecting and fetching resource groups in Azure $($Runs*$RunsNested) times"

Write-Host "Setting up runspaces"
$sessionstate = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
$Results = [HashTable]::Synchronized(@{})
$sessionstate.Variables.Add(
    (New-Object System.Management.Automation.Runspaces.SessionStateVariableEntry('Results', $Results, $null))
)

$runspacepool = [runspacefactory]::CreateRunspacePool(1, [int]$env:NUMBER_OF_PROCESSORS+1, $sessionstate, $Host)
$runspacepool.Open()
$runspaces = @()

Write-Host "Invoking $Runs runspaces"
for ($i = 0; $i -lt $Runs; $i++) {
    $runspace = [powershell]::Create()
    $runspace.RunspacePool = $runspacepool
    $runspace.AddScript($scriptBlock).AddParameter('Token', $Token).AddParameter('AccountId', $AccountId).AddParameter('j', $i).AddParameter('Runs', $RunsNested) > $null
    $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
}
Write-Host "Waiting for runspaces to complete"
while($runspaces.Status.IsCompleted -contains $false){Start-Sleep -Milliseconds 10}

foreach ($Result in $Results.GetEnumerator()) {
    Write-Output $Result
}
ohadschn commented 5 years ago
  1. If I save the context to a file, I get The process cannot access the file TokenCache.dat because it is being used by another process
  2. If I use Enable-AzureRmContextAutosave -Scope CurrentUser to work around that, I get Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials.
  3. The empty context load workaround looks like a hack, I'd prefer not to rely on it.
  4. Adding retries when Enable-AzureRmContextAutosave is in effect doesn't help, once a job was started with a "bad" context, they all fail.
  5. However adding retries to the first approach (saving context to a file) does work, but still pretty hacky.

So my solution is importing the profile under a lock:

$mutex = [System.Threading.Mutex]::new($false, "foo")
$mutex.WaitOne()
try 
{ 
    Import-AzContext -Path $ProfilePath
}
finally
{
    $mutex.ReleaseMutex()
}

EDIT - looks like retries are still necessary, I guess Import-AzContext keeps the file open. Maybe Clear-AzContext, I'll try.

invisibleaxm commented 5 years ago

Hello,

FWIW i wanted to mention that we are too encountering a similar issue while running our ps code in Azure Functions. Similar to @bingbing8 our use case is to login to several different tenants/subscriptions for management and we are randomly experiencing this same issue.

erich-wang commented 5 years ago

@isra-fel, could you take a look this issue?

vinyar commented 4 years ago

I'm on a Mac in powershell and every single time I switch context I have to reauthenticate. 100% of the time. It's very frustrating - the docs outline switching context being a seamless experience, but it doesn't seem to work as intended.

asifma commented 4 years ago

I had the same issue when running jobs in multiple threads via a "ForEach-Parallel" function.

This is how I at least solved it, including the function as well (not written by me).

Whenever the error messages appears, i retry connecting again. This has been working for me 100% of the times.

function ForEach-Parallel { 
    <# 
.SYNOPSIS 
A parallel ForEach that uses runspaces 

.PARAMETER ScriptBlock 
ScriptBlock to execute for each InputObject 

.PARAMETER ScriptFile 
Script file to execute for each InputObject 

.PARAMETER InputObject 
Object(s) to run script against in parallel 

.PARAMETER Throttle 
Maximum number of threads to run at one time.  Default: 5 

.PARAMETER Timeout 
Stop each thread after this many minutes.  Default: 0 

WARNING:  This parameter should be used as a failsafe only 
Set it for roughly the entire duration you expect for all threads to complete 

.PARAMETER SleepTimer 
When looping through open threads, wait this many milliseconds before looping again.  Default: 200 

.EXAMPLE 
(0..50) | ForEach-Parallel -Throttle 4 { $_; sleep (Get-Random -Minimum 0 -Maximum 5) } 
} 

Send the number 0 through 50 to scriptblock.  For each, display the number and then sleep for 0 to 5 seconds.  Only execute 4 threads at a time. 

.EXAMPLE 
$servers | Foreach-Parallel -Throttle 20 -Timeout 60 -sleeptimer 200 -verbose -scriptFile C:\query.ps1 

Run query.ps1 against each computer in $servers.  Run 20 threads at a time, timeout a thread if it takes longer than 60 minutes to run, give verbose output. 

.FUNCTIONALITY  
PowerShell Language 

.NOTES 
Credit to Tome Tanasovski 
http://powertoe.wordpress.com/2012/05/03/foreach-parallel/ 
#> 
    [cmdletbinding()] 
    param( 
        [Parameter(Mandatory = $false, position = 0, ParameterSetName = 'ScriptBlock')] 
        [System.Management.Automation.ScriptBlock]$ScriptBlock, 

        [Parameter(Mandatory = $false, ParameterSetName = 'ScriptFile')] 
        [ValidateScript( { test-path $_ -pathtype leaf })] 
        $scriptFile, 

        [Parameter(Mandatory = $true, ValueFromPipeline = $true)] 
        [PSObject]$InputObject, 

        [int]$Throttle = 5, 

        [double]$sleepTimer = 200, 

        [double]$Timeout = 0 
    ) 
    BEGIN { 

        #Build the scriptblock depending on the parameter used 
        switch ($PSCmdlet.ParameterSetName) { 
            'ScriptBlock' { $ScriptBlock = $ExecutionContext.InvokeCommand.NewScriptBlock("param(`$_)`r`n" + $Scriptblock.ToString()) } 
            'ScriptFile' { $scriptblock = [scriptblock]::Create($(get-content $scriptFile | out-string)) } 
            Default { Write-Error ("Must provide ScriptBlock or ScriptFile"); Return } 
        } 

        #Define the initial sessionstate, create the runspacepool 
        Write-Verbose "Creating runspace pool with $Throttle threads" 
        $sessionState = [system.management.automation.runspaces.initialsessionstate]::CreateDefault() 
        $pool = [Runspacefactory]::CreateRunspacePool(1, $Throttle, $sessionState, $host) 
        $pool.open() 

        #array to hold details on each thread 
        $threads = @() 

        #If inputObject is bound get a total count and set bound to true 
        $bound = $false 
        if ( $PSBoundParameters.ContainsKey("inputObject") ) { 
            $bound = $true 
            $totalCount = $inputObject.count 
        } 

    } 

    PROCESS { 

        $run = @' 
        #For each pipeline object, create a new powershell instance, add to runspacepool 
        $powershell = [powershell]::Create().addscript($scriptblock).addargument($InputObject) 
        $powershell.runspacepool=$pool 
        $startTime = get-date 

        #add references to inputobject, instance, handle and startTime to threads array 
        $threads += New-Object psobject -Property @{ 
            Object = $inputObject; 
            instance = $powershell; 
            handle = $powershell.begininvoke(); 
            startTime = $startTime 
        } 

        Write-Verbose "Added $inputobject to the runspacepool at $startTime" 
'@ 

        #Run the here string.  Put it in a foreach loop if it didn't come from the pipeline 
        if ($bound) {    
            $run = $run -replace 'inputObject', 'object' 
            foreach ($object in $inputObject) {  
                Invoke-Expression -command $run 
            } 
        } 

        else { 

            Invoke-Expression -command $run 
        } 

    } 
    END { 
        $notdone = $true 

        #Loop through threads. 
        while ($notdone) { 

            $notdone = $false 
            for ($i = 0; $i -lt $threads.count; $i++) { 
                $thread = $threads[$i] 
                if ($thread) { 

                    #If thread is complete, dispose of it. 
                    if ($thread.handle.iscompleted) { 
                        Write-verbose "Closing thread for $($thread.Object)" 
                        $thread.instance.endinvoke($thread.handle) 
                        $thread.instance.dispose() 
                        $threads[$i] = $null 
                    } 

                    #Thread exceeded maxruntime timeout threshold 
                    elseif ( $Timeout -ne 0 -and ( (get-date) - $thread.startTime ).totalminutes -gt $Timeout ) { 
                        Write-Error "Closing thread for $($thread.Object): Thread exceeded $Timeout minute limit" -TargetObject $thread.inputObject 
                        $thread.instance.dispose() 
                        $threads[$i] = $null 
                    } 

                    #Thread is running, loop again! 
                    else { 
                        $notdone = $true 
                    } 
                }            
            } 

            #Sleep for specified time before looping again 
            Start-Sleep -Milliseconds $sleepTimer 
        } 
        $pool.close() 
    } 
}

$resources = Get-AzResource -ResourceGroupName "XXX-rg" | Where { $_.ResourceType -eq 'Microsoft.Compute/virtualMachines' -or $_.ResourceType -eq 'Microsoft.Compute/virtualMachineScaleSets' -or $_.Type -eq 'Microsoft.Network/applicationGateways' }

if ($resources.length -gt 0) {
    Write-Output "`nGetting resources:"
    $resources | Foreach-Parallel -Throttle $resources.count -Timeout 600 -sleeptimer 200 {
        Write-Output "Fetching: $($_.name)"
        $count = 0
        do {
            try {

               #Just test fetching RG's
                $rg = (Get-AzResourceGroup -ErrorAction Stop).Count
                Write-Output $rg

                if ($_.type -eq "Microsoft.Compute/virtualMachines") {
                    $item = Get-AzVM -Name $_.name -ResourceGroupName $_.ResourceGroupName -ErrorAction Stop
                    Write-Output $item.Name
                    $success = $true
                }
                elseif ($_.type -eq "Microsoft.Compute/virtualMachineScaleSets") {
                    $item = Get-AzVmss -VMScaleSetName $_.name -ResourceGroupName $_.ResourceGroupName -ErrorAction Stop
                    Write-Output $item.Name
                    $success = $true
                }
                elseif ($_.type -eq "Microsoft.Network/applicationGateways") {
                    $item = Get-AzApplicationGateway -ResourceGroupName $appgw.ResourceGroupName -Name $appgw.Name -Verbose -ErrorAction Stop
                    Write-Output $item.Name
                    $success = $true
                }

            }
            catch {
                if ($_.Exception.Message -eq "Your Azure credentials have not been set up or have expired, please run Connect-AzAccount to set up your Azure credentials.") {
                    write-Output $_.Exception.Message
                    Write-Output "Will try to run Connect-AzAccount again"
                    Connect-AzAccount -ServicePrincipal -Credential $pscred -TenantId $tenantid
                    Select-AzSubscription -SubscriptionName $subscriptionName

                }
                else {
                    write-Output "Unknown Error:"
                    write-Output $_.Exception.Message
                }
            }

            $count++

        }until($count -eq 10 -or $success)
        if (-not($success)) { exit }
    }
}
jffaust commented 4 years ago

Also getting random occurrences of this error. Very frustrating.

AndreasPaulsson commented 4 years ago

Same thing here since moving from PowerShell 5 and AzureRM to PowerShell Core with Az module. Everything is running on Windows.

AndreasMWalter commented 4 years ago

Just converted a customers AzureRM based Workflow Runbook to Az, and am encountering the same issue.

Lingxi-Li commented 4 years ago

Same issue here. Randomly encounter the expired Azure credential issue in background jobs started with Start-Job. Not necessarily the first one that finishes.

Nalith commented 4 years ago

Had the same issue here with a "ForEach-Object -Parallel" loop. However, interestingly enough if I add a "Start-Sleep -Seconds (Get-Random -Maximum 10)" to my loop I'm not getting this anymore so wondering if its not something causing some form of block either locally or on the Azure APIs.

markcowl commented 4 years ago

At least for the Foreach-Object -Parallel, this is likely the issue: #11201

isra-fel commented 4 years ago

The foreach-object -parallel issue was fixed in #12041 and the fix will be in our next release, but the original issue seemed to have a different root cause. Still digging into it.

dingmeng-xue commented 4 years ago

@arpparker , could you check if the latest Az can reproduce the problem?

petehauge commented 4 years ago

@dingmeng-xue - I just upgraded to the latest version of Azure Powershell with this command (admin window): Install-Module -Name Az -AllowClobber -Force

And retried my repro from above and still see the same issue.

image

Here's the code:

$scriptBlock = {
    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {
            $rg = $(Get-AzResourceGroup).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 10; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}
arpparker commented 4 years ago

@arpparker , could you check if the latest Az can reproduce the problem?

I'll give my reproduction steps a go tomorrow morning. But I can confirm that reproduction steps from @petehauge do indeed still fail for me. If I recall correctly, we aren't passing the Azure context into the script block in quite the same way though, so I'll definitely see if I can reproduce using my original steps. Stay tuned.

EDIT: Also, just to confirm, the latest version of the Az module should be v4.3.0, correct? That's what I'm seeing after running the Update-Module command for Az.

arpparker commented 4 years ago

@arpparker , could you check if the latest Az can reproduce the problem?

@dingmeng-xue, unfortunately the issue is not resolved. I was able to replicate the problem again using both my reproduction steps above and in the original script where I first discovered the issue.

As mentioned above, the version of Az installed is v4.3.

isra-fel commented 4 years ago

I believe this might be related to locks. Some of the jobs cannot get access to the token cache file, so they fall back to in-memory mode, which of course is empty and contains no access tokens, hence the error. And the reason why some jobs cannot get access is, we use a lock to protect the cache file, but it only works on the thread level, while PowerShell jobs are process-based.

I'm trying to figure out a solution. Will keep updating.

petehauge commented 4 years ago

Yes, I agree that it's probably related to locks. I was able to develop a workaround that seems to always work using a mutex before making any calls to Azure in each thread - the code is below. This tells me that as long as no jobs access the token cache at the same time they don't fail...

$scriptBlock = {

    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {

            # WORKAROUND: https://github.com/Azure/azure-powershell/issues/9448
            $Mutex = New-Object -TypeName System.Threading.Mutex -ArgumentList $false, "Global\AzDtlLibrary"
            $Mutex.WaitOne() | Out-Null
            Enable-AzContextAutosave -Scope Process | Out-Null
            $rg = Get-AzResourceGroup | Out-Null
            $Mutex.ReleaseMutex() | Out-Null

            $rg = $(Get-AzResourceGroup).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 10; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}
arpparker commented 4 years ago

I believe this might be related to locks. Some of the jobs cannot get access to the token cache file, so they fall back to in-memory mode, which of course is empty and contains no access tokens, hence the error. And the reason why some jobs cannot get access is, we use a lock to protect the cache file, but it only works on the thread level, while PowerShell jobs are process-based.

I'm trying to figure out a solution. Will keep updating.

@isra-fel Thanks for the update! Looking forward to what you find.

Yes, I agree that it's probably related to locks. I was able to develop a workaround that seems to always work using a mutex before making any calls to Azure in each thread - the code is below. This tells me that as long as no jobs access the token cache at the same time they don't fail...

$scriptBlock = {

    $jobs = @()
    for ($i = 0; $i -lt 10; $i++) {
        $jobs += Start-Job -ScriptBlock {

            # WORKAROUND: https://github.com/Azure/azure-powershell/issues/9448
            $Mutex = New-Object -TypeName System.Threading.Mutex -ArgumentList $false, "Global\AzDtlLibrary"
            $Mutex.WaitOne() | Out-Null
            Enable-AzContextAutosave -Scope Process | Out-Null
            $rg = Get-AzResourceGroup | Out-Null
            $Mutex.ReleaseMutex() | Out-Null

            $rg = $(Get-AzResourceGroup).Count
            if (-not $rg) {
                Write-Error "Hit an issue..."
            }
            else {
                Write-Output "No problem..."
            }
        }
    }

    if($jobs.Count -ne 0)
    {
        Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
        foreach ($job in $jobs){
            $result = Receive-Job $job -Wait
            Write-Output $result
        }
        Remove-Job -Job $jobs
    }
}

$jobs = @()

for ($i = 0; $i -lt 10; $i++) {
    $jobs += Start-Job -ScriptBlock $scriptBlock
}

if($jobs.Count -ne 0)
{
    Write-Output "Waiting for $($jobs.Count) test runner jobs to complete"
    foreach ($job in $jobs){
        $result = Receive-Job $job -Wait
        Write-Output $result
    }
    Remove-Job -Job $jobs
}

@petehauge This sounds very promising, but this might extend beyond my level of expertise. Can you explain what exactly this is doing, I'm not sure I'm following. How would I (if possible) incorporate into my initial reproduction script in the original post considering I'm passing the Azure context as a parameter to the script block?

petehauge commented 4 years ago

@arpparker - sure! Basically, the code is insuring that the first line of code that needs to get a context is doing so only one at a time via using a single mutex. IE: you could incorporate this into your code this way:

$context = Connect-AzAccount -Subscription "My Subscription" -Tenant xxxx-xxxxxx-xxxxxx-yyyyy
$job = Start-Job -ArgumentList $context -ScriptBlock {
      param($AzContext) 
      # WORKAROUND: https://github.com/Azure/azure-powershell/issues/9448
      $Mutex = New-Object -TypeName System.Threading.Mutex -ArgumentList $false, "Global\MyCustomMutex"
      $Mutex.WaitOne() | Out-Null
      # Put only the first line of AZ powershell code here
      # this ensures that the first time on the thread we check tokens only one at a time
      $vms = Get-AzVm -AzContext $AzContext
      $Mutex.ReleaseMutex() | Out-Null

      # Additional code goes here
}

NOTE: I didn't test the code, but this is about what you would need... The first Az command in a script block needs to be guarded by a mutex so it executes only one at a time across all the threads.

isra-fel commented 4 years ago

Hi all, We've released Az.Accounts 1.9.1 fixing this issue. Could you update and try it? Thank you 😀

petehauge commented 4 years ago

This is such a relief! I ran through my repro above and it works, the issue looks like it's fixed! I really appreciate getting this fixed, I'm going to go throw away all my workaround code... :-) Thanks!

isra-fel commented 4 years ago

Allright I'll close the issue. Thanks to everyone providing information to help resolve the issue. Really appreciate that!

arpparker commented 4 years ago

I can also confirm this appears to be fixed for me as well.

@petehauge thanks for the explanation of the workaround a few weeks ago, I never even got a chance to try it. This is even better! :)

Thanks everyone!

PavelPikat commented 4 years ago

This issue re-appeared since yesterday (23rd of July 2020). Upgrading Az.Accounts to 1.9.1 doesn't help

dingmeng-xue commented 4 years ago

@PavelPikat , many reason may lead to this error. Could you raise a new issue and we will continue following up? If you can clarify further about your steps, it will be great.

PavelPikat commented 4 years ago

It seems to be Azure DevOps pipeline specific, so I raised https://github.com/microsoft/azure-pipelines-tasks/issues/13348 with them

Rakesh-vrk commented 4 years ago

When im deploying the resource using ARM i encountered the similar issue. and has been resolved using command Clear-AzContext -Force image image