We are experiencing an occasional issue in our pipelines where sometimes when an agent reportedly picks up a job, but it is never processed on the agent. Whenever this happens now, we have to cancel the whole pipeline, then wait for the job to not renew, then we can retry the failed job and it gets picked up correctly.
I connected to an affected agent and reviewed the Agent Logs during this timeframe, and we see the below issue. We see in the attached logs that the job request 7781742 was received, and then some authentication failed, however it proceeds to renew the job request for over 40 minutes (continued for over an hour) and the job is forever hung in devops (shows that job is picked up by some agent but doesn't specify which one and that it will start shortly but never does)
I feel the job should not continually renew the request if it cannot continue, it should fail and let the job be retried more quickly.
Is there some bug here? This is happening a few times a day and requires manual intervention/retries to be able to continue a pipelines.
Thanks
Job Received:
[2024-08-14 16:31:46Z INFO JobDispatcher] Job request 7781742 for plan 17b2fe67-a82d-403b-9eec-e75768140fa5 job 7d3607db-1d2c-5cfa-45ac-0d11af423ebd received.
Auth Failure:
[2024-08-14 16:31:47Z WARN VisualStudioServices] Authentication failed with status code 401.
Cache-Control: no-cache
Pragma: no-cache
P3P: CP="CAO DSP COR ADMa DEV CONo TELo CUR PSA PSD TAI IVDo OUR SAMi BUS DEM NAV STA UNI COM INT PHY ONL FIN PUR LOC CNT"
WWW-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/af382f31-6799-4585-a113-b5d387fa52ee, Basic realm="https://tfsprodcca1.visualstudio.com/", TFS-Federated
X-TFS-ProcessId: 29487de2-9c11-4f5b-b479-1b247bbf1ee6
Strict-Transport-Security: max-age=31536000; includeSubDomains
ActivityId: bbc19a7d-c961-4b49-8387-83e88e258fff
X-TFS-Session: f72ad34d-ddd3-4813-8a79-b444aca0b9a9
X-VSS-E2EID: f114a3af-7d7f-4107-874f-aae016555c82
X-VSS-SenderDeploymentId: 4ff21e82-8865-0b2e-ffe8-9598818f8190
X-Frame-Options: SAMEORIGIN
X-TFS-FedAuthRealm: https://tfsprodcca1.visualstudio.com/
X-TFS-FedAuthIssuer: https://dev.azure.com/ramsoft-inc/
X-VSS-AuthorizationEndpoint: https://vssps.dev.azure.com/ramsoft-inc/
X-VSS-ResourceTenant: af382f31-6799-4585-a113-b5d387fa52ee
X-VSS-S2STargetService: 00000002-0000-8888-8000-000000000000/visualstudio.com
X-TFS-FedAuthRedirect: https://spsprodcca1.vssps.visualstudio.com/_signin?realm=dev.azure.com&reply_to=https%3A%2F%2Fdev.azure.com%2Framsoft-inc%2F_apis%2FFeatureFlags%2FDistributedTask.Agent.UseMaskingPerformanceEnhancements%3FcheckFeatureExists%3DFalse&redirect=1&hid=e5c6c4b4-9476-4df6-9b9f-50c0987ffaf6&context=eyJodCI6MiwiaGlkIjoiMmM3YTA4MzgtNmJlYS00YWZmLWExZDMtZTk0ZjZjMDZlNzMxIiwicXMiOnt9LCJyciI6IiIsInZoIjoiIiwiY3YiOiIiLCJjcyI6IiJ90#ctx=eyJTaWduSW5Db29raWVEb21haW5zIjpbImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSIsImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSJdfQ2
Request-Context: appId=cid-v1:72d31d95-1757-44f5-b910-a46611808454
Access-Control-Expose-Headers: Request-Context
X-Content-Type-Options: nosniff
X-Cache: CONFIG_NOCACHE
X-MSEdge-Ref: Ref A: 126D19CC48CC4CAAB271813D2D1BA643 Ref B: YTO01EDGE0820 Ref C: 2024-08-14T16:31:47Z
Date: Wed, 14 Aug 2024 16:31:47 GMT
Renews over and over:
[2024-08-14 16:31:47Z INFO RSAFileKeyManager] Loading RSA key parameters from file /azp/.credentials_rsaparams
[2024-08-14 16:32:37Z INFO MessageListener] Sleeping for 11.008 seconds before retrying.
[2024-08-14 16:32:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:32:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:42:47
.....
[2024-08-14 16:33:27Z WARN VisualStudioServices] Retrieving an AAD auth token took a long time (99.534524 seconds)
[2024-08-14 16:33:27Z WARN VisualStudioServices] GET request to https://dev.azure.com/ramsoft-inc/_apis/FeatureFlags/DistributedTask.Agent.UseMaskingPerformanceEnhancements timed out after 100 seconds.
[2024-08-14 16:33:38Z INFO MessageListener] Sleeping for 10.841 seconds before retrying.
[2024-08-14 16:33:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:43:47
.....
[2024-08-14 16:37:46Z INFO MessageListener] Sleeping for 11.524 seconds before retrying.
[2024-08-14 16:37:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:47:47
[2024-08-14 16:37:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:38:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:48:47
[2024-08-14 16:38:48Z INFO MessageListener] Sleeping for 6.555 seconds before retrying.
....
etc
Versions
Azure DevOps Agent Version 3.242.1 / Ubuntu 22.04 running in Docker
Environment type (Please select at least one enviroment where you face this issue)
[X] Self-Hosted
[ ] Microsoft Hosted
[ ] VMSS Pool
[X] Container
Azure DevOps Server type
dev.azure.com (formerly visualstudio.com)
Azure DevOps Server Version (if applicable)
No response
Operation system
Ubuntu 22.04
Version controll system
No response
Relevant log output
[2024-08-14 16:31:46Z INFO JobDispatcher] Job request 7781742 for plan 17b2fe67-a82d-403b-9eec-e75768140fa5 job 7d3607db-1d2c-5cfa-45ac-0d11af423ebd received.
[2024-08-14 16:31:46Z INFO JobDispatcher] Job request cc7f67d6-cc59-5cd0-b322-73f028735bc7 processed succeed.
[2024-08-14 16:31:46Z INFO Terminal] WRITE LINE: 2024-08-14 16:31:46Z: Running job: Build Omega AI
[2024-08-14 16:31:46Z INFO AgentServer] Refresh JobRequest VssConnection to get on a different AFD node.
[2024-08-14 16:31:46Z INFO AgentServer] Establish connection with 30 seconds timeout.
[2024-08-14 16:31:46Z INFO VisualStudioServices] Starting operation Location.GetConnectionData
[2024-08-14 16:31:47Z INFO VisualStudioServices] Finished operation Location.GetConnectionData
[2024-08-14 16:31:47Z INFO JobDispatcher] Start renew job request 7781742 for job 7d3607db-1d2c-5cfa-45ac-0d11af423ebd.
[2024-08-14 16:31:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:41:47
[2024-08-14 16:31:47Z INFO ConfigurationStore] HasCredentials()
[2024-08-14 16:31:47Z INFO ConfigurationStore] stored True
[2024-08-14 16:31:47Z INFO CredentialManager] GetCredentialProvider
[2024-08-14 16:31:47Z INFO CredentialManager] Creating type OAuth
[2024-08-14 16:31:47Z INFO CredentialManager] Creating credential type: OAuth
[2024-08-14 16:31:47Z INFO RSAFileKeyManager] Loading RSA key parameters from file /azp/.credentials_rsaparams
[2024-08-14 16:31:47Z INFO ConfigurationManager] LoadSettings
[2024-08-14 16:31:47Z INFO ConfigurationStore] IsConfigured()
[2024-08-14 16:31:47Z INFO ConfigurationStore] IsConfigured: True
[2024-08-14 16:31:47Z INFO ConfigurationManager] Is configured: True
[2024-08-14 16:31:47Z INFO ConfigurationManager] Settings Loaded
[2024-08-14 16:31:47Z WARN VisualStudioServices] Authentication failed with status code 401.
Cache-Control: no-cache
Pragma: no-cache
P3P: CP="CAO DSP COR ADMa DEV CONo TELo CUR PSA PSD TAI IVDo OUR SAMi BUS DEM NAV STA UNI COM INT PHY ONL FIN PUR LOC CNT"
WWW-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/af382f31-6799-4585-a113-b5d387fa52ee, Basic realm="https://tfsprodcca1.visualstudio.com/", TFS-Federated
X-TFS-ProcessId: 29487de2-9c11-4f5b-b479-1b247bbf1ee6
Strict-Transport-Security: max-age=31536000; includeSubDomains
ActivityId: bbc19a7d-c961-4b49-8387-83e88e258fff
X-TFS-Session: f72ad34d-ddd3-4813-8a79-b444aca0b9a9
X-VSS-E2EID: f114a3af-7d7f-4107-874f-aae016555c82
X-VSS-SenderDeploymentId: 4ff21e82-8865-0b2e-ffe8-9598818f8190
X-Frame-Options: SAMEORIGIN
X-TFS-FedAuthRealm: https://tfsprodcca1.visualstudio.com/
X-TFS-FedAuthIssuer: https://dev.azure.com/ramsoft-inc/
X-VSS-AuthorizationEndpoint: https://vssps.dev.azure.com/ramsoft-inc/
X-VSS-ResourceTenant: af382f31-6799-4585-a113-b5d387fa52ee
X-VSS-S2STargetService: 00000002-0000-8888-8000-000000000000/visualstudio.com
X-TFS-FedAuthRedirect: https://spsprodcca1.vssps.visualstudio.com/_signin?realm=dev.azure.com&reply_to=https%3A%2F%2Fdev.azure.com%2Framsoft-inc%2F_apis%2FFeatureFlags%2FDistributedTask.Agent.UseMaskingPerformanceEnhancements%3FcheckFeatureExists%3DFalse&redirect=1&hid=e5c6c4b4-9476-4df6-9b9f-50c0987ffaf6&context=eyJodCI6MiwiaGlkIjoiMmM3YTA4MzgtNmJlYS00YWZmLWExZDMtZTk0ZjZjMDZlNzMxIiwicXMiOnt9LCJyciI6IiIsInZoIjoiIiwiY3YiOiIiLCJjcyI6IiJ90#ctx=eyJTaWduSW5Db29raWVEb21haW5zIjpbImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSIsImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSJdfQ2
Request-Context: appId=cid-v1:72d31d95-1757-44f5-b910-a46611808454
Access-Control-Expose-Headers: Request-Context
X-Content-Type-Options: nosniff
X-Cache: CONFIG_NOCACHE
X-MSEdge-Ref: Ref A: 126D19CC48CC4CAAB271813D2D1BA643 Ref B: YTO01EDGE0820 Ref C: 2024-08-14T16:31:47Z
Date: Wed, 14 Aug 2024 16:31:47 GMT
[2024-08-14 16:31:47Z INFO RSAFileKeyManager] Loading RSA key parameters from file /azp/.credentials_rsaparams
[2024-08-14 16:31:47Z INFO RSAFileKeyManager] Loading RSA key parameters from file /azp/.credentials_rsaparams
[2024-08-14 16:32:37Z INFO MessageListener] Sleeping for 11.008 seconds before retrying.
[2024-08-14 16:32:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:32:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:42:47
[2024-08-14 16:33:27Z WARN VisualStudioServices] Retrieving an AAD auth token took a long time (99.534524 seconds)
[2024-08-14 16:33:27Z WARN VisualStudioServices] GET request to https://dev.azure.com/ramsoft-inc/_apis/FeatureFlags/DistributedTask.Agent.UseMaskingPerformanceEnhancements timed out after 100 seconds.
[2024-08-14 16:33:38Z INFO MessageListener] Sleeping for 10.841 seconds before retrying.
[2024-08-14 16:33:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:43:47
[2024-08-14 16:33:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:34:39Z INFO MessageListener] Sleeping for 8.39 seconds before retrying.
[2024-08-14 16:34:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:44:47
[2024-08-14 16:35:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:35:37Z INFO MessageListener] Sleeping for 13.746 seconds before retrying.
[2024-08-14 16:35:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:45:47
[2024-08-14 16:36:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:36:41Z INFO MessageListener] Sleeping for 14.846 seconds before retrying.
[2024-08-14 16:36:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:46:47
[2024-08-14 16:37:46Z INFO MessageListener] Sleeping for 11.524 seconds before retrying.
[2024-08-14 16:37:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:47:47
[2024-08-14 16:37:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:38:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:48:47
[2024-08-14 16:38:48Z INFO MessageListener] Sleeping for 6.555 seconds before retrying.
[2024-08-14 16:39:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:39:44Z INFO MessageListener] Sleeping for 7.241 seconds before retrying.
[2024-08-14 16:39:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:49:47
[2024-08-14 16:40:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:40:41Z INFO MessageListener] Sleeping for 11.079 seconds before retrying.
[2024-08-14 16:40:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:50:47
[2024-08-14 16:41:42Z INFO MessageListener] Sleeping for 11.762 seconds before retrying.
[2024-08-14 16:41:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:51:47
[2024-08-14 16:41:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:42:44Z INFO MessageListener] Sleeping for 9.927 seconds before retrying.
[2024-08-14 16:42:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:52:47
[2024-08-14 16:43:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:43:44Z INFO MessageListener] Sleeping for 13.423 seconds before retrying.
[2024-08-14 16:43:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:53:47
[2024-08-14 16:44:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:44:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:54:47
[2024-08-14 16:44:48Z INFO MessageListener] Sleeping for 7.412 seconds before retrying.
[2024-08-14 16:45:45Z INFO MessageListener] Sleeping for 14.426 seconds before retrying.
[2024-08-14 16:45:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:55:47
[2024-08-14 16:45:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:46:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:56:47
[2024-08-14 16:46:49Z INFO MessageListener] Sleeping for 7.338 seconds before retrying.
[2024-08-14 16:46:57Z INFO MessageListener] Sleeping for 7.831 seconds before retrying.
[2024-08-14 16:47:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:47:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:57:47
[2024-08-14 16:47:55Z INFO MessageListener] Sleeping for 13.388 seconds before retrying.
[2024-08-14 16:48:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:48:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:58:47
[2024-08-14 16:48:58Z INFO MessageListener] Sleeping for 14.563 seconds before retrying.
[2024-08-14 16:49:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 16:59:47
[2024-08-14 16:49:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:50:03Z INFO MessageListener] Sleeping for 13.06 seconds before retrying.
[2024-08-14 16:50:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:00:47
[2024-08-14 16:51:06Z INFO MessageListener] Sleeping for 11.008 seconds before retrying.
[2024-08-14 16:51:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:51:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:01:47
[2024-08-14 16:52:07Z INFO MessageListener] Sleeping for 10.841 seconds before retrying.
[2024-08-14 16:52:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:52:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:02:47
[2024-08-14 16:53:08Z INFO MessageListener] Sleeping for 8.39 seconds before retrying.
[2024-08-14 16:53:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:03:47
[2024-08-14 16:53:58Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:54:06Z INFO MessageListener] Sleeping for 13.746 seconds before retrying.
[2024-08-14 16:54:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:04:47
[2024-08-14 16:55:10Z INFO MessageListener] Sleeping for 14.846 seconds before retrying.
[2024-08-14 16:55:18Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:55:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:05:47
[2024-08-14 16:56:15Z INFO MessageListener] Sleeping for 11.524 seconds before retrying.
[2024-08-14 16:56:38Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:56:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:06:47
[2024-08-14 16:57:08Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:57:16Z INFO MessageListener] Sleeping for 6.555 seconds before retrying.
[2024-08-14 16:57:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:07:47
[2024-08-14 16:58:13Z INFO MessageListener] Sleeping for 7.241 seconds before retrying.
[2024-08-14 16:58:28Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 16:58:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:08:47
[2024-08-14 16:59:10Z INFO MessageListener] Sleeping for 11.079 seconds before retrying.
[2024-08-14 16:59:21Z INFO MessageListener] Sleeping for 11.762 seconds before retrying.
[2024-08-14 16:59:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:09:47
[2024-08-14 16:59:48Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:00:23Z INFO MessageListener] Sleeping for 9.927 seconds before retrying.
[2024-08-14 17:00:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:10:47
[2024-08-14 17:01:08Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:01:23Z INFO MessageListener] Sleeping for 13.423 seconds before retrying.
[2024-08-14 17:01:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:11:47
[2024-08-14 17:02:26Z INFO MessageListener] No message retrieved from session 'c9488daa-584d-49f0-a57f-134e7de5cb70' within last 30 minutes.
[2024-08-14 17:02:26Z INFO MessageListener] Sleeping for 7.412 seconds before retrying.
[2024-08-14 17:02:28Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:02:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:12:47
[2024-08-14 17:03:24Z INFO MessageListener] Sleeping for 14.426 seconds before retrying.
[2024-08-14 17:03:47Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:13:47
[2024-08-14 17:03:48Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:04:28Z INFO MessageListener] Sleeping for 7.338 seconds before retrying.
[2024-08-14 17:04:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:14:48
[2024-08-14 17:05:08Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:05:26Z INFO MessageListener] Sleeping for 7.831 seconds before retrying.
[2024-08-14 17:05:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:15:48
[2024-08-14 17:06:23Z INFO MessageListener] Sleeping for 13.388 seconds before retrying.
[2024-08-14 17:06:28Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:06:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:16:48
[2024-08-14 17:07:27Z INFO MessageListener] Sleeping for 14.563 seconds before retrying.
[2024-08-14 17:07:41Z WARN VisualStudioServices] Authentication failed with status code 401.
Cache-Control: no-cache
Pragma: no-cache
P3P: CP="CAO DSP COR ADMa DEV CONo TELo CUR PSA PSD TAI IVDo OUR SAMi BUS DEM NAV STA UNI COM INT PHY ONL FIN PUR LOC CNT"
WWW-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/af382f31-6799-4585-a113-b5d387fa52ee, Basic realm="https://tfsprodcca1.visualstudio.com/", TFS-Federated
X-TFS-ProcessId: f694617b-f631-4441-9765-00e571d68f93
Strict-Transport-Security: max-age=31536000; includeSubDomains
ActivityId: 8f04b4b3-f42f-4811-8169-1983cc89cf8f
X-TFS-Session: f72ad34d-ddd3-4813-8a79-b444aca0b9a9
X-VSS-E2EID: c2e3c5ae-75e4-4198-85c7-1391ab69205a
X-VSS-SenderDeploymentId: 4ff21e82-8865-0b2e-ffe8-9598818f8190
X-Frame-Options: SAMEORIGIN
X-TFS-FedAuthRealm: https://tfsprodcca1.visualstudio.com/
X-TFS-FedAuthIssuer: https://dev.azure.com/ramsoft-inc/
X-VSS-AuthorizationEndpoint: https://vssps.dev.azure.com/ramsoft-inc/
X-VSS-ResourceTenant: af382f31-6799-4585-a113-b5d387fa52ee
X-VSS-S2STargetService: 00000002-0000-8888-8000-000000000000/visualstudio.com
X-TFS-FedAuthRedirect: https://spsprodcca1.vssps.visualstudio.com/_signin?realm=dev.azure.com&reply_to=https%3A%2F%2Fdev.azure.com%2Framsoft-inc%2F_apis%2Fdistributedtask%2Fpools%2F92%2Fmessages%3FsessionId%3Dc9488daa-584d-49f0-a57f-134e7de5cb70%26lastMessageId%3D287&redirect=1&hid=e5c6c4b4-9476-4df6-9b9f-50c0987ffaf6&context=eyJodCI6MiwiaGlkIjoiMmM3YTA4MzgtNmJlYS00YWZmLWExZDMtZTk0ZjZjMDZlNzMxIiwicXMiOnt9LCJyciI6IiIsInZoIjoiIiwiY3YiOiIiLCJjcyI6IiJ90#ctx=eyJTaWduSW5Db29raWVEb21haW5zIjpbImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSIsImh0dHBzOi8vbG9naW4ubWljcm9zb2Z0b25saW5lLmNvbSJdfQ2
Request-Context: appId=cid-v1:72d31d95-1757-44f5-b910-a46611808454
Access-Control-Expose-Headers: Request-Context
X-Content-Type-Options: nosniff
X-Cache: CONFIG_NOCACHE
X-MSEdge-Ref: Ref A: 1480E150DE1947A1AB67675142D2DF58 Ref B: YTO01EDGE0720 Ref C: 2024-08-14T17:07:41Z
Date: Wed, 14 Aug 2024 17:07:41 GMT
[2024-08-14 17:07:41Z INFO RSAFileKeyManager] Loading RSA key parameters from file /azp/.credentials_rsaparams
[2024-08-14 17:07:42Z INFO VisualStudioServices] AAD Correlation ID for this token request: Unknown
[2024-08-14 17:07:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:17:48
[2024-08-14 17:07:48Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:08:32Z INFO MessageListener] Sleeping for 13.06 seconds before retrying.
[2024-08-14 17:08:45Z INFO MessageListener] Sleeping for 11.008 seconds before retrying.
[2024-08-14 17:08:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:18:48
[2024-08-14 17:09:08Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:09:46Z INFO MessageListener] Sleeping for 10.841 seconds before retrying.
[2024-08-14 17:09:48Z INFO JobDispatcher] Successfully renew job request 7781742, job is valid till 08/14/2024 17:19:48
[2024-08-14 17:10:28Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 547938, session 'c9488daa-584d-49f0-a57f-134e7de5cb70'.
[2024-08-14 17:10:47Z INFO MessageListener] Sleeping for 8.39 seconds before retrying.
What happened?
Hi Team,
We are experiencing an occasional issue in our pipelines where sometimes when an agent reportedly picks up a job, but it is never processed on the agent. Whenever this happens now, we have to cancel the whole pipeline, then wait for the job to not renew, then we can retry the failed job and it gets picked up correctly.
I connected to an affected agent and reviewed the Agent Logs during this timeframe, and we see the below issue. We see in the attached logs that the job request 7781742 was received, and then some authentication failed, however it proceeds to renew the job request for over 40 minutes (continued for over an hour) and the job is forever hung in devops (shows that job is picked up by some agent but doesn't specify which one and that it will start shortly but never does)
I feel the job should not continually renew the request if it cannot continue, it should fail and let the job be retried more quickly.
Is there some bug here? This is happening a few times a day and requires manual intervention/retries to be able to continue a pipelines.
Thanks
Job Received:
[2024-08-14 16:31:46Z INFO JobDispatcher] Job request 7781742 for plan 17b2fe67-a82d-403b-9eec-e75768140fa5 job 7d3607db-1d2c-5cfa-45ac-0d11af423ebd received.
Auth Failure:
Renews over and over:
Versions
Azure DevOps Agent Version 3.242.1 / Ubuntu 22.04 running in Docker
Environment type (Please select at least one enviroment where you face this issue)
Azure DevOps Server type
dev.azure.com (formerly visualstudio.com)
Azure DevOps Server Version (if applicable)
No response
Operation system
Ubuntu 22.04
Version controll system
No response
Relevant log output