microsoft / azure-pipelines-agent

Azure Pipelines Agent 🚀
MIT License
1.73k stars 869 forks source link

Fix connection getting disposed #5039

Closed KathanS closed 9 hours ago

KathanS commented 2 weeks ago

For fixing CRI: Incident-558584096 Details - IcM

In the current design, we face the issue of connection getting disposed, when it is needed by feature flag service inside PublishAsync method in publisher. I am switching to get feature flag states and setting boolean accordingly during initialization in InitializePublisher method. This will improve the design as we were earlier making multiple calls for same feature flag and should also potentially fix the issue as during initialization we see feature flag service used to get state in other places does not throw warning currently.

Testing Screenshot

image

Warning screenshot

image

I am not able to repro the error locally, in the error cases we currently receive error logs as per below as shared in the CRI.

1.

[warning]Failed to get FF TestManagement.Agent.PTR.EnableFlakyCheck Value. Error: System.AggregateException: One or more errors occurred. (A task was canceled.)

---> System.Threading.Tasks.TaskCanceledException: A task was canceled. at System.Threading.Tasks.Task.GetExceptions(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task1.GetResultCore(Boolean waitCompletionNotification) at System.Threading.Tasks.Task1.get_Result() at Microsoft.VisualStudio.Services.Agent.Worker.TestResults.Utils.FeatureFlagService.GetFeatureFlagState(String featureFlagName, Guid serviceInstanceId) in D:\a_work\1\s\src\Agent.Worker\TestResults\Utils\FeatureFlagService.cs:line 39 at Microsoft.VisualStudio.Services.Agent.Worker.TestResults.TestDataPublisher.PublishAsync(TestRunContext runContext, List1 testResultFiles, PublishOptions publishOptions, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Agent.Worker\TestResults\TestDataPublisher.cs:line 101 at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox1.ExecutionContextCallback(Object s) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox1.MoveNext(Thread threadPoolThread) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox1.MoveNext() at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining) at System.Threading.Tasks.Task.RunContinuations(Object continuationObject) at System.Threading.Tasks.Task.TrySetResult() at System.Threading.Tasks.Task.WhenAllPromise.Invoke(Task completedTask) at System.Threading.Tasks.Task.RunContinuations(Object continuationObject) at System.Threading.Tasks.Task1.TrySetResult(TResult result) at System.Threading.Tasks.UnwrapPromise1.TrySetFromTask(Task task, Boolean lookForOce) at System.Threading.Tasks.UnwrapPromise1.Invoke(Task completingTask) at System.Threading.Tasks.Task.RunContinuations(Object continuationObject) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.SetExistingTaskResult(Task1 task, TResult result) at Microsoft.TeamFoundation.TestClient.PublishTestResults.TestRunPublisher.PublishTestRunDataAsync(TestRunContext runContext, String projectName, IList1 testRuns, PublishOptions publishOptions, CancellationToken cancellationToken) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox1.ExecutionContextCallback(Object s) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox1.MoveNext(Thread threadPoolThread) at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.AsyncStateMachineBox`1.MoveNext() at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining) ... at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pNativeOverlapped) --- End of stack trace from previous location ---

--- End of inner exception stack trace --- at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task1.GetResultCore(Boolean waitCompletionNotification) at System.Threading.Tasks.Task1.get_Result() at Microsoft.VisualStudio.Services.Agent.Worker.TestResults.Utils.FeatureFlagService.GetFeatureFlagState(String featureFlagName, Guid serviceInstanceId) in D:\a_work\1\s\src\Agent.Worker\TestResults\Utils\FeatureFlagService.cs:line 39

2.

[warning]Failed to get FF TestManagement.PTR.CalculateTestRunSummary Value. Error: System.ObjectDisposedException: Cannot access a disposed object.

Object name: 'VssConnection'. at Microsoft.VisualStudio.Services.WebApi.VssConnection.CheckForDisposed() at Microsoft.VisualStudio.Services.WebApi.VssConnection.GetClientServiceImplAsync(Type requestedType, Guid serviceIdentifier, Func4 getInstanceAsync, CancellationToken cancellationToken) at Microsoft.VisualStudio.Services.WebApi.VssConnection.GetClientAsync[T](Guid serviceIdentifier, CancellationToken cancellationToken) at Microsoft.VisualStudio.Services.WebApi.TaskExtensions.SyncResult[T](Task1 task) at Microsoft.VisualStudio.Services.WebApi.VssConnection.GetClient[T](Guid serviceIdentifier) at Microsoft.VisualStudio.Services.Agent.Worker.TestResults.Utils.FeatureFlagService.GetFeatureFlagState(String featureFlagName, Guid serviceInstanceId

dougbu commented 1 week ago

I am removing 'Using' block and doing disposal manually in the 'finally' block as 'finally' block will only be called after all async operations in try block are done.

@jaredpar do you think try / finally is going to change anything compared to a using block containing unchanged async / await code❓ if not you, who would know❓

jaredpar commented 1 week ago

@jaredpar do you think try / finally is going to change anything compared to a using block containing unchanged async / await code❓ if not you, who would know❓

No. The new code is virtually identical to the old code.

KathanS commented 1 week ago

@jaredpar do you think try / finally is going to change anything compared to a using block containing unchanged async / await code❓ if not you, who would know❓

No. The new code is virtually identical to the old code.

@dougbu / @jaredpar, One doubt for my understanding, if catch or finally block runs before async calls of try block complete execution, then how do we handle the case of exceptions in async call in try block? I mean exception can arise in async call made in try block and then we do need to run catch block and then finally block, right, so should not finally block wait till async calls done in try block are completed to catch exceptions if any

jaredpar commented 1 week ago

, if catch or finally block runs before async calls of try block complete execution,

Can you elaborate on this? Basically provide a try / finally block with async and what you want / or believe happens?

KathanS commented 6 days ago

, if catch or finally block runs before async calls of try block complete execution,

Can you elaborate on this? Basically provide a try / finally block with async and what you want / or believe happens?

Thank you @jaredpar,

    try
    {
        someFunction();
    }
    catch
    {
        Console.WriteLine("Inside Catch");
    }
    finally
    {
        Console.WriteLine("Inside Finally");
    }

this code snippet should output as per below,

Case 1) someFunction does not throw error

Output:

Inside Finally

Case 2) someFunction throws error

Output:

Inside Catch Inside Finally

This should be true regardless of whether someFunction() is async or not, right? and if this is the case 'catch' and 'finally' block has to wait till async function call (used with 'await' also) inside 'try' is completed execution as till execution is not completed, we can not say whether it has thrown error or not

Please let me know if I am missing something here

jaredpar commented 6 days ago

This should be true regardless of whether someFunction() is async or not, right?

Correct.

and if this is the case 'catch' and 'finally' block has to wait till async function call (used with 'await' also) inside 'try' is completed execution as till execution is not completed, we can not say whether it has thrown error or not

In the case the body of the try is await someFunction() then yes the catch / finally won't come into play until the Task returned from someFunction completes

KathanS commented 6 days ago

This should be true regardless of whether someFunction() is async or not, right?

Correct.

and if this is the case 'catch' and 'finally' block has to wait till async function call (used with 'await' also) inside 'try' is completed execution as till execution is not completed, we can not say whether it has thrown error or not

In the case the body of the try is await someFunction() then yes the catch / finally won't come into play until the Task returned from someFunction completes

Thank you @jaredpar for confirming.

I am switching to use 'try' and 'finally' to achieve the same effect, I am making 'await publisher.PublishAsync' call inside 'try' block and connection block will be disposed in 'finally' block only after the async call completes the execution.

(Earlier, async call was inside Using block - 'using (var connection = WorkerUtilities.GetVssConnection(executionContext))'. This created issue that async call is still running and control goes out of using block and thus the connection object is disposed and we get this error in async call - 'Cannot access a disposed object. Object name: 'VssConnection''_.)

Hence, switching to 'try' and 'finally' will solve the issue.

jaredpar commented 6 days ago

Hence, switching to 'try' and 'finally' will solve the issue

I'm not following this. The before / after code is effectively identical. There is no change in behavior here.

KathanS commented 6 days ago

Hence, switching to 'try' and 'finally' will solve the issue

I'm not following this. The before / after code is effectively identical. There is no change in behavior here.

We get Cannot access a disposed object. Object name: 'VssConnection' in the current approach with 'using' due to auto disposal of connection object. I believe switching to do manual disposal with 'try' and 'finally' block solves the issue as disposal will only happen in the finally block after async calls are completed.

Thank you for creating teams chat, let me check the IL and discuss there