bchavez / RethinkDb.Driver

:headphones: A NoSQL C#/.NET RethinkDB database driver with 100% ReQL API coverage.
http://rethinkdb.com/api/java
Other
383 stars 134 forks source link

Additional exceptions thrown when trying to cancel Cursor.MoveNextAsync() #146

Closed EvanSchiewe-Entegra closed 4 years ago

EvanSchiewe-Entegra commented 4 years ago

Version Information

Software Version(s)
NuGet Package 2.3.101
.NET Full Framework? 4.8
Driver Windows OS? 1903 Build 18980.1
RethinkDB Server 2.3.6
Server Linux OS? Official docker image (Debian Jessie)

What is the expected behavior?

I expect to be able to pass a CancellationToken to any async method, cancel it, handle the OperationCanceledException, and continue without issues.

What is the actual behavior?

Instead, I am seeing an internal exception from RethinkDb.Driver.Utils.CancellableTask.OnCancellation() bubble up where it is trying to call SetCanceled() after it has already been canceled.

Any possible solutions?

Use TrySetCanceled instead of SetCanceled in CancellableTask.OnCancellation() would stop the exception from being thrown, but I'm not sure why it's being thrown in the first place.

How do you reproduce the issue?

This should be a fairly simple console app that reproduces the issue I'm experiencing. If you see anything strange or wrong that I'm doing please let me know!

Once the token is cancelled, RunChanges catches the OperationCanceledException and exits normally. However, I end up catching an exception from RethinkDb.Driver.Utils.CancelleableTask that I don't believe should be raised under normal usage.

static RethinkDB R = RethinkDB.R;
static Connection conn;

async Task Main()
{
    conn = R.Connection().Connect();

    var cts = new CancellationTokenSource();
    RunChanges(cts.Token);

    try
        {
        await Task.Delay(500);
        Console.WriteLine("Cancelling task");
        cts.Cancel();
        Console.WriteLine("Task Cancelled");
    }
    catch (Exception ex) 
        { 
        Console.WriteLine(ex); 
    }
    Console.WriteLine("End of main");
}

async void RunChanges(CancellationToken ct) 
{
    Console.WriteLine("RunChanges: called");
    Cursor<RethinkDb.Driver.Model.Change<JObject>> changes = null;
    try 
        {
        changes = await R.Db("rethinkdb").Table("jobs")
                        .Changes().OptArg("include_initial", "true")
                        .RunChangesAsync<JObject>(conn, ct);

        while (await changes.MoveNextAsync(ct)) 
                {
            Console.WriteLine("RunChanges: got a change");
        }
    }
    catch(OperationCanceledException) 
        {
        Console.WriteLine("RunChanges: op canceled");
    }
    finally 
        {
        Console.WriteLine("RunChanges: finally");
        changes?.Close();
        Console.WriteLine("RunChanges: changes cursor closed");
    }
    Console.WriteLine("RunChanges: returning");
}

Can you identify the location in the driver source code where the problem exists?

Unclear - see above

If the bug is confirmed, would you be willing to submit a PR?

Yes

bchavez commented 4 years ago

Thank you very much for the detailed information. Give me a chance to review the information and I'll get back to you.

bchavez commented 4 years ago

Hi @EvanSchiewe-Entegra,

So, after some study, I think the issue arises because the CancellationToken is used twice. See below:

async void RunChanges(CancellationToken ct) 
{
   Console.WriteLine("RunChanges: called");
   Cursor<RethinkDb.Driver.Model.Change<JObject>> changes = null;
   try 
   {
      changes = await R.Db("rethinkdb").Table("jobs")
                      .Changes().OptArg("include_initial", "true")
                      .RunChangesAsync<JObject>(conn, ct); <<<-- ONCE HERE

      while (await changes.MoveNextAsync(ct)) <<<-- TWICE HERE
      {
         Console.WriteLine("RunChanges: got a change");
      }
   }
   catch(OperationCanceledException) 
   {
      Console.WriteLine("RunChanges: op canceled");
   }
   finally 
   {
      Console.WriteLine("RunChanges: finally");
      changes?.Close();
      Console.WriteLine("RunChanges: changes cursor closed");
   }
   Console.WriteLine("RunChanges: returning");
}

As far as I can tell, here's what happens:

System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.

System.AggregateException: One or more errors occurred. ---> System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetCanceled()
   at RethinkDb.Driver.Utils.CancellableTask.OnCancellation() in C:\Code\Projects\Public\RethinkDb.Driver\Source\RethinkDb.Driver\Utils\CancellableTask.cs:line 26
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.CancellationCallbackInfo.ExecuteCallback()
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
   --- End of inner exception stack trace ---
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
   at System.Threading.CancellationTokenSource.NotifyCancellation(Boolean throwOnFirstException)
   at System.Threading.CancellationTokenSource.Cancel()
   at RethinkDb.Driver.Tests.GitHubIssues.Issue146.<Test>d__1.MoveNext() in C:\Code\Projects\Public\RethinkDb.Driver\Source\RethinkDb.Driver.Tests\GitHubIssues\Issue146.cs:line 34
---> (Inner Exception #0) System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetCanceled()
   at RethinkDb.Driver.Utils.CancellableTask.OnCancellation() in C:\Code\Projects\Public\RethinkDb.Driver\Source\RethinkDb.Driver\Utils\CancellableTask.cs:line 26
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.CancellationCallbackInfo.ExecuteCallback()
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)<---

Suffice to say, if you just use ct once, the problem should go away. The code below removes the ct parameter from .RunChangesAsync(conn):

changes = await R.Db("rethinkdb").Table("jobs")
                .Changes().OptArg("include_initial", "true")
                .RunChangesAsync<JObject>(conn); <<<-- REMOVE ct HERE

while (await changes.MoveNextAsync(ct)) <<<-- KEEP ct HERE
{
   Console.WriteLine("RunChanges: got a change");
}

I'm not sure what the best practice should be here, but my gut feeling is you should be able to pass ct around without being signaled in as many places as you need. We'll probably have to figure out how to clean up those already completed tasks in Awatiers and figure out how to unregister cancelToken.Register(OnCancellation registrations once .TrySetResult(response) has been called.

If you have any feedback, suggestions or ideas, please let me know your thoughts.

Thanks, Brian

EvanSchiewe-Entegra commented 4 years ago

Thanks for the detailed response! That is a very easy to follow analysis.

my gut feeling is you should be able to pass ct around without being signaled in as many places as you need

I do agree with you here. It seems like Microsoft does too, at least based on their code samples.

I can't say how often this use case is truly necessary in Rethink client code. In my test case, it doesn't make a big difference if I have to wait for the relatively short RunChangesAsync to finish before the indefinitely looping changefeed receives the cancellation. I don't think this issue will cause any pain now that I know how to avoid it.

I'm afraid I won't be of much help actually coming up with a clean solution for this, but please do let me know if there's anything I can do to help.

Thanks again for your time!

bchavez commented 4 years ago

No problem. Thank you for reporting the issue. After some time sleeping on the problem, I think I have a decent solution that will fix the issue.

I should have a new release later today. However, I just want to take a little more time today to look at Microsoft's CoreFX sources dotnet/corefx and see if I can find better disposable registration CancellationToken patterns. I skimmed through a few books yesterday and they weren't that much help.

bchavez commented 4 years ago

Hi @EvanSchiewe-Entegra,

A new version of RethinkDb.Driver v2.3.150 is now released that contains the fix.

Let me know if it works for you.

Thanks, Brian

EvanSchiewe-Entegra commented 4 years ago

Looks great to me!

Thanks, Evan