neo4j / neo4j-dotnet-driver

Neo4j Bolt driver for .NET
Apache License 2.0
231 stars 69 forks source link

ExecuteAsync does not abort execution when the given cancellationtoken is canceled. #762

Closed peter-villadsen closed 5 months ago

peter-villadsen commented 9 months ago

Describe the bug The ExecuteAsync method on the IExecutableQuery interface does not terminate the query when the token is cancelled. Please see the sample below for a demonstration of this issue.

To Reproduce The repro has two steps: first the mechanics for the repro by using standard .NET features are described. Then the neo4j bits are introduced to show the problem.

Please consider this snippet:

        static async Task Main(string[] args)
        {
            var tokenSource = new CancellationTokenSource();

            // Set up the timer to cancel the token after 10 seconds.
            Timer timer = new Timer(_ => {
                Console.WriteLine("Timed out - Cancelling...");
                tokenSource.Cancel();
            }, null, 10 * 1000, Timeout.Infinite);

            try 
            {
               await Task.Delay(20 * 1000, tokenSource.Token);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Got exception: " + ex.GetType().FullName);
                Console.WriteLine(ex.Message);
            }
        }

As you can see, the program is a console app (running under ,NET 8). A timer is set up to fire after 10 seconds. A cancellationTokenSource is created, and it is cancelled when the timer fires. The token from this cancellationTokenSource is passed to the call Task.Delay() call, which runs for more than the 10 seconds. When it fires, the Delay method notices, and throws a TaskCanceledException exception.

Now some Neo4j relevant things are added to this template. Instead of just delaying, code is inserted to search for something in the database. The repro is complicated by the fact that a long-running query is needed to demonstrate the problem. Feel free to tweak the timing parameters to suit your needs.

The final demo looks like this:

using Neo4j.Driver;
using System.Threading;

namespace TestCancellationToken
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            var username = "neo4j";
            var password = "neo4jtest";
            var server = "localhost";
            var port = 7687;
            var protocol = "bolt";

            // This is a long-running query. For the sake of the example, I've set the timeout to 10 seconds.
            // so this query MUST take longer than 10 seconds to run.
            var query = "MATCH (c:Method) -[r:CALLS*]-(m:Method) RETURN *";

            // This is the cancellation token that will be passed to the ExecuteAsync call. It will
            // be cancelled after 10 seconds.
            var tokenSource = new CancellationTokenSource();

            // Set up the timer to cancel the token after 10 seconds.
            Timer timer = new Timer(_ => {
                Console.WriteLine("Timed out - Cancelling...");
                tokenSource.Cancel();
            }, null, 10000, Timeout.Infinite);

            // Create the driver and execute the query...
            IDriver driver = GraphDatabase.Driver($"{protocol}://{server}:{port}", AuthTokens.Basic(username, password));

            try
            {
                var eq = driver.ExecutableQuery(query);

                // Pass the cancellation token to the ExecuteAsync call. The query should
                // end shortly after the token is cancelled by the timer.
                EagerResult<IReadOnlyList<IRecord>> e = await eq.ExecuteAsync(tokenSource.Token);

                var results = e.Result.ToList();
            }
            catch (Exception ex)
            {
                Console.WriteLine("Got exception: " + ex.GetType().FullName);
                Console.WriteLine(ex.Message);
            }
        }
    }
}

The sample just accesses data with the long-running query (that has to run for more than 10 seconds in the code above). The issue in this bug is that the execution of the ExecuteAsync call does not in fact cause the query execution to end.

Expected behavior The expected behavior is that when the token is Cancelled (which is done when the timer runs out), the execution of the query should be aborted. This is what is described in the documentation comments for that method:

    //
    // Summary:
    //     Executes the query.
    //
    // Parameters:
    //   token:
    //     A cancellation token that can be used to cancel the operation.
    //
    // Returns:
    //     An Neo4j.Driver.EagerResult`1That contains the result of the query and information
    //     about the execution.
    Task<EagerResult<IReadOnlyList<TOut>>> ExecuteAsync(CancellationToken token = default(CancellationToken));
}

Version Info (please complete the following information):

Additional context Please advise about any supported ways of stopping a rogue query. I realize that it is possible to execute TERMINATE TRANSACTION , but that is clunky and the id can only be fetched through a SHOW TRANSACTIONS call, with appropriate filtering to determine the right query.

The code in the example is enclosed. Program.zip

thelonelyvulpes commented 9 months ago

Hi @peter-villadsen, thanks for bringing this up, will post an update when fixed.

peter-villadsen commented 8 months ago

RichardIrons,

I tested the repro in the bug with a version of the Neo4j.Driver.Simple dll with the fix that has been merged into 5.0. I can see that the cancellationToken makes it to the place that has changed, but the system still does not react to the cancellationToken being canceled - No exception is being thrown. I notice that you have not closed out the bug yet, so there may be something I am missing?

The pull request does not add any tests for this scenario, so I only have my own test (in the bug) to work from.

Please advise, since I have a product I would like to ship with this feature.

thelonelyvulpes commented 8 months ago

@peter-villadsen Hi, the change was deployed on Monday to 5.17.0, it should allow the the driver to cancel the query, there is a known issue with supporting cancellation tokens in the driver.

The change @RichardIrons-neo4j implemented should allow for the query to be cancelled. Though the cancellation can only occur between each record being returned. I hope we can pick up full cancellation token support soon.

Apologies on the delay getting back to you about this.