googleapis / java-spanner

Apache License 2.0
64 stars 114 forks source link

Spanner: AsyncRunner.runAsync blocks executor; deadlock possible #2698

Open zobar opened 1 year ago

zobar commented 1 year ago

AsyncRunner.runAsync performs a blocking get inside its executor. This may result in deadlock if the following conditions are met:

This is most likely to manifest itself if the application has a constrained global thread pool optimized for running non-blocking operations. In particular, we noticed this in a Cats Effect application, although the bug is not restricted to Scala or Cats Effect. This may also be causing the performance problems noted in #1751.

Our workaround is to provide a dedicated thread pool that is only used for runAsync, but which is not used by the work itself.

I've built a minimum test case in Java to demonstrate this issue.

Environment details

OS type and version: Mac OS Ventura 13.6 Java version: OpenJDK 21 Versions: com.google.cloud:google-cloud-spanner:6.52.1

Steps to reproduce

  1. Clone my GitHub repo for a minimized test case.
  2. Run 15 concurrent transactions on a maximum of 16 threads. This will work as expected.
     $ ./gradlew run --args="16 15 my-project my-instance my-database"
     Getting results...
     Successfully ran 15 transactions in parallel on 16 threads.
     $
  3. Try running 16 concurrent transactions on a maximum of 16 threads. This will hang.
     $ ./gradlew run --args="16 16 my-project my-instance my-database"
     Getting results...
     ^C

Code example

public int run() throws ExecutionException, InterruptedException {
    SettableApiFuture trigger = SettableApiFuture.create();

    Iterable transactions = Stream
            .generate(() -> databaseClient
                    .runAsync()
                    .runAsync(work(trigger), threadPool))
            .limit(concurrency)
            .toList();
    ApiFuture<List> results = ApiFutures.allAsList(transactions);

    trigger.set(null);

    System.out.println("Getting results...");
    return results.get().size();
}

public AsyncRunner.AsyncWork work(ApiFuture trigger) {
    return (txn) -> ApiFutures.transform(trigger, (input) -> input, threadPool);
}

External references such as API reference guides

The API documentation for runAsync does not mention that it runs blocking operations in the executor.

zobar commented 1 year ago

On closer examination, this ticket is not related to #1751. I've confirmed that the default AsyncExecutorProvider has poor performance compared to the synchronous API, but by providing a larger thread pool, it is possible to achieve better performance than the synchronous API.

surbhigarg92 commented 1 year ago

@arpan14 Can you please help triage this issue.

SanjayVas commented 3 weeks ago

I believe the workaround is to use AsyncTransactionManager directly. That appears to actually be asynchronous (i.e. not just doing blocking calls in the executor).

I'm surprised that AsyncRunner wraps the blocking TransactionManager rather than wrapping AsyncTransactionManager.