Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License
2.26k stars 1.93k forks source link

[BUG] Netty HTTP Server exception for There is a blocking call in the CosmosClientAsync #40051

Open alejojarahi opened 2 months ago

alejojarahi commented 2 months ago

Describe the bug They do a dynamic lookup on the warmup function in the the Netty HttpClient and then invokes this. They do this by blocking.

In the cosmos library this warmup call does not seem to be configurable.

Exception or Stack Trace

java.lang.RuntimeException: java.lang.IllegalStateException: block()/blockFirst()/blockLast() are blocking, which is not supported in thread default-nioEventLoopGroup-1-3
    at com.azure.cosmos.implementation.http.ReactorNettyClient.attemptToWarmupHttpClient(ReactorNettyClient.java:116)
    at com.azure.cosmos.implementation.http.ReactorNettyClient.createWithConnectionProvider(ReactorNettyClient.java:96)
    at com.azure.cosmos.implementation.http.HttpClient.createFixed(HttpClient.java:62)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.httpClient(RxDocumentClientImpl.java:744)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:508)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:335)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:295)
    at com.azure.cosmos.implementation.AsyncDocumentClient$Builder.build(AsyncDocumentClient.java:275)
    at com.azure.cosmos.CosmosAsyncClient.<init>(CosmosAsyncClient.java:170)
    at com.azure.cosmos.CosmosClientBuilder.buildAsyncClient(CosmosClientBuilder.java:1084)
    at com.azure.cosmos.CosmosClientBuilder.buildAsyncClient(CosmosClientBuilder.java:1071)
    at io.micronaut.azure.cosmos.client.CosmosClientFactory.buildCosmosAsyncClient(CosmosClientFactory.java:59)

To Reproduce

  1. Add implementation 'com.azure:azure-cosmos:4.57.0' in Java Gradle
  2. Use Netty HTTP Server
  3. Use Reactor
  4. In the Framework (Micronaut 4 or Springboot 3) with Netty HTTP Server, create connection in request context. For example create connection with Singleton Pattern.
  5. Request to Repository.

Code Snippet

import com.azure.cosmos.CosmosAsyncDatabase;
import com.azure.cosmos.CosmosClientBuilder;
import jakarta.inject.Singleton;

@Singleton
public class UserRepository {

    private static final String DATABASE_NAME = "test";

    CosmosAsyncDatabase cosmosAsyncDatabase = null;

    public CosmosAsyncDatabase getInstance() {
        if(cosmosAsyncDatabase == null) {
            cosmosAsyncDatabase = new CosmosClientBuilder()
                    .endpoint("String Connection")
                    .key("Key Connection")
                    .buildAsyncClient()
                    .getDatabase(DATABASE_NAME);
        }
        return cosmosAsyncDatabase;
    }

    public Flux<User> getAll() {
        return getInstance().getContainer("users")
                            .queryItems("SELECT * FROM users", User.class);
    }
}

In Springboot or Micronaut Controller

private final ClienteRepository clienteRepository = new ClienteRepository();

@GetMapping("/all")
public Flux<UsuarioDto> getAll() {
    return clienteRepository.getAll();
}

In Micronaut Framework there is a setting that also leads to the same problem https://github.com/micronaut-projects/micronaut-azure/blob/5.5.x/azure-cosmos/src/main/java/io/micronaut/azure/cosmos/client/CosmosClientFactory.java

Expected behavior The connection to the database is created and the query can be executed without any exception or thread blocking

Setup (please complete the following information):

Additional context On stackoverflow they expand a little more on the same error that is presented in actual version https://stackoverflow.com/questions/73412719/create-recreate-azure-cosmosdb-async-client-from-java-reactor-context

Information Checklist Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

github-actions[bot] commented 2 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar @pjohari-ms @TheovanKraay.

kushagraThapar commented 2 months ago

thanks @alejojarahi for creating this issue, looks like a genuine problem in case of async client creation. @alzimmermsft I am curious, if this is the same case in azure-core as well and have we solved it? I would like to keep the solutions consistent if possible. Referencing the PR where we added this, will be good to re-iterate our approach to warming up http clients in case of async clients.

kushagraThapar commented 2 months ago

IMO, we should remove the blocking warmup of the http client. cc: @xinlian12 / @FabianMeiswinkel

alzimmermsft commented 2 months ago

thanks @alejojarahi for creating this issue, looks like a genuine problem in case of async client creation. @alzimmermsft I am curious, if this is the same case in azure-core as well and have we solved it? I would like to keep the solutions consistent if possible. Referencing the PR where we added this, will be good to re-iterate our approach to warming up http clients in case of async clients.

azure-core-http-netty doesn't perform a warmup of the Netty HttpClient

kushagraThapar commented 2 months ago

thanks @alzimmermsft for the confirmation. @alejojarahi - actually the issue here is not the warmup, rather the way you are creating the cosmos client. The code snippet you pasted in the issue description is creating multiple cosmos clients (per database) which is not recommended. We strongly recommend to only create a singleton client.

azure-cosmos client or any other client if needs to warmup, they will need to call certain APIs that will allow the client to warmup. Ideally, you would want to have a warmed-up client and then only enable the traffic on it (through nio threads or whatever). You would want to warmup the client on the main thread before opening it up for traffic.

If your ask is to make this configurable, we can think about doing so.

alejojarahi commented 2 months ago

Hi @kushagraThapar. Thanks for the reply.

This error also occurs in the scenario that CosmosAsyncClient is instantiated only onces. Example in Springboot 3

import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosClientBuilder;
import reactor.core.publisher.Flux;

public class UserRepository {
    private static final String DATABASE_NAME = "test";

    CosmosAsyncClient cosmosAsyncClient = null;

    public CosmosAsyncClient getInstance() {
        if(cosmosAsyncClient == null) {
            cosmosAsyncClient = new CosmosClientBuilder()
                    .endpoint("ENDPOINT")
                    .key("KEY")
                    .buildAsyncClient();
        }
        return cosmosAsyncClient;
    }

    public Flux<User> getAll() {
        return getInstance().getDatabase(DATABASE_NAME)
                            .getContainer("users")
                            .queryItems("SELECT * FROM users", User.class);
    }
}
17:44:54.516 Data 06-may.-2024 | ERROR | Thread [reactor-http-nio-3] | c.a.c.i.RxDocumentClientImpl | unexpected failure in initializing client.
java.lang.RuntimeException: java.lang.IllegalStateException: block()/blockFirst()/blockLast() are blocking, which is not supported in thread reactor-http-nio-3
    at com.azure.cosmos.implementation.http.ReactorNettyClient.attemptToWarmupHttpClient(ReactorNettyClient.java:116)
    at com.azure.cosmos.implementation.http.ReactorNettyClient.createWithConnectionProvider(ReactorNettyClient.java:96)
    at com.azure.cosmos.implementation.http.HttpClient.createFixed(HttpClient.java:62)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.httpClient(RxDocumentClientImpl.java:744)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:508)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:335)
    at com.azure.cosmos.implementation.RxDocumentClientImpl.<init>(RxDocumentClientImpl.java:295)
    at com.azure.cosmos.implementation.AsyncDocumentClient$Builder.build(AsyncDocumentClient.java:275)
    at com.azure.cosmos.CosmosAsyncClient.<init>(CosmosAsyncClient.java:170)
    at com.azure.cosmos.CosmosClientBuilder.buildAsyncClient(CosmosClientBuilder.java:1084)
    at com.azure.cosmos.CosmosClientBuilder.buildAsyncClient(CosmosClientBuilder.java:1071)
    at co.com.springexample.entrypoint.UserRepository.getInstance(UserRepository.java:19)

If the reactive client is subject to only being able to start by on the main thread, this can limit or affect the architecture of a project. For example this use case https://stackoverflow.com/questions/73412719/create-recreate-azure-cosmosdb-async-client-from-java-reactor-context.

If the team does not consider it a bug, it should be a configurable functionality

kushagraThapar commented 2 months ago

@alejojarahi the stack overflow code snippet you mentioned is also not a valid use case of the cosmos client, because the customer is trying to create a new cosmos client every time they need to rotate their keys.

SecretAsyncClient secretAsyncClient = new SecretClientBuilder().buildAsyncClient()
...
Mono<CosmosAsyncClient> client = secretAsyncClient.getSecret(KEY_NAME).map(
  s -> s.getValue()
).map(
  key -> new CosmosClientBuilder()
                        .endpoint(HOST)
                        .key(key)
                        .buildAsyncClient()
);
return client.flatMap(
   ...
);

Instead they should use a key rotation pattern like some form of KeyCredential object which has an update() API to update the key on the fly.

Creating clients randomly and during the execution of the application is never the right approach, as client creation is considered a heavy operation, specially in case of Cosmos SDK where it needs to get Database account, metadata, fill the caches, create TCP / HTTP connections, etc. when the client boots up. Any application which is doing this on nio thread is basically blocking their I/O resources for unnecessary client creation.